T146211 Cluster-wide major compactions: parsoid.data-parsoid table

Status    Subtype  Assigned  Task
Resolved  -        Eevans    T140008 High RESTBase storage utilization
Resolved  -        Eevans    T146211 Cluster-wide major compactions: parsoid.data-parsoid table

Eevans created this task. Sep 20 2016, 8:12 PM

Eevans moved this task from Backlog to Next on the Cassandra board.

Eevans moved this task from Next to In-Progress on the Cassandra board. Sep 22 2016, 8:44 PM

Eevans updated the task description.

Eevans updated the task description. Sep 23 2016, 5:49 PM

Eevans updated the task description.

Eevans updated the task description. Sep 23 2016, 5:53 PM

Eevans updated the task description. Sep 26 2016, 4:40 PM

Over in T143226 (Cluster-wide major compactions: parsoid.html table), I removed the repaired-at timestamps (where they existed), because they split the compaction pool into repaired and unrepaired SSTables and kept major compactions from effectively reducing the droppable tombstone ratio. The same is true for the wikipedia parsoid.data-parsoid table (and others, I suspect). Those timestamps will need to be removed as well before continuing here.
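For reference, the droppable tombstone ratio mentioned above can be read per SSTable from the output of Cassandra's `sstablemetadata` tool. The following is only a rough sketch, not something taken from this task: it assumes `sstablemetadata` is on the PATH, and the function names `droppable_ratio` and `report_droppable` are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: print the estimated droppable tombstone ratio for each SSTable.
# Assumes Cassandra's `sstablemetadata` tool is on the PATH.
# Function names are hypothetical, not part of any WMF tooling.

# Extract the ratio from sstablemetadata output read on stdin.
droppable_ratio() {
    awk -F': *' '/^Estimated droppable tombstones/ { print $2; exit }'
}

# Print "<file> <ratio>" for every Data.db file passed as an argument.
report_droppable() {
    local sstable
    for sstable in "$@"; do
        printf '%s %s\n' "$sstable" "$(sstablemetadata "$sstable" | droppable_ratio)"
    done
}
```

After sourcing the file, something like `report_droppable /path/to/table/*-Data.db` (path hypothetical) would list the ratio before and after a major compaction.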

Mentioned in SAL (#wikimedia-operations) [2016-10-05T15:39:44Z] <urandom> T146211: Restarting Cassandra on restbase1007-a.eqiad.wmnet to mark parsoid.data-parsoid tables unrepaired

Mentioned in SAL (#wikimedia-operations) [2016-10-05T15:48:06Z] <urandom> T146211: Restarting Cassandra on restbase1007-b.eqiad.wmnet to mark parsoid.data-parsoid tables unrepaired

Mentioned in SAL (#wikimedia-operations) [2016-10-05T15:54:15Z] <urandom> T146211: Restarting Cassandra on restbase1007-c.eqiad.wmnet to mark parsoid.data-parsoid tables unrepaired

Mentioned in SAL (#wikimedia-operations) [2016-10-05T17:31:59Z] <urandom> T146211: Performing rolling restart of restbase1010.eqiad.wmnet Cassandra instances, and marking SSTables unrepaired.

Mentioned in SAL (#wikimedia-operations) [2016-10-05T17:58:33Z] <urandom> T146211: Performing rolling restart of restbase1011.eqiad.wmnet Cassandra instances, and marking SSTables unrepaired.

Marking SSTables unrepaired with:

sudo c-foreach-restart --execute-post-shutdown "curl https://phab.wmfusercontent.org/file/data/uk5p7ehlbegir265rduu/PHID-FILE-52ba4hq35ljiymvcmxvg/Masterwork_From_Distant_Lands | bash -s {id}"

2016-10-05 17:59:48,236 INFO     [a] Disabling client ports...
2016-10-05 17:59:52,033 INFO     [a] Draining...
2016-10-05 18:01:19,562 INFO     [a] Stopping service cassandra-a
2016-10-05 18:01:22,275 INFO     [a] Executing post-shutdown command: curl https://phab.wmfusercontent.org/file/data/uk5p7ehlbegir265rduu/PHID-FILE-52ba4hq35ljiymvcmxvg/Masterwork_From_Distant_Lands | bash -s {id}
2016-10-05 18:01:54,919 INFO     [a] Found: local_group_wikipedia_T_parsoid_dataW4ULtxs1oMqJ-data-ka-13342-Data.db (repaired at 1426826797204)
2016-10-05 18:01:54,920 INFO     [a] -- Setting unrepaired...done
2016-10-05 18:01:54,920 INFO     [a] Starting service cassandra-a
2016-10-05 18:01:54,951 WARNING  [a] CQL (10.64.0.117:9042) not listening (will retry)...
2016-10-05 18:02:06,959 WARNING  [a] CQL (10.64.0.117:9042) not listening (will retry)...
2016-10-05 18:02:18,971 WARNING  [a] CQL (10.64.0.117:9042) not listening (will retry)...
2016-10-05 18:02:30,977 WARNING  [a] CQL (10.64.0.117:9042) not listening (will retry)...
2016-10-05 18:02:42,985 INFO     [a] CQL (10.64.0.117:9042) is UP
2016-10-05 18:02:43,060 INFO     [b] Disabling client ports...
2016-10-05 18:02:50,444 INFO     [b] Draining...
2016-10-05 18:04:27,981 INFO     [b] Stopping service cassandra-b
2016-10-05 18:04:30,848 INFO     [b] Executing post-shutdown command: curl https://phab.wmfusercontent.org/file/data/uk5p7ehlbegir265rduu/PHID-FILE-52ba4hq35ljiymvcmxvg/Masterwork_From_Distant_Lands | bash -s {id}
2016-10-05 18:05:06,698 INFO     [b] Found: la-21913-big-Data.db (repaired at 1426826797204)
2016-10-05 18:05:06,698 INFO     [b] -- Setting unrepaired...done
2016-10-05 18:05:06,699 INFO     [b] Starting service cassandra-b
2016-10-05 18:05:06,747 WARNING  [b] CQL (10.64.0.118:9042) not listening (will retry)...
2016-10-05 18:05:18,761 WARNING  [b] CQL (10.64.0.118:9042) not listening (will retry)...
2016-10-05 18:05:30,773 WARNING  [b] CQL (10.64.0.118:9042) not listening (will retry)...
2016-10-05 18:05:42,782 WARNING  [b] CQL (10.64.0.118:9042) not listening (will retry)...
2016-10-05 18:05:54,798 INFO     [b] CQL (10.64.0.118:9042) is UP
2016-10-05 18:05:54,800 INFO     [c] Disabling client ports...
2016-10-05 18:06:03,235 INFO     [c] Draining...
2016-10-05 18:07:32,517 INFO     [c] Stopping service cassandra-c
2016-10-05 18:07:35,346 INFO     [c] Executing post-shutdown command: curl https://phab.wmfusercontent.org/file/data/uk5p7ehlbegir265rduu/PHID-FILE-52ba4hq35ljiymvcmxvg/Masterwork_From_Distant_Lands | bash -s {id}
2016-10-05 18:08:12,688 INFO     [c] Found: local_group_wikipedia_T_parsoid_dataW4ULtxs1oMqJ-data-ka-61-Data.db (repaired at 1426826797204)
2016-10-05 18:08:12,689 INFO     [c] -- Setting unrepaired...done
2016-10-05 18:08:12,689 INFO     [c] Found: local_group_wikipedia_T_parsoid_dataW4ULtxs1oMqJ-data-ka-299-Data.db (repaired at 1426826797204)
2016-10-05 18:08:12,689 INFO     [c] -- Setting unrepaired...done
2016-10-05 18:08:12,689 INFO     [c] Starting service cassandra-c
2016-10-05 18:08:12,733 WARNING  [c] CQL (10.64.0.119:9042) not listening (will retry)...
2016-10-05 18:08:24,743 WARNING  [c] CQL (10.64.0.119:9042) not listening (will retry)...
2016-10-05 18:08:36,755 WARNING  [c] CQL (10.64.0.119:9042) not listening (will retry)...
2016-10-05 18:08:48,767 WARNING  [c] CQL (10.64.0.119:9042) not listening (will retry)...
2016-10-05 18:09:00,777 INFO     [c] CQL (10.64.0.119:9042) is UP
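The script fetched by the post-shutdown command is not reproduced in this task. As a rough sketch only, a step that produces "Found ... (repaired at ...)" lines like those in the log might look as follows. It assumes Cassandra's `sstablemetadata` and `sstablerepairedset` tools are on the PATH, that the instance is already stopped, and that the table directory is passed in by the caller; the function names and the directory argument are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch of a mark-unrepaired step; NOT the script used in the command above.
# Assumes `sstablemetadata` and `sstablerepairedset` (shipped with Cassandra)
# are on the PATH and that the Cassandra instance is stopped.

# Extract the repaired-at timestamp (0 means unrepaired) from
# sstablemetadata output read on stdin.
repaired_at() {
    awk -F': *' '/^Repaired at/ { print $2; exit }'
}

# Mark every repaired SSTable under the given table directory as unrepaired.
# The directory argument is a hypothetical parameter for this sketch.
mark_unrepaired() {
    local table_dir="$1" sstable ts
    for sstable in "$table_dir"/*-Data.db; do
        [ -e "$sstable" ] || continue   # glob matched nothing
        ts="$(sstablemetadata "$sstable" | repaired_at)"
        if [ -n "$ts" ] && [ "$ts" != "0" ]; then
            echo "Found: $(basename "$sstable") (repaired at $ts)"
            sstablerepairedset --really-set --is-unrepaired "$sstable"
        fi
    done
}
```

SSTables whose repaired-at value is already 0 are skipped, so rerunning the step on an instance that has been processed should be a no-op.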

Mentioned in SAL (#wikimedia-operations) [2016-10-05T18:17:41Z] <urandom> T146211: Performing rolling restart of RESTBase rack 'b' Cassandra instances, and marking SSTables unrepaired.

Mentioned in SAL (#wikimedia-operations) [2016-10-05T18:46:37Z] <urandom> T146211: Performing rolling restart of RESTBase eqiad rack 'd' Cassandra instances, and marking SSTables unrepaired.

These ad hoc manual compactions were completed (more than once, in fact); closing.
