Reshape RESTBase Cassandra clusters
Closed, ResolvedPublic

Description

As part of T179417: Migrate Parsoid from legacy to new storage, capacity needs to be moved from the legacy cluster to the new one. This will be done by decommissioning the instances on one host in each rack (6 hosts total), re-imaging those hosts, and bootstrapping them into the new (Cassandra 3.x) cluster.

The hosts to be decommissioned (in order):

  • 2002 (codfw, rack b)
  • 2004 (codfw, rack c)
  • 2006 (codfw, rack d)
  • 1007 (eqiad, rack a)
  • 1012 (eqiad, rack b)
  • 1014 (eqiad, rack d)

Nodes to be bootstrapped:

  • 2002
    • a
    • b
    • c
  • 2006
    • a
    • b
    • c
  • 2004
    • a
    • b
    • c
  • 1007
    • a
    • b
    • c
  • 1012
    • a
    • b
    • c
  • 1014
    • a
    • b
    • c

Post-bootstrap cleanups:

  • 2001
  • 2002
  • 2003
  • 2004
  • 2005
  • 2006
  • 1007
    • a (in-progress...)
    • b
    • c
  • 1008
    • a (in-progress...)
    • b
    • c
  • 1009
    • a
    • b (in-progress...)
    • c
  • 1010
    • a
    • b (in-progress...)
    • c
  • 1012
    • a (in-progress...)
    • b (in-progress...)
    • c
  • 1014
    • a
    • b (in-progress...)
    • c

NOTE: 2006 was intentionally reordered to happen before 2004 in order to side-step T180562: Degraded RAID on restbase2004 for now.

Event Timeline

I propose to convert these in pairs (4 sets of 2), with a bit of time for compaction to settle in between.

+1

Mentioned in SAL (#wikimedia-operations) [2017-11-16T17:05:29Z] <urandom> Converting 'others mobile' to size-tiered compaction (T179422)

Change 391867 merged by Filippo Giunchedi:
[operations/puppet@production] cassandra: reprovision restbase2006 with cassandra 3

https://gerrit.wikimedia.org/r/391867

Mentioned in SAL (#wikimedia-operations) [2017-11-16T17:15:48Z] <urandom> Converting 'commons mobile' to size-tiered compaction (T179422)

Mentioned in SAL (#wikimedia-operations) [2017-11-16T17:18:46Z] <urandom> Converting 'wikipedia parsoid' to size-tiered compaction (T179422)

Compaction throughput on restbase2005 (the only rack d node in the 3.x cluster) has been set to 10 MB/s (half of its normal setting), in anticipation of the restbase2006 bootstraps.

eevans@restbase2005:~$ c-foreach-nt getcompactionthroughput
a: Current compaction throughput: 10 MB/s
b: Current compaction throughput: 10 MB/s
c: Current compaction throughput: 10 MB/s
eevans@restbase2005:~$
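
A check like the one above can be automated. The following is a minimal sketch, assuming the per-instance output format shown in the transcript; the function names and the `SAMPLE` text are illustrative and not part of any existing tooling.

```python
import re

# Matches lines such as "a: Current compaction throughput: 10 MB/s".
# Shell prompt lines do not match and are ignored.
LINE_RE = re.compile(r"^(\w+): Current compaction throughput: (\d+) MB/s$")


def parse_throughput(output):
    """Return {instance: MB/s} parsed from per-instance throughput output."""
    result = {}
    for line in output.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            result[match.group(1)] = int(match.group(2))
    return result


def instances_not_at(output, expected_mb_s):
    """List the instances whose throughput differs from the expected value."""
    return sorted(
        name
        for name, value in parse_throughput(output).items()
        if value != expected_mb_s
    )


# Sample text copied from the transcript above (hypothetical variable name).
SAMPLE = """\
eevans@restbase2005:~$ c-foreach-nt getcompactionthroughput
a: Current compaction throughput: 10 MB/s
b: Current compaction throughput: 10 MB/s
c: Current compaction throughput: 10 MB/s
"""

if __name__ == "__main__":
    print(instances_not_at(SAMPLE, 10))
```

An empty list means every instance reports the expected setting.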

Mentioned in SAL (#wikimedia-operations) [2017-11-16T17:28:34Z] <urandom> Converting 'enwiki parsoid' to size-tiered compaction (T179422)

In addition to the ones converted from leveled to size-tiered yesterday, there are 8 additional mis-configured tables:

[ ... ]

Complete.

Mentioned in SAL (#wikimedia-operations) [2017-11-16T17:40:03Z] <urandom> Decommissioning Cassandra, restbase1014-b.eqiad.wmnet (T179422)

Mentioned in SAL (#wikimedia-operations) [2017-11-16T19:46:40Z] <urandom> Bootstrapping Cassandra restbase2006-a (T179422)

Mentioned in SAL (#wikimedia-operations) [2017-11-17T00:23:01Z] <urandom> Decommissioning Cassandra, restbase1014-c.eqiad.wmnet (T179422)

Mentioned in SAL (#wikimedia-operations) [2017-11-17T09:20:14Z] <godog> start restbase2006-c instead, restbase2006-b failed and -c shows as "down" - T179422

Mentioned in SAL (#wikimedia-operations) [2017-11-17T15:26:32Z] <urandom> Starting restbase2006-c w/ -Dcassandra.replace_address=10.192.48.51 (T179422)

Mentioned in SAL (#wikimedia-operations) [2017-11-18T00:30:12Z] <urandom> Bootstrapping restbase2006-b (T179422)

Change 392396 had a related patch set uploaded (by Filippo Giunchedi; owner: Filippo Giunchedi):
[operations/puppet@production] cassandra: reprovision restbase2004 with cassandra 3

https://gerrit.wikimedia.org/r/392396

Change 392396 merged by Filippo Giunchedi:
[operations/puppet@production] cassandra: reprovision restbase2004 with cassandra 3

https://gerrit.wikimedia.org/r/392396

Mentioned in SAL (#wikimedia-operations) [2017-11-21T10:35:06Z] <godog> bootstrap cassandra restbase2004-a - T179422

Mentioned in SAL (#wikimedia-operations) [2017-11-21T22:29:06Z] <urandom> Bootstrapping Cassandra, restbase2004-b.codfw.wmnet (T179422)

Mentioned in SAL (#wikimedia-operations) [2017-11-22T14:11:21Z] <urandom> bootstrapping cassandra, restbase2004-c.codfw.wmnet - T179422

Mentioned in SAL (#wikimedia-operations) [2017-11-22T19:49:46Z] <urandom> starting cassandra cleanups, restbase-200{1,3,5}-a - T179422

Change 393550 had a related patch set uploaded (by Filippo Giunchedi; owner: Filippo Giunchedi):
[operations/puppet@production] cassandra: reprovision restbase1007 with cassandra 3

https://gerrit.wikimedia.org/r/393550

Change 393550 merged by Filippo Giunchedi:
[operations/puppet@production] cassandra: reprovision restbase1007 with cassandra 3

https://gerrit.wikimedia.org/r/393550

Mentioned in SAL (#wikimedia-operations) [2017-11-28T08:02:19Z] <mobrovac> bootstrap restbase1007-b - T179422

Hm, this didn't work. The node started spitting a lot of UnknownColumnFamilyExceptions so I stopped and masked it. Also, nodetool says the node fully joined, but with only a small amount of data:

UN  restbase1007-b.eqiad.wmnet  470.93 KiB  256          20.8%             73685a0b-0682-4cc2-92b5-42e4a5a05d68  a
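
A node that reports as joined while holding almost no data could be flagged automatically. This is a rough sketch, assuming the eight-column status line shown above and binary size units (KiB, MiB, and so on); the 1 GiB threshold and all function names are assumptions chosen for illustration.

```python
# Multipliers for the load units assumed to appear in the status output.
UNITS = {"bytes": 1, "KiB": 1024, "MiB": 1024**2, "GiB": 1024**3, "TiB": 1024**4}


def parse_status_line(line):
    """Split one status line into named fields (load converted to bytes)."""
    fields = line.split()
    if len(fields) != 8:
        raise ValueError("unexpected status line: %r" % line)
    state, address, load, unit, tokens, owns, host_id, rack = fields
    return {
        "state": state,
        "address": address,
        "load_bytes": float(load) * UNITS[unit],
        "tokens": int(tokens),
        "owns": owns,
        "host_id": host_id,
        "rack": rack,
    }


def looks_empty(node, min_bytes=1024**3):
    """True if the node is up/normal (UN) but holds less than min_bytes.

    The default threshold of 1 GiB is an arbitrary illustrative value.
    """
    return node["state"] == "UN" and node["load_bytes"] < min_bytes


# Status line copied from the comment above (hypothetical variable name).
SAMPLE = (
    "UN  restbase1007-b.eqiad.wmnet  470.93 KiB  256          20.8%"
    "             73685a0b-0682-4cc2-92b5-42e4a5a05d68  a"
)

if __name__ == "__main__":
    node = parse_status_line(SAMPLE)
    print(node["address"], looks_empty(node))
```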

Mentioned in SAL (#wikimedia-operations) [2017-11-28T09:19:17Z] <godog> unmask and restart restbase1007-b - T179422

Change 393807 had a related patch set uploaded (by Filippo Giunchedi; owner: Filippo Giunchedi):
[operations/puppet@production] hieradata: disable restbase1007-c cassandra instance

https://gerrit.wikimedia.org/r/393807

Change 393807 merged by Filippo Giunchedi:
[operations/puppet@production] hieradata: disable restbase1007-c cassandra instance

https://gerrit.wikimedia.org/r/393807

Mentioned in SAL (#wikimedia-operations) [2017-11-28T17:42:08Z] <urandom> restart cassandra, restbase1007, to pickup logstash java deps - T179422

Change 393810 had a related patch set uploaded (by Filippo Giunchedi; owner: Filippo Giunchedi):
[operations/puppet@production] Revert "hieradata: disable restbase1007-c cassandra instance"

https://gerrit.wikimedia.org/r/393810

Mentioned in SAL (#wikimedia-operations) [2017-11-28T17:46:16Z] <urandom> decommissioning cassandra, restbase1007-b - T179422

Mentioned in SAL (#wikimedia-operations) [2017-11-28T18:28:46Z] <urandom> (re)bootstrapping cassandra, restbase1007-b - T179422

At 8:02 this morning, a bootstrap of 1007-b was started. That bootstrap completed almost immediately, without errors, but without streaming any data to speak of. The very first log message pertaining to the bootstrap provides a clue though...

WARN  [main] 2017-11-28 08:12:15,577 StorageService.java:865 - Detected previous bootstrap failure; retrying

<theory>

It would seem as though this node may have been started once before, and ran long enough to establish some state (state that was presumably corrupt). This likely happened when Puppet was run for the first time and attempted to start all 3 units. Since only 1 instance can bootstrap at a time, 1007-b would either have stopped after detecting another instance bootstrapping (1007-a), or was terminated by someone, but not before establishing a bootstrap session (a session which failed to complete successfully).

</theory>

Enabling all 3 instances at once causes other problems as well. For example, because having some units masked (to avoid concurrent bootstraps) prevented Puppet from running successfully, the deployments of logstash-related dependencies never completed. In the future, we should enable each instance one at a time to avoid this class of problem.
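
The one-at-a-time rule can be expressed as a small gate. This is only a sketch under stated assumptions: instance states are the two-letter codes from the status output ("UN" for up/normal, "UJ" for up/joining), and `next_instance_to_enable` is a hypothetical helper, not part of Puppet or Cassandra.

```python
def next_instance_to_enable(pending, states):
    """Pick the next instance to enable, or None if it is not yet safe.

    pending: ordered list of instance names that have not been enabled.
    states:  {instance: state} for instances that are already enabled.

    An instance is only released when every enabled instance is "UN",
    so that at most one bootstrap is in flight at a time.
    """
    if any(state != "UN" for state in states.values()):
        return None
    return pending[0] if pending else None


if __name__ == "__main__":
    pending = ["restbase1007-b", "restbase1007-c"]
    # 1007-a is still joining, so nothing else should be enabled yet.
    print(next_instance_to_enable(pending, {"restbase1007-a": "UJ"}))
    # Once 1007-a is up/normal, 1007-b is next.
    print(next_instance_to_enable(pending, {"restbase1007-a": "UN"}))
```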

1007-b has been decommissioned, its state wiped, and is currently bootstrapping.

Change 393810 merged by Filippo Giunchedi:
[operations/puppet@production] Revert "hieradata: disable restbase1007-c cassandra instance"

https://gerrit.wikimedia.org/r/393810

Change 394053 had a related patch set uploaded (by Filippo Giunchedi; owner: Filippo Giunchedi):
[operations/puppet@production] cassandra: reprovision restbase1012 with cassandra 3

https://gerrit.wikimedia.org/r/394053

Change 394053 merged by Filippo Giunchedi:
[operations/puppet@production] cassandra: reprovision restbase1012 with cassandra 3

https://gerrit.wikimedia.org/r/394053

Mentioned in SAL (#wikimedia-operations) [2017-11-29T17:40:15Z] <godog> bootstrapping restbase1012-a - T179422

1012-a is currently bootstrapping from 1008-{a,b,c}, but only one of the stream sessions is moving (the one from 1008-c); exceptions seem to have terminated 2 of the 3 bootstrap stream sessions.

Mentioned in SAL (#wikimedia-operations) [2017-11-29T20:05:41Z] <urandom> restarting cassandra bootstrap of restbase1012-a (T179422)

More on these terminated stream sessions, since the issue is obviously not transient:

The relevant exception seems to be on the receiving side (1012-a):

[Image: Screenshot-2017-11-29 Kibana.png]

This exception occurs when the stream reader can't find the schema for the table referenced (56e7fe50-d4f5-11e7-935a-5bb7090ce329 in this instance). Presumably this means the schema was not properly synced to the new node on startup.

Perhaps unrelated, but the cfId referenced above is for local_group_wikipedia_T_parsoid_dataLpBGD5XFAMFs.meta, which seems very odd since that group/naming convention should no longer be in use. And this isn't the only one; the full complement of tables using the legacy grouping and naming convention exists:

eevans@restbase1008:~$ c-cqlsh a -e 'describe keyspaces' |grep local_
"local_group_wikinews_T_parsoid_htmliZ1mueNVmW9Ml"
"local_group_wikiversity_T_parsoid_stash_wikitext"
"local_group_wikivoyage_T_parsoid_stash_wikitext"
"local_group_wikimedia_T_parsoid_htmliZ1mueNVmW9M"
"local_group_wiktionary_T_parsoid_sectionkYE_jMlE"
"local_group_default_T_parsoid_stash_wikitext"
"local_group_wikiquote_T_parsoid_htmliZ1mueNVmW9M"
"local_group_wikisource_T_parsoid_wikitext"
"local_group_wikivoyage_T_parsoid_stash_html"
"local_group_wikimedia_T_restrictions"
"local_group_wikiquote_T_parsoid_stash_wikitext"
"local_group_globaldomain_T_mathoid_mml"
"local_group_wikisource_T_parsoid_stash_dU81yvllO"
"local_group_wikiquote_T_parsoid_wikitext"
"local_group_wikipedia_T_title__revisions"
"local_group_default_T_parsoid_dataLpBGD5XFAMFsTr"
"local_group_wikiversity_T_parsoid_sectiokYE_jMlE"
"local_group_wikiversity_T_parsoid_htmliZ1mueNVmW"
"local_group_wikiversity_T_title__revisions"
"local_group_phase0_T_title__revisions"
"local_group_default_T_parsoid_stash_html"
"local_group_wikibooks_T_parsoid_stash_wikitext"
"local_group_wikimedia_T_parsoid_wikitext"
"local_group_wikipedia_T_parsoid_wikitext"
"local_group_wikibooks_T_title__revisions"
"local_group_phase0_T_restrictions"
"local_group_wikiquote_T_title__revisions"
"local_group_wikiquote_T_parsoid_dataLpBGD5XFAMFs"
"local_group_phase0_T_parsoid_stash_dataU81yvllO3"
"local_group_wikinews_T_parsoid_stash_html"
"local_group_wikimedia_T_parsoid_dataLpBGD5XFAMFs"
"local_group_wikiquote_T_parsoid_stash_html"
"local_group_wikiversity_T_parsoid_stash_U81yvllO"
"local_group_wikivoyage_T_parsoid_htmliZ1mueNVmW9"
"local_group_wikivoyage_T_title__revisions"
"local_group_default_T_parsoid_htmliZ1mueNVmW9MlJ"
"local_group_wikiquote_T_parsoid_stash_daU81yvllO"
"local_group_wikipedia_T_parsoid_stash_html"
"local_group_wikinews_T_parsoid_stash_wikitext"
"local_group_phase0_T_parsoid_section_ofkYE_jMlE6"
"local_group_wikimedia_T_parsoid_stash_wikitext"
"local_group_wikivoyage_T_parsoid_dataLpBGD5XFAMF"
"local_group_wiktionary_T_title__revisions"
"local_group_wikiversity_T_restrictions"
"local_group_wikivoyage_T_parsoid_wikitext"
"local_group_wikisource_T_parsoid_dataLpBGD5XFAMF"
"local_group_wikiversity_T_parsoid_dataLpBGD5XFAM"
"local_group_wikibooks_T_restrictions"
"local_group_wikibooks_T_parsoid_htmliZ1mueNVmW9M"
"local_group_wikibooks_T_parsoid_section_kYE_jMlE"
"local_group_wikimedia_T_parsoid_stash_daU81yvllO"
"local_group_wikipedia_T_parsoid_htmliZ1mueNVmW9M"
"local_group_wikiquote_T_parsoid_section_kYE_jMlE"
"local_group_wiktionary_T_parsoid_stash_html"
"local_group_wikimedia_T_parsoid_stash_html"
"local_group_wikisource_T_parsoid_sectionkYE_jMlE"
"local_group_wikinews_T_title__revisions"
"local_group_wikibooks_T_parsoid_wikitext"
"local_group_wikisource_T_title__revisions"
"local_group_wiktionary_T_parsoid_htmliZ1mueNVmW9"
"local_group_phase0_T_parsoid_wikitext"
"local_group_wikinews_T_parsoid_section_kYE_jMlE6"
"local_group_default_T_parsoid_section_okYE_jMlE6"
"local_group_wiktionary_T_parsoid_dataLpBGD5XFAMF"
"local_group_wikisource_T_parsoid_stash_wikitext"
"local_group_wikimedia_T_title__revisions"
"local_group_wikinews_T_parsoid_dataLpBGD5XFAMFsT"
"local_group_wikivoyage_T_parsoid_stash_dU81yvllO"
"local_group_wikipedia_T_parsoid_dataLpBGD5XFAMFs"
"local_group_wikibooks_T_parsoid_stash_html"
"local_group_wikinews_T_parsoid_stash_daU81yvllO3"
"local_group_wikiversity_T_parsoid_wikitext"
"local_group_phase0_T_parsoid_dataLpBGD5XFAMFsTr8"
"local_group_wikivoyage_T_restrictions"
"local_group_wikinews_T_parsoid_wikitext"
"local_group_wikiversity_T_parsoid_stash_html"
"local_group_wiktionary_T_parsoid_wikitext"
"local_group_wikibooks_T_parsoid_stash_daU81yvllO"
"local_group_phase0_T_parsoid_stash_wikitext"
"local_group_wikipedia_T_parsoid_section_kYE_jMlE"
"local_group_default_T_title__revisions"
"local_group_wiktionary_T_parsoid_stash_dU81yvllO"
"local_group_wikivoyage_T_parsoid_sectionkYE_jMlE"
"local_group_default_T_restrictions"
"local_group_wiktionary_T_parsoid_stash_wikitext"
"local_group_phase0_T_parsoid_stash_html"
"local_group_wikipedia_T_restrictions"
"local_group_globaldomain_T_mathoid_svg"
"local_group_wikinews_T_restrictions"
"local_group_wikiquote_T_restrictions"
"local_group_wikipedia_T_parsoid_stash_daU81yvllO"
"local_group_wikipedia_T_parsoid_stash_wikitext"
"local_group_default_T_parsoid_stash_datU81yvllO3"
"local_group_wikisource_T_restrictions"
"local_group_default_T_parsoid_wikitext"
"local_group_wikibooks_T_parsoid_dataLpBGD5XFAMFs"
"local_group_phase0_T_parsoid_htmliZ1mueNVmW9MlJq"
"local_group_wiktionary_T_restrictions"
"local_group_wikisource_T_parsoid_stash_html"
"local_group_wikisource_T_parsoid_htmliZ1mueNVmW9"
"local_group_wikimedia_T_parsoid_section_kYE_jMlE"
eevans@restbase1008:~$

It would appear that these were created at about 10:20 this morning. While that does precede the 1012-a bootstrap (which was started at ~17:40), it almost certainly occurred while 1007-c was still bootstrapping.
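
The same filter as the `grep local_` above can be scripted, which may help when checking whether the legacy keyspaces have been removed. A minimal sketch, assuming keyspace names are whitespace-separated and optionally double-quoted as in the listing; the function name and the sample input are illustrative.

```python
def legacy_keyspaces(describe_output, prefix="local_group_"):
    """Return the sorted keyspace names that start with the legacy prefix.

    describe_output is the text printed by 'describe keyspaces'; names may
    be wrapped in double quotes, which are stripped before matching.
    """
    names = []
    for token in describe_output.split():
        name = token.strip('"')
        if name.startswith(prefix):
            names.append(name)
    return sorted(names)


# A few names taken from the listing above, plus one non-legacy keyspace
# (hypothetical sample input).
SAMPLE = """\
system  "local_group_wikipedia_T_restrictions"
"local_group_globaldomain_T_mathoid_svg"
"""

if __name__ == "__main__":
    for name in legacy_keyspaces(SAMPLE):
        print(name)
```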

I believe what we're going to need to do is first resolve T181689: New RESTBase Cassandra cluster has legacy tables, then restart 1012-a from scratch after clearing all of its state (and since a bootstrap has already been started, this will probably require the use of the cassandra.replace_address parameter).

Change 394343 had a related patch set uploaded (by Filippo Giunchedi; owner: Filippo Giunchedi):
[operations/puppet@production] cassandra: reprovision restbase1014 with cassandra 3

https://gerrit.wikimedia.org/r/394343

Mentioned in SAL (#wikimedia-operations) [2017-12-01T16:24:10Z] <urandom> starting cassandra bootstrap, restbase1012-a -- T179422

Change 394602 had a related patch set uploaded (by Eevans; owner: Eevans):
[operations/puppet@production] Enable Cassandra instance: restbase1012-b

https://gerrit.wikimedia.org/r/394602

Change 394603 had a related patch set uploaded (by Eevans; owner: Eevans):
[operations/puppet@production] Enable Cassandra instance: restbase1012-c

https://gerrit.wikimedia.org/r/394603

Change 394602 merged by Dzahn:
[operations/puppet@production] hieradata: enable Cassandra instance: restbase1012-b

https://gerrit.wikimedia.org/r/394602

Mentioned in SAL (#wikimedia-operations) [2017-12-01T23:15:21Z] <urandom> starting cassandra bootstrap, restbase1012-b - T179422

Change 394951 had a related patch set uploaded (by Filippo Giunchedi; owner: Filippo Giunchedi):
[operations/puppet@production] hieradata: enable restbase1012-c

https://gerrit.wikimedia.org/r/394951

Change 394951 merged by Filippo Giunchedi:
[operations/puppet@production] hieradata: enable restbase1012-c

https://gerrit.wikimedia.org/r/394951

Change 394343 merged by Filippo Giunchedi:
[operations/puppet@production] cassandra: reprovision restbase1014 with cassandra 3

https://gerrit.wikimedia.org/r/394343

Change 394603 abandoned by Eevans:
hieradata: enable Cassandra instance: restbase1012-c

Reason:
Duplicate of https://gerrit.wikimedia.org/r/394951

https://gerrit.wikimedia.org/r/394603

Change 395085 had a related patch set uploaded (by Eevans; owner: Eevans):
[operations/puppet@production] hieradata: enabled restbase1014-b for bootstrap

https://gerrit.wikimedia.org/r/395085

Change 395086 had a related patch set uploaded (by Eevans; owner: Eevans):
[operations/puppet@production] hieradata: enable restbase1014-c for bootstrap

https://gerrit.wikimedia.org/r/395086

Change 395085 merged by Filippo Giunchedi:
[operations/puppet@production] hieradata: enabled restbase1014-b for bootstrap

https://gerrit.wikimedia.org/r/395085

Change 395086 merged by Dzahn:
[operations/puppet@production] hieradata: enable restbase1014-c for bootstrap

https://gerrit.wikimedia.org/r/395086

Mentioned in SAL (#wikimedia-operations) [2017-12-05T15:03:49Z] <urandom> bootstrapping cassandra, restbase1014-c - T179422

Eevans claimed this task.

Some cleanups remain, but as we are set to begin a new round of bootstraps, it makes sense to discontinue this effort until they are done. Closing this issue as complete.