Migrate remaining Restbase servers to Stretch
Closed, Resolved (Public)

Description

When the current round of decommissions is done, there will be four servers in codfw and two in eqiad left running Debian jessie. We should migrate those for consistency. It would also allow us to stop maintaining our custom openjdk-8 backports for jessie, which were originally done for improved Cassandra/GC performance with Java 8 compared to 7.

  • restbase2009.codfw.wmnet
  • restbase2010.codfw.wmnet
  • restbase2011.codfw.wmnet
  • restbase2012.codfw.wmnet

Event Timeline

ArielGlenn triaged this task as Medium priority. Jun 11 2019, 7:59 AM

T208087, T223976 and T222960 are fixed. Could we get restbase2009-restbase2012, restbase1018 (and the two remaining -dev servers) migrated to Stretch in the next 1-2 months?

cassandra/restbase is the only remaining use case for our custom OpenJDK 8 backports in jessie-wikimedia, and it would be fantastic not to spend more time on this when the October Java security release comes out.

Assuming we're adhering to the process of decommissioning, re-imaging, and bootstrapping, then I suspect the only blocker to doing so would be SRE resources. :)

After decommissioning the instances, I need someone to do the re-image to Stretch. Once the first successful Puppet run has occurred, I can take over again until the next host is ready for a re-image (lather, rinse, repeat, etc). @MoritzMuehlenhoff, I currently have some time to work on this. If you (or someone else on the SRE side) do too, I'm happy to get started; otherwise we can run it up the chain to have it scheduled.
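
The per-host cycle described above can be sketched as an ordered plan. This is only an illustration, assuming three Cassandra instances per host (a, b, c, as the log entries on this task suggest) and a repool step at the end of each host; the function names `migration_steps` and `plan` are hypothetical, and the step strings are labels, not real commands.

```python
from typing import Iterator, List, Sequence


def migration_steps(host: str, instances: Sequence[str] = ("a", "b", "c")) -> Iterator[str]:
    """Yield the ordered steps for moving one multi-instance host to Stretch.

    The step names are descriptive labels only (hypothetical), not CLI commands.
    """
    # 1. Decommission every Cassandra instance on the host, one at a time.
    for inst in instances:
        yield f"decommission {host}-{inst}"
    # 2. Hand over to SRE for the re-image and the first Puppet run.
    yield f"reimage {host}"
    # 3. Bootstrap the instances back into the cluster, again one at a time.
    for inst in instances:
        yield f"bootstrap {host}-{inst}"
    # 4. Return the host to service once the last bootstrap has finished.
    yield f"repool {host}"


def plan(hosts: Sequence[str]) -> List[str]:
    """Concatenate the per-host steps; each host finishes before the next starts."""
    steps: List[str] = []
    for host in hosts:
        steps.extend(migration_steps(host))
    return steps


if __name__ == "__main__":
    for step in plan(["restbase1018", "restbase2009"]):
        print(step)
```

Handling one host at a time, and one instance at a time within a host, keeps the number of nodes leaving or joining the cluster at any moment to a minimum; that ordering is the only property this sketch tries to capture.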

Mentioned in SAL (#wikimedia-operations) [2019-09-11T16:39:28Z] <urandom> decommissioning Cassandra, restbase1018-a -- T224553

Mentioned in SAL (#wikimedia-operations) [2019-09-11T18:43:47Z] <urandom> decommissioning Cassandra, restbase1018-b -- T224553

Mentioned in SAL (#wikimedia-operations) [2019-09-11T22:30:26Z] <urandom> decommissioning Cassandra, restbase1018-c -- T224553

restbase1018 is decommissioned and ready to be reimaged.

Change 536201 had a related patch set uploaded (by Muehlenhoff; owner: Muehlenhoff):
[operations/puppet@production] Add missing JBOD config for restbase1018

https://gerrit.wikimedia.org/r/536201

Change 536201 merged by Muehlenhoff:
[operations/puppet@production] Add missing JBOD config for restbase1018

https://gerrit.wikimedia.org/r/536201

restbase1018 is reimaged and ready for Cassandra bootstrap.

Mentioned in SAL (#wikimedia-operations) [2019-09-12T16:09:48Z] <urandom> bootstrapping Cassandra, restbase1018-a -- T224553

Mentioned in SAL (#wikimedia-operations) [2019-09-12T17:46:45Z] <urandom> bootstrapping Cassandra, restbase1018-b -- T224553

Mentioned in SAL (#wikimedia-operations) [2019-09-12T19:34:13Z] <urandom> bootstrapping Cassandra, restbase1018-c -- T224553

Mentioned in SAL (#wikimedia-operations) [2019-09-12T21:00:02Z] <urandom> decommissioning Cassandra, restbase2009 -- T224553

Change 536367 had a related patch set uploaded (by Eevans; owner: Eevans):
[operations/puppet@production] hieradata: specify restbase2009 jbod devices

https://gerrit.wikimedia.org/r/536367

Mentioned in SAL (#wikimedia-operations) [2019-09-12T23:21:16Z] <urandom> decommissioning Cassandra, restbase2009-b -- T224553

restbase2009 is fully decommissioned and ready to be reimaged.

Mentioned in SAL (#wikimedia-operations) [2019-09-13T10:41:52Z] <moritzm> reimage restbase2009 to stretch T224553

Change 536367 merged by Muehlenhoff:
[operations/puppet@production] hieradata: specify restbase2009 jbod devices

https://gerrit.wikimedia.org/r/536367

restbase2009 has been reimaged and is ready to be bootstrapped in Cassandra.

Mentioned in SAL (#wikimedia-operations) [2019-09-13T15:47:28Z] <urandom> bootstrapping Cassandra, restbase2009-a -- T224553

Mentioned in SAL (#wikimedia-operations) [2019-09-13T17:24:45Z] <urandom> bootstrapping Cassandra, restbase2009-b -- T224553

Mentioned in SAL (#wikimedia-operations) [2019-09-13T19:54:02Z] <urandom> bootstrapping Cassandra, restbase2009-c -- T224553

Mentioned in SAL (#wikimedia-operations) [2019-09-16T12:58:52Z] <urandom> decommissioning Cassandra, restbase2010-a -- T224553

Mentioned in SAL (#wikimedia-operations) [2019-09-16T14:54:26Z] <urandom> decommissioning Cassandra, restbase2010-b -- T224553

Mentioned in SAL (#wikimedia-operations) [2019-09-16T17:42:56Z] <urandom> decommissioning Cassandra, restbase2010-c -- T224553

restbase2010 is ready to be reimaged.

Mentioned in SAL (#wikimedia-operations) [2019-09-17T06:49:33Z] <moritzm> reimage restbase2010 to Stretch T224553

restbase2010 has been reimaged and is ready to be bootstrapped in Cassandra.

Mentioned in SAL (#wikimedia-operations) [2019-09-17T09:16:25Z] <mobrovac> bootstrap restbase2010-a - T224553

Mentioned in SAL (#wikimedia-operations) [2019-09-17T10:58:24Z] <mobrovac> bootstrap restbase2010-b - T224553

Mentioned in SAL (#wikimedia-operations) [2019-09-17T12:30:41Z] <mobrovac> bootstrap restbase2010-c - T224553

Mentioned in SAL (#wikimedia-operations) [2019-09-17T15:38:05Z] <urandom> decommissioning Cassandra, restbase2011-a -- T224553

Mentioned in SAL (#wikimedia-operations) [2019-09-17T17:09:05Z] <urandom> decommissioning Cassandra, restbase2011-b -- T224553

Mentioned in SAL (#wikimedia-operations) [2019-09-17T19:05:56Z] <urandom> decommissioning Cassandra, restbase2011-c -- T224553

restbase2011 is fully decommissioned and ready to be reimaged.

Mentioned in SAL (#wikimedia-operations) [2019-09-18T06:43:51Z] <moritzm> reimaging restbase2011 to stretch T224553

Mentioned in SAL (#wikimedia-operations) [2019-09-18T08:22:11Z] <mobrovac> bootstrap restbase2011-a -- T224553

Mentioned in SAL (#wikimedia-operations) [2019-09-18T10:00:39Z] <mobrovac> bootstrap restbase2011-b -- T224553

Mentioned in SAL (#wikimedia-operations) [2019-09-18T11:29:57Z] <mobrovac> bootstrap restbase2011-c -- T224553

Mentioned in SAL (#wikimedia-operations) [2019-09-18T15:53:08Z] <urandom> decommissioning Cassandra, restbase2012-a -- T224553

Mentioned in SAL (#wikimedia-operations) [2019-09-18T17:50:51Z] <urandom> decommissioning Cassandra, restbase2012-b -- T224553

Mentioned in SAL (#wikimedia-operations) [2019-09-18T19:57:06Z] <urandom> decommissioning Cassandra, restbase2012-c -- T224553

restbase2012 is decommissioned and can be reimaged at any time.

Mentioned in SAL (#wikimedia-operations) [2019-09-19T07:01:39Z] <moritzm> reimaging restbase2012 to stretch T224553

Mentioned in SAL (#wikimedia-operations) [2019-09-19T11:48:29Z] <mobrovac> bootstrap restbase2012-a -- T224553

restbase2012 has been reimaged and is ready to be bootstrapped in Cassandra. All jessie Cassandra instances gone!

Mentioned in SAL (#wikimedia-operations) [2019-09-19T12:39:44Z] <mobrovac@deploy1001> Started deploy [restbase/deploy@7f4b7f7]: Start using RESTBase built on Stretch - T224553

Mentioned in SAL (#wikimedia-operations) [2019-09-19T13:01:22Z] <mobrovac@deploy1001> Finished deploy [restbase/deploy@7f4b7f7]: Start using RESTBase built on Stretch - T224553 (duration: 21m 38s)

Mentioned in SAL (#wikimedia-operations) [2019-09-19T13:12:48Z] <mobrovac> bootstrap restbase2012-b -- T224553

Mentioned in SAL (#wikimedia-operations) [2019-09-19T14:31:02Z] <mobrovac> bootstrap restbase2012-c -- T224553

mobrovac claimed this task.

This has now been completed. Thank you @MoritzMuehlenhoff for assisting and promptly re-imaging the servers!

Mentioned in SAL (#wikimedia-operations) [2019-09-19T15:26:01Z] <moritzm> repooling restbase2012 after completed Cassandra bootstrap T224553