
Upgrade Cassandra 3.11.2 clusters to 3.11.4 (bugfix release)
Open, Medium, Public

Description

Cassandra 3.11.3 has been released; we should evaluate whether it makes sense to upgrade our environment(s).

Cassandra 3.11.4 has been released; we should evaluate whether it makes sense to upgrade our environment(s).


3.11.2 clusters

  • RESTBase
  • Session storage
  • RESTBase-dev
  • RESTBase deployment-prep
  • Session storage deployment-prep

Upgrade sequence

IMPORTANT: Read this if you came here trying to understand why Puppet has been disabled!
  1. Disable Puppet to prevent package pinning from downgrading the packages
  2. Upgrade packages
  3. Perform rolling restarts
  4. Upload 3.11.4 packages to APT repository
  5. Update Puppet to pin 3.11.4 (instead of 3.11.2)
  6. Re-enable Puppet on all hosts
  7. Apply Gerrit with non-normative configuration changes
  8. Perform rolling restarts
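
As a rough illustration of the ordering above, the sequence can be sketched as a small driver that walks the eight steps in order. This is only a sketch: the step labels and the `run_step` callback are hypothetical placeholders, not the tooling actually used for this upgrade. The `puppet_is_safe` check encodes the one constraint the list states explicitly, namely that Puppet stays disabled while packages change and is re-enabled only after the pin has been updated.

```python
from typing import Callable, List

# The eight steps from the list above, in order. These identifiers are
# hypothetical labels for illustration; they are not real commands.
UPGRADE_SEQUENCE: List[str] = [
    "disable_puppet",
    "upgrade_packages",
    "rolling_restart",
    "upload_packages_to_apt",
    "update_puppet_pin",
    "enable_puppet",
    "apply_config_changes",
    "rolling_restart",
]


def run_sequence(run_step: Callable[[str], bool]) -> List[str]:
    """Run each step in order, stopping at the first failure.

    `run_step` is a caller-supplied callback (an assumption of this sketch)
    that performs one step and returns True on success.
    Returns the list of steps that completed.
    """
    completed: List[str] = []
    for step in UPGRADE_SEQUENCE:
        if not run_step(step):
            break
        completed.append(step)
    return completed


def puppet_is_safe(sequence: List[str]) -> bool:
    """Check that Puppet is disabled before packages are upgraded and is
    re-enabled only after the pin has been updated."""
    return (
        sequence.index("disable_puppet") < sequence.index("upgrade_packages")
        and sequence.index("update_puppet_pin") < sequence.index("enable_puppet")
    )
```

Passing a callback that only logs each step name gives a dry run of the plan without touching any host.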

Details

Related Gerrit Patches:
operations/puppet : production : cassandra config updates for 3.11.4 upgrade
operations/puppet : production : Stop pinning the cassandra version
operations/puppet : production : cassandra: Pin Cassandra packages to version 3.11.4

Event Timeline

Eevans created this task. Jul 31 2018, 2:50 PM
Restricted Application added a subscriber: Aklapper. Jul 31 2018, 2:50 PM
Eevans triaged this task as Medium priority. Jul 31 2018, 2:50 PM
Eevans moved this task from Backlog to Next on the User-Eevans board.
Eevans updated the task description. Aug 2 2018, 7:01 PM
Eevans renamed this task from Test/evaluate Cassandra 3.11.3 for production upgrade to Test/evaluate Cassandra 3.11.4 for production upgrade. Mar 19 2019, 3:47 PM
Eevans updated the task description.

Change 540948 had a related patch set uploaded (by Eevans; owner: Eevans):
[operations/puppet@production] [WIP]: cassandra config updates for 3.11.4 upgrade

https://gerrit.wikimedia.org/r/540948

Mentioned in SAL (#wikimedia-operations) [2019-10-09T18:45:50Z] <urandom> Upgrade restbase-dev1004-{a,b} to Cassandra 3.11.4 -- T200803

Mentioned in SAL (#wikimedia-operations) [2019-10-09T18:51:46Z] <urandom> Upgrade restbase-dev1005-{a,b} to Cassandra 3.11.4 -- T200803

Mentioned in SAL (#wikimedia-operations) [2019-10-09T19:09:17Z] <urandom> Upgrade restbase-dev1006-{a,b} to Cassandra 3.11.4 -- T200803

Eevans updated the task description. Oct 9 2019, 7:42 PM

Mentioned in SAL (#wikimedia-releng) [2019-10-09T19:46:54Z] <urandom> Upgrading deployment-restbase01.deployment-prep.eqiad.wmflabs to Cassandra 3.11.4 -- T200803

Mentioned in SAL (#wikimedia-releng) [2019-10-09T19:49:24Z] <urandom> Upgrading deployment-restbase02.deployment-prep.eqiad.wmflabs to Cassandra 3.11.4 -- T200803

Eevans updated the task description. Oct 9 2019, 7:49 PM

Mentioned in SAL (#wikimedia-releng) [2019-10-09T19:55:56Z] <urandom> Upgrading deployment-sessionstore01.deployment-prep.eqiad.wmflabs to Cassandra 3.11.4 -- T200803

Eevans updated the task description. Oct 9 2019, 7:58 PM
Eevans renamed this task from Test/evaluate Cassandra 3.11.4 for production upgrade to Upgrade Cassandra 3.11.2 clusters to 3.11.4 (bugfix release). Oct 9 2019, 8:01 PM
Eevans updated the task description.

Applicable staging/test environments have been updated to 3.11.4 and no issues are apparent. If this continues to look OK, and there are no objections, I will upgrade the production sessionstore cluster tomorrow (technically, it is not yet in production).

/cc @Pchelolo @mobrovac

Mentioned in SAL (#wikimedia-operations) [2019-10-10T16:01:18Z] <urandom> Upgrading sessionstore1001.eqiad.wmnet to Cassandra 3.11.4 -- T200803

Mentioned in SAL (#wikimedia-operations) [2019-10-10T16:16:47Z] <urandom> Upgrading sessionstore1002.eqiad.wmnet to Cassandra 3.11.4 -- T200803

Mentioned in SAL (#wikimedia-operations) [2019-10-10T16:18:09Z] <urandom> Upgrading sessionstore1003.eqiad.wmnet to Cassandra 3.11.4 -- T200803

Mentioned in SAL (#wikimedia-operations) [2019-10-10T16:21:28Z] <urandom> Upgrading sessionstore200[1-3].codfw.wmnet to Cassandra 3.11.4 -- T200803

Eevans updated the task description. Oct 10 2019, 4:24 PM

The session storage cluster has been upgraded, and things look good. If I hear no objections, I will upgrade a canary node in each datacenter of the RESTBase cluster on Monday, and plan to upgrade the remaining nodes on Tuesday if everything checks out.

> If I hear no objections, I will upgrade a canary node in each datacenter of the RESTBase cluster on Monday, and plan to upgrade the remaining nodes on Tuesday if everything checks out.

That would be restbase1016 & restbase2011 specifically, and (due to the WMF holiday), s/Tuesday/Wednesday/ & s/Monday/Tuesday/
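
A minimal sketch of that canary-first plan, assuming a hypothetical mapping of host name to (datacenter, rack). The function name `plan_batches` and any rack assignments passed to it are illustrative only and do not reflect the real cluster layout or the tooling used here.

```python
from typing import Dict, Iterable, List, Tuple


def plan_batches(
    hosts: Dict[str, Tuple[str, str]],
    canaries: Iterable[str],
) -> List[List[str]]:
    """Return upgrade batches: each canary alone first, then the
    remaining hosts grouped by (datacenter, rack).

    `hosts` maps a host name to its (datacenter, rack) pair.
    """
    # Keep only canaries that are actually part of the cluster mapping.
    canary_list = [h for h in canaries if h in hosts]
    batches: List[List[str]] = [[h] for h in canary_list]

    # Group the non-canary hosts by (datacenter, rack); iterate in sorted
    # order so the resulting plan is deterministic.
    groups: Dict[Tuple[str, str], List[str]] = {}
    for host in sorted(hosts):
        if host in canary_list:
            continue
        groups.setdefault(hosts[host], []).append(host)

    # One batch per rack, ordered by datacenter name and then rack name.
    for key in sorted(groups):
        batches.append(groups[key])
    return batches
```

Called with `canaries=["restbase1016", "restbase2011"]`, the first two batches each contain a single canary, which leaves time to check each one before any rack-wide batch starts.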

Eevans moved this task from Next to In-Progress on the User-Eevans board. Oct 15 2019, 7:30 PM

Mentioned in SAL (#wikimedia-operations) [2019-10-15T19:42:26Z] <urandom> upgrade restbase1016-a to cassandra 3.11.-4 -- T200803

Mentioned in SAL (#wikimedia-operations) [2019-10-15T19:48:01Z] <urandom> upgrade restbase1016-b to cassandra 3.11.-4 -- T200803

Mentioned in SAL (#wikimedia-operations) [2019-10-15T19:52:11Z] <urandom> upgrade restbase1016-c to cassandra 3.11.-4 -- T200803

Mentioned in SAL (#wikimedia-operations) [2019-10-15T19:55:04Z] <urandom> upgrade restbase2011-{a,b,c} to cassandra 3.11.-4 -- T200803

Eevans updated the task description. Oct 16 2019, 4:01 PM

Mentioned in SAL (#wikimedia-operations) [2019-10-16T16:33:20Z] <urandom> upgrading Cassandra to 3.11.4, eqiad, rack a -- T200803

Change 543494 had a related patch set uploaded (by Eevans; owner: Eevans):
[operations/puppet@production] cassandra: Pin Cassandra packages to version 3.11.4

https://gerrit.wikimedia.org/r/543494

Eevans updated the task description. Oct 16 2019, 4:45 PM

Mentioned in SAL (#wikimedia-operations) [2019-10-16T18:06:28Z] <urandom> upgrading Cassandra to 3.11.4, eqiad, rack b -- T200803

Mentioned in SAL (#wikimedia-operations) [2019-10-16T18:28:57Z] <urandom> upgrading Cassandra to 3.11.4, eqiad, rack d -- T200803

Mentioned in SAL (#wikimedia-operations) [2019-10-16T18:46:18Z] <urandom> upgrading Cassandra to 3.11.4, codfw, rack b -- T200803

Mentioned in SAL (#wikimedia-operations) [2019-10-16T19:35:13Z] <urandom> upgrading Cassandra to 3.11.4, codfw, rack c -- T200803

Mentioned in SAL (#wikimedia-operations) [2019-10-16T20:00:20Z] <urandom> upgrading Cassandra to 3.11.4, codfw, rack d -- T200803

Eevans updated the task description. Oct 16 2019, 8:18 PM
Eevans updated the task description.
Eevans updated the task description.

Change 543494 merged by Giuseppe Lavagetto:
[operations/puppet@production] cassandra: Pin Cassandra packages to version 3.11.4

https://gerrit.wikimedia.org/r/543494

Eevans updated the task description. Oct 18 2019, 3:53 PM

Change 544966 had a related patch set uploaded (by Alexandros Kosiaris; owner: Alexandros Kosiaris):
[operations/puppet@production] Stop pinning the cassandra version

https://gerrit.wikimedia.org/r/544966

Change 540948 merged by CDanis:
[operations/puppet@production] cassandra config updates for 3.11.4 upgrade

https://gerrit.wikimedia.org/r/540948

Mentioned in SAL (#wikimedia-operations) [2019-10-24T16:32:57Z] <urandom> restarting cassandra, restbase1016 (canary for config changes) -- T200803

Eevans updated the task description. Oct 24 2019, 4:37 PM

Mentioned in SAL (#wikimedia-operations) [2019-10-24T16:39:12Z] <urandom> restarting cassandra, restbase2011 (canary for config changes) -- T200803

Mentioned in SAL (#wikimedia-operations) [2019-10-24T18:25:21Z] <urandom> restbase cassandra rolling restart, rack 'a' -- T200803

Mentioned in SAL (#wikimedia-operations) [2019-10-24T18:46:08Z] <urandom> restbase cassandra rolling restart, rack 'b' -- T200803

Mentioned in SAL (#wikimedia-operations) [2019-10-24T19:05:59Z] <urandom> restbase cassandra rolling restart, rack 'd' -- T200803

Mentioned in SAL (#wikimedia-operations) [2019-10-24T19:31:25Z] <urandom> restbase cassandra rolling restart, codfw / rack 'b' -- T200803

Mentioned in SAL (#wikimedia-operations) [2019-10-24T20:33:50Z] <urandom> restbase cassandra rolling restart, codfw / rack 'c' -- T200803

Mentioned in SAL (#wikimedia-operations) [2019-10-24T21:05:46Z] <urandom> restbase cassandra rolling restart, codfw / rack 'd' -- T200803

Eevans updated the task description. Oct 25 2019, 12:34 AM
Eevans moved this task from Doing to Done on the Core Platform Team Workboards (Green) board.

This is now complete.