Page MenuHomePhabricator

March 2023 Datacenter Switchover live test
Closed, ResolvedPublic

Description

Meta-task for live-test tracking and possible work to do afterwards.

Event Timeline

Clement_Goubert changed the task status from Open to In Progress.Feb 22 2023, 10:33 AM
Clement_Goubert triaged this task as High priority.
Clement_Goubert created this task.
Clement_Goubert moved this task from Incoming 🐫 to Doing 😎 on the serviceops board.
10:35 <+logmsgbot> !log cgoubert@cumin1001 START - Cookbook sre.switchdc.mediawiki.00-reduce-ttl
10:35 <+logmsgbot> !log cgoubert@cumin1001 END (PASS) - Cookbook sre.switchdc.mediawiki.00-reduce-ttl (exit_code=0)

Skipping 00-optional-warmup-caches as the node script is broken and the replacement python script hasn't been reviewed yet.

11:01 <+logmsgbot> !log cgoubert@cumin1001 START - Cookbook sre.switchdc.mediawiki.00-disable-puppet
11:01 <+logmsgbot> !log cgoubert@cumin1001 END (PASS) - Cookbook sre.switchdc.mediawiki.00-disable-puppet (exit_code=0)
11:01 <+logmsgbot> !log cgoubert@cumin1001 START - Cookbook sre.switchdc.mediawiki.00-downtime-db-readonly-checks
11:01 <+logmsgbot> !log cgoubert@cumin1001 END (PASS) - Cookbook sre.switchdc.mediawiki.00-downtime-db-readonly-checks (exit_code=0)
11:02 <+logmsgbot> !log cgoubert@cumin1001 START - Cookbook sre.switchdc.mediawiki.01-stop-maintenance
11:02 <+logmsgbot> !log cgoubert@cumin1001 END (PASS) - Cookbook sre.switchdc.mediawiki.01-stop-maintenance (exit_code=0)
11:03 <+logmsgbot> !log cgoubert@cumin1001 START - Cookbook sre.switchdc.mediawiki.02-set-readonly
11:03 <+logmsgbot> !log cgoubert@cumin1001 [DRY-RUN] MediaWiki read-only period starts at: 2023-02-22 11:03:19.149671
11:03 <+logmsgbot> !log cgoubert@cumin1001 END (PASS) - Cookbook sre.switchdc.mediawiki.02-set-readonly (exit_code=0)
11:03 <+logmsgbot> !log cgoubert@cumin1001 START - Cookbook sre.switchdc.mediawiki.03-set-db-readonly
11:04 <+logmsgbot> !log cgoubert@cumin1001 END (FAIL) - Cookbook sre.switchdc.mediawiki.03-set-db-readonly (exit_code=99)
spicerack.mysql_legacy.MysqlLegacyError: Unable to get heartbeat from master db1118.eqiad.wmnet for section s1
11:13 <+logmsgbot> !log cgoubert@cumin1001 START - Cookbook sre.switchdc.mediawiki.04-switch-mediawiki
11:13 <+logmsgbot> !log cgoubert@cumin1001 END (PASS) - Cookbook sre.switchdc.mediawiki.04-switch-mediawiki (exit_code=0)
11:13 <+logmsgbot> !log cgoubert@cumin1001 START - Cookbook sre.switchdc.mediawiki.06-set-db-readwrite
11:13 <+logmsgbot> !log cgoubert@cumin1001 END (PASS) - Cookbook sre.switchdc.mediawiki.06-set-db-readwrite (exit_code=0)
11:13 <+logmsgbot> !log cgoubert@cumin1001 START - Cookbook sre.switchdc.mediawiki.07-set-readwrite
11:13 <+logmsgbot> !log cgoubert@cumin1001 [DRY-RUN] MediaWiki read-only period ends at: 2023-02-22 11:13:51.466468
11:13 <+logmsgbot> !log cgoubert@cumin1001 END (PASS) - Cookbook sre.switchdc.mediawiki.07-set-readwrite (exit_code=0)
11:14 <+logmsgbot> !log cgoubert@cumin1001 START - Cookbook sre.switchdc.mediawiki.08-restart-envoy-on-jobrunners
11:14 <+logmsgbot> !log cgoubert@cumin1001 END (PASS) - Cookbook sre.switchdc.mediawiki.08-restart-envoy-on-jobrunners (exit_code=0)
11:14 <+logmsgbot> !log cgoubert@cumin1001 START - Cookbook sre.switchdc.mediawiki.08-start-maintenance
11:16:26     +logmsgbot │ !log cgoubert@cumin1001 END (PASS) - Cookbook sre.switchdc.mediawiki.08-start-maintenance (exit_code=0)
11:18 <+logmsgbot> !log cgoubert@cumin1001 START - Cookbook sre.switchdc.mediawiki.09-run-puppet-on-db-masters
11:24 <+logmsgbot> !log cgoubert@cumin1001 END (PASS) - Cookbook sre.switchdc.mediawiki.09-run-puppet-on-db-masters (exit_code=0)

All code paths exercised and fixes applied and tested. Resolving.