Important Dates:
- Services: Tuesday, 18 March 2025 @ 14:00 UTC
- Traffic: Tuesday, 18 March 2025 @ 14:00 UTC
- MediaWiki: Wednesday, 19 March 2025 @ 14:00 UTC
- Deployment server: Thursday, 20 March 2025
- codfw repool: Wednesday, 26 March 2025 @ 14:00 UTC
| Status | Assigned | Task |
|---|---|---|
| Resolved | hnowlan | T385155 🧭 Northward Datacentre Switchover (March 2025) |
| Resolved | hnowlan | T385157 SRE comms for March 2025 Datacentre switchover |
| Resolved | Trizek-WMF | T387444 MoveComms support for March 2025 Datacentre switchover |
| Resolved | hnowlan | T387509 Investigate burst of DBReadOnlyError during switchover test |
| Open | jasmine_ | T387753 Spicerack support for mw-cron in periodic_jobs functions |
| Resolved | Marostegui | T388626 Prepare databases circular replication for the DC switchover |
| Resolved | Marostegui | T388627 Disable circular replication after DC switchover |
hnowlan@cumin2002 - Cookbook cookbooks.sre.switchdc.mediawiki.00-disable-puppet for datacenter switchover from eqiad to codfw - finished with status: SUCCESS elapsed time: 0:00:02.492235
hnowlan@cumin2002 - Cookbook cookbooks.sre.switchdc.mediawiki.00-downtime-db-readonly-checks for datacenter switchover from eqiad to codfw - finished with status: SUCCESS elapsed time: 0:00:20.062482
hnowlan@cumin2002 - Cookbook cookbooks.sre.switchdc.mediawiki.00-optional-warmup-caches for datacenter switchover from eqiad to codfw - finished with status: FAILURE elapsed time: 0:00:12.516407
hnowlan@cumin2002 - Cookbook cookbooks.sre.switchdc.mediawiki.00-reduce-ttl for datacenter switchover from eqiad to codfw - finished with status: SUCCESS elapsed time: 0:05:45.662908
hnowlan@cumin2002 - Cookbook cookbooks.sre.switchdc.mediawiki.01-stop-maintenance for datacenter switchover from eqiad to codfw - finished with status: SUCCESS elapsed time: 0:00:15.506655
hnowlan@cumin2002 - Cookbook cookbooks.sre.switchdc.mediawiki.02-set-readonly for datacenter switchover from eqiad to codfw - [DRY-RUN] MediaWiki read-only period starts at: 2025-02-27 17:34:09.402528
hnowlan@cumin2002 - Cookbook cookbooks.sre.switchdc.mediawiki.02-set-readonly for datacenter switchover from eqiad to codfw - finished with status: SUCCESS elapsed time: 0:00:15.227370
hnowlan@cumin2002 - Cookbook cookbooks.sre.switchdc.mediawiki.03-set-db-readonly for datacenter switchover from eqiad to codfw - finished with status: SUCCESS elapsed time: 0:00:35.394616
hnowlan@cumin2002 - Cookbook cookbooks.sre.switchdc.mediawiki.04-switch-mediawiki for datacenter switchover from eqiad to codfw - finished with status: SUCCESS elapsed time: 0:00:21.056334
hnowlan@cumin2002 - Cookbook cookbooks.sre.switchdc.mediawiki.06-set-db-readwrite for datacenter switchover from eqiad to codfw - finished with status: SUCCESS elapsed time: 0:00:02.428644
hnowlan@cumin2002 - Cookbook cookbooks.sre.switchdc.mediawiki.07-set-readwrite for datacenter switchover from eqiad to codfw - [DRY-RUN] MediaWiki read-only period ends at: 2025-02-27 17:36:42.297422
hnowlan@cumin2002 - Cookbook cookbooks.sre.switchdc.mediawiki.07-set-readwrite for datacenter switchover from eqiad to codfw - finished with status: SUCCESS elapsed time: 0:00:05.756227
hnowlan@cumin2002 - Cookbook cookbooks.sre.switchdc.mediawiki.08-restart-mw-jobrunner for datacenter switchover from eqiad to codfw - finished with status: SUCCESS elapsed time: 0:00:32.887205
hnowlan@cumin2002 - Cookbook cookbooks.sre.switchdc.mediawiki.08-start-maintenance for datacenter switchover from eqiad to codfw - finished with status: SUCCESS elapsed time: 0:02:24.990807
hnowlan@cumin2002 - Cookbook cookbooks.sre.switchdc.mediawiki.09-restore-ttl for datacenter switchover from eqiad to codfw - finished with status: SUCCESS elapsed time: 0:00:40.452280
hnowlan@cumin2002 - Cookbook cookbooks.sre.switchdc.mediawiki.09-run-puppet-on-db-masters for datacenter switchover from eqiad to codfw - finished with status: SUCCESS elapsed time: 0:11:19.318929
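To review a run like the one above, the status lines can be parsed to pull out the step name, result, and elapsed time. This is a minimal sketch, assuming the lines keep exactly the format shown in this task. The regular expression and the `parse_status_line` helper are illustrative only and are not part of the cookbooks or Spicerack.

```python
import re
from datetime import timedelta

# Assumed line format, taken from the status lines logged in this task.
STATUS_RE = re.compile(
    r"Cookbook cookbooks\.(?P<step>\S+) for datacenter switchover "
    r"from (?P<src>\w+) to (?P<dst>\w+) - finished with status: "
    r"(?P<status>\w+) elapsed time: (?P<h>\d+):(?P<m>\d+):(?P<s>\d+\.\d+)"
)


def parse_status_line(line):
    """Return a dict describing one status line, or None if it does not match."""
    match = STATUS_RE.search(line)
    if match is None:
        return None
    return {
        "step": match["step"],
        "src": match["src"],
        "dst": match["dst"],
        "status": match["status"],
        "elapsed": timedelta(
            hours=int(match["h"]),
            minutes=int(match["m"]),
            seconds=float(match["s"]),
        ),
    }


# Sample line copied from the dry run above (the optional cache warmup step).
sample = (
    "hnowlan@cumin2002 - Cookbook cookbooks.sre.switchdc.mediawiki."
    "00-optional-warmup-caches for datacenter switchover from eqiad to codfw"
    " - finished with status: FAILURE elapsed time: 0:00:12.516407"
)
parsed = parse_status_line(sample)
if parsed is not None and parsed["status"] != "SUCCESS":
    print(f"{parsed['step']} finished with {parsed['status']}")
```

Lines that carry a read-only timestamp instead of a final status do not match the pattern and return `None`.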
Change #1126090 had a related patch set uploaded (by Hnowlan; author: Hnowlan):
[operations/cookbooks@master] switchdc: stop and restart crons as part of switchover process
Change #1127067 had a related patch set uploaded (by Hnowlan; author: Hnowlan):
[operations/dns@master] wmnet: update CNAME records for DB masters to eqiad
Change #1127069 had a related patch set uploaded (by Hnowlan; author: Hnowlan):
[operations/dns@master] geo-maps: update map default to list eqiad first
Change #1127068 had a related patch set uploaded (by Hnowlan; author: Hnowlan):
[operations/dns@master] wmnet: update CNAME record for maintenance host to eqiad
Change #1127072 had a related patch set uploaded (by Hnowlan; author: Hnowlan):
[operations/mediawiki-config@master] debug: reorder debug backends for eqiad switchover
Change #1127073 had a related patch set uploaded (by Hnowlan; author: Hnowlan):
[operations/dns@master] wmnet: point deploy server at eqiad
Change #1127074 had a related patch set uploaded (by Hnowlan; author: Hnowlan):
[operations/puppet@production] deployment: switch deploy servers to eqiad
Change #1126090 merged by jenkins-bot:
[operations/cookbooks@master] switchdc: stop and restart crons as part of switchover process
Change #1127859 had a related patch set uploaded (by Hnowlan; author: Hnowlan):
[operations/deployment-charts@master] mw-(web|api-ext): scale up in anticipation of switchover
hnowlan@cumin2002 - Cookbook cookbooks.sre.switchdc.mediawiki.01-stop-maintenance for datacenter switchover from eqiad to codfw - finished with status: FAILURE elapsed time: 0:00:15.600535
Change #1127878 had a related patch set uploaded (by Hnowlan; author: Hnowlan):
[operations/cookbooks@master] switchdc: delete Job objects for mw-cron due to library support
hnowlan@cumin2002 - Cookbook cookbooks.sre.switchdc.mediawiki.01-stop-maintenance for datacenter switchover from eqiad to codfw - finished with status: SUCCESS elapsed time: 0:00:15.654838
hnowlan@cumin2002 - Cookbook cookbooks.sre.switchdc.mediawiki.08-start-maintenance for datacenter switchover from eqiad to codfw - finished with status: SUCCESS elapsed time: 0:02:30.880468
Change #1127859 merged by jenkins-bot:
[operations/deployment-charts@master] mw-(web|api-ext): scale up in anticipation of switchover
Change #1127878 merged by jenkins-bot:
[operations/cookbooks@master] switchdc: delete Job objects for mw-cron due to library support
hnowlan@cumin2002 - Cookbook cookbooks.sre.discovery.datacenter depool all services in codfw: Datacenter Switchover - T385155 started.
Mentioned in SAL (#wikimedia-operations) [2025-03-18T15:05:01Z] <hnowlan@cumin2002> START - Cookbook sre.discovery.datacenter depool all services in codfw: Datacenter Switchover - T385155
Change #1128895 had a related patch set uploaded (by Hnowlan; author: Hnowlan):
[operations/cookbooks@master] switchdc: clarify inputs for moving active/passive services
hnowlan@cumin2002 - Cookbook cookbooks.sre.discovery.datacenter depool all services in codfw: Datacenter Switchover - T385155 completed.
Mentioned in SAL (#wikimedia-operations) [2025-03-18T15:34:44Z] <hnowlan@cumin2002> END (PASS) - Cookbook sre.discovery.datacenter (exit_code=0) depool all services in codfw: Datacenter Switchover - T385155
Mentioned in SAL (#wikimedia-operations) [2025-03-19T13:48:44Z] <hnowlan@deploy2002> Locking from deployment [ALL REPOSITORIES]: Datacenter Switchover - T385155
hnowlan@cumin2002 - Cookbook cookbooks.sre.switchdc.mediawiki.00-disable-puppet for datacenter switchover from codfw to eqiad - finished with status: SUCCESS elapsed time: 0:00:02.515982
hnowlan@cumin2002 - Cookbook cookbooks.sre.switchdc.mediawiki.00-downtime-db-readonly-checks for datacenter switchover from codfw to eqiad - finished with status: SUCCESS elapsed time: 0:00:18.881572
hnowlan@cumin2002 - Cookbook cookbooks.sre.switchdc.mediawiki.00-reduce-ttl for datacenter switchover from codfw to eqiad - finished with status: SUCCESS elapsed time: 0:05:49.912269
hnowlan@cumin2002 - Cookbook cookbooks.sre.switchdc.mediawiki.01-stop-maintenance for datacenter switchover from codfw to eqiad - finished with status: FAILURE elapsed time: 0:00:10.629439
hnowlan@cumin2002 - Cookbook cookbooks.sre.switchdc.mediawiki.01-stop-maintenance for datacenter switchover from codfw to eqiad - finished with status: SUCCESS elapsed time: 0:00:26.733239
hnowlan@cumin2002 - Cookbook cookbooks.sre.switchdc.mediawiki.02-set-readonly for datacenter switchover from codfw to eqiad - MediaWiki read-only period starts at: 2025-03-19 14:15:30.955779
hnowlan@cumin2002 - Cookbook cookbooks.sre.switchdc.mediawiki.02-set-readonly for datacenter switchover from codfw to eqiad - finished with status: SUCCESS elapsed time: 0:00:18.786734
hnowlan@cumin2002 - Cookbook cookbooks.sre.switchdc.mediawiki.03-set-db-readonly for datacenter switchover from codfw to eqiad - finished with status: SUCCESS elapsed time: 0:00:34.506271
hnowlan@cumin2002 - Cookbook cookbooks.sre.switchdc.mediawiki.04-switch-mediawiki for datacenter switchover from codfw to eqiad - finished with status: SUCCESS elapsed time: 0:00:49.388759
hnowlan@cumin2002 - Cookbook cookbooks.sre.switchdc.mediawiki.06-set-db-readwrite for datacenter switchover from codfw to eqiad - finished with status: SUCCESS elapsed time: 0:00:03.122358
hnowlan@cumin2002 - Cookbook cookbooks.sre.switchdc.mediawiki.07-set-readwrite for datacenter switchover from codfw to eqiad - MediaWiki read-only period ends at: 2025-03-19 14:17:55.451583
hnowlan@cumin2002 - Cookbook cookbooks.sre.switchdc.mediawiki.07-set-readwrite for datacenter switchover from codfw to eqiad - finished with status: SUCCESS elapsed time: 0:00:12.437502
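For reference, the length of the live read-only window follows from the two timestamps logged by 02-set-readonly and 07-set-readwrite above. This is a minimal sketch, assuming both timestamps are in the same timezone and copied exactly as logged. The variable names are illustrative and not part of the cookbooks.

```python
from datetime import datetime

# Timestamps copied from the 02-set-readonly and 07-set-readwrite log lines.
read_only_start = datetime.fromisoformat("2025-03-19 14:15:30.955779")
read_only_end = datetime.fromisoformat("2025-03-19 14:17:55.451583")

# Subtracting two datetimes yields a timedelta.
read_only_window = read_only_end - read_only_start
print(read_only_window)  # 0:02:24.495804
```

The same calculation on the dry-run timestamps from 2025-02-27 gives 0:02:32.894894.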
hnowlan@cumin2002 - Cookbook cookbooks.sre.switchdc.mediawiki.08-restart-mw-jobrunner for datacenter switchover from codfw to eqiad - finished with status: SUCCESS elapsed time: 0:00:30.255885
hnowlan@cumin2002 - Cookbook cookbooks.sre.switchdc.mediawiki.08-start-maintenance for datacenter switchover from codfw to eqiad - finished with status: SUCCESS elapsed time: 0:02:39.089447
hnowlan@cumin2002 - Cookbook cookbooks.sre.switchdc.mediawiki.09-restore-ttl for datacenter switchover from codfw to eqiad - finished with status: SUCCESS elapsed time: 0:00:39.885710
Change #1127067 merged by Hnowlan:
[operations/dns@master] wmnet: update CNAME records for DB masters to eqiad
hnowlan@cumin2002 - Cookbook cookbooks.sre.switchdc.mediawiki.09-run-puppet-on-db-masters for datacenter switchover from codfw to eqiad - finished with status: SUCCESS elapsed time: 0:10:38.157629
Mentioned in SAL (#wikimedia-operations) [2025-03-19T14:41:24Z] <hnowlan@deploy2002> Unlocked for deployment [ALL REPOSITORIES]: Datacenter Switchover - T385155 (duration: 52m 40s)
Change #1127068 merged by Hnowlan:
[operations/dns@master] wmnet: update CNAME record for maintenance host to eqiad
Change #1127069 merged by Hnowlan:
[operations/dns@master] geo-maps: update map default to list eqiad first
Change #1127072 merged by jenkins-bot:
[operations/mediawiki-config@master] debug: reorder debug backends for eqiad switchover
Mentioned in SAL (#wikimedia-operations) [2025-03-19T15:18:04Z] <hnowlan@deploy2002> Started scap sync-world: Backport for [[gerrit:1127072|debug: reorder debug backends for eqiad switchover (T385155)]]
Mentioned in SAL (#wikimedia-operations) [2025-03-19T15:23:33Z] <hnowlan@deploy2002> hnowlan: Backport for [[gerrit:1127072|debug: reorder debug backends for eqiad switchover (T385155)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
Change #1129296 had a related patch set uploaded (by Hnowlan; author: Hnowlan):
[operations/mediawiki-config@master] debug: fix config syntax
Change #1129296 merged by jenkins-bot:
[operations/mediawiki-config@master] debug: fix config syntax
Mentioned in SAL (#wikimedia-operations) [2025-03-19T15:36:28Z] <hnowlan@deploy2002> Started scap sync-world: Backport for [[gerrit:1129296|debug: fix config syntax (T385155)]]
Mentioned in SAL (#wikimedia-operations) [2025-03-19T15:41:33Z] <hnowlan@deploy2002> hnowlan: Backport for [[gerrit:1129296|debug: fix config syntax (T385155)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)
Mentioned in SAL (#wikimedia-operations) [2025-03-19T15:53:40Z] <hnowlan@deploy2002> Finished scap sync-world: Backport for [[gerrit:1129296|debug: fix config syntax (T385155)]] (duration: 17m 11s)
Change #1129945 had a related patch set uploaded (by Jasmine; author: Jasmine):
[operations/dns@master] wmnet: update deployment CNAME record to deploy1003
Mentioned in SAL (#wikimedia-operations) [2025-03-20T21:49:45Z] <kamila@deploy2002> Locking from deployment [MediaWiki]: deployment server switch -- T385155
Change #1129945 abandoned by Jasmine:
[operations/dns@master] wmnet: update deployment CNAME record to deploy1003
Reason:
Change already created
Change #1127073 merged by Kamila Součková:
[operations/dns@master] wmnet: point deploy server at eqiad
Change #1129952 had a related patch set uploaded (by Kamila Součková; author: Kamila Součková):
[operations/puppet@production] hieradata: update deployment_server to deploy1003
Change #1129952 merged by Kamila Součková:
[operations/puppet@production] hieradata: update deployment_server to deploy1003
Mentioned in SAL (#wikimedia-operations) [2025-03-20T22:58:16Z] <kamila@deploy2002> Unlocked for deployment [MediaWiki]: deployment server switch -- T385155 (duration: 68m 30s)
Mentioned in SAL (#wikimedia-operations) [2025-03-20T23:10:53Z] <kamila@deploy1003> Started scap sync-world: Test deployment to validate deployment server switchover - T385155
Mentioned in SAL (#wikimedia-operations) [2025-03-20T23:30:36Z] <kamila@deploy1003> Finished scap sync-world: Test deployment to validate deployment server switchover - T385155 (duration: 19m 42s)
Mentioned in SAL (#wikimedia-operations) [2025-03-26T14:06:24Z] <hnowlan@cumin1002> START - Cookbook sre.dns.admin DNS admin: pool site codfw [reason: Datacentre switchover repool, T385155]
Mentioned in SAL (#wikimedia-operations) [2025-03-26T14:06:40Z] <hnowlan@cumin1002> END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: pool site codfw [reason: Datacentre switchover repool, T385155]
hnowlan@cumin1002 - Cookbook cookbooks.sre.discovery.datacenter pool all active/active services in codfw: Datacentre switchover repool - T385155 started.
Mentioned in SAL (#wikimedia-operations) [2025-03-26T14:08:12Z] <hnowlan@cumin1002> START - Cookbook sre.discovery.datacenter pool all active/active services in codfw: Datacentre switchover repool - T385155
hnowlan@cumin1002 - Cookbook cookbooks.sre.discovery.datacenter pool all active/active services in codfw: Datacentre switchover repool - T385155 completed.
Mentioned in SAL (#wikimedia-operations) [2025-03-26T14:30:22Z] <hnowlan@cumin1002> END (PASS) - Cookbook sre.discovery.datacenter (exit_code=0) pool all active/active services in codfw: Datacentre switchover repool - T385155
Change #1128895 merged by jenkins-bot:
[operations/cookbooks@master] switchdc: clarify inputs for moving active/passive services
Change #1127074 abandoned by Hnowlan:
[operations/puppet@production] deployment: switch deploy servers to eqiad
Reason:
Done in another patch