
Switchover s2 primary database master db1066 -> db1122 - 17th Sept @05:00 UTC
Closed, Resolved · Public

Description

db1066 is on a6, which will be affected by the PDU maintenance (T227142: a6-eqiad pdu refresh (Tuesday 10/22 @11am UTC)).
We need to fail over from db1066 to db1122, which is on D6.

db1066 is also an old master that will be decommissioned (T217396: Decommission db1061-db1073).

Date&Time: 17th September at 05:00 UTC

A read-only window will be required.

Details

Related Gerrit Patches:
operations/dns (master): wmnet: Change s2 CNAME to db1122
operations/puppet (production): mariadb: Promote db1122 as s2 primary master
operations/mediawiki-config (master): db-eqiad.php: Clarify that db1122 is the candidate master
operations/puppet (production): db1122: Change binlog format to STATEMENT

Event Timeline

Restricted Application added a subscriber: Aklapper. · Aug 20 2019, 10:21 AM
Marostegui triaged this task as Normal priority. · Aug 20 2019, 10:21 AM
Marostegui moved this task from Triage to Next on the DBA board.

Change 531335 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/puppet@production] db1122: Change binlog format to STATEMENT

https://gerrit.wikimedia.org/r/531335

Change 531335 merged by Marostegui:
[operations/puppet@production] db1122: Change binlog format to STATEMENT

https://gerrit.wikimedia.org/r/531335

Change 531336 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/mediawiki-config@master] db-eqiad.php: Clarify that db1122 is the candidate master

https://gerrit.wikimedia.org/r/531336

Change 531336 merged by jenkins-bot:
[operations/mediawiki-config@master] db-eqiad.php: Clarify that db1122 is the candidate master

https://gerrit.wikimedia.org/r/531336

Mentioned in SAL (#wikimedia-operations) [2019-08-21T05:04:31Z] <marostegui@deploy1001> Synchronized wmf-config/db-eqiad.php: Clarify db1122 status: candidate master for s2 - T230785 (duration: 00m 55s)

Mentioned in SAL (#wikimedia-operations) [2019-08-21T05:05:46Z] <marostegui> Restart MySQL on db1122 for binlog format change - T230785

Binary log format changed on db1122, host upgraded and rebooted.

Mentioned in SAL (#wikimedia-operations) [2019-08-30T06:07:03Z] <marostegui@cumin1001> dbctl commit (dc=all): 'Depool db1076 for upgrade - T230785', diff saved to https://phabricator.wikimedia.org/P9006 and previous config saved to /var/cache/conftool/dbconfig/20190830-060702-marostegui.json

Marostegui moved this task from Next to In progress on the DBA board. · Sep 11 2019, 7:03 AM

Mentioned in SAL (#wikimedia-operations) [2019-09-11T07:06:36Z] <marostegui@cumin1001> dbctl commit (dc=all): 'Depool db1122 to reboot for kernel upgrade T230785', diff saved to https://phabricator.wikimedia.org/P9083 and previous config saved to /var/cache/conftool/dbconfig/20190911-070635-marostegui.json

Mentioned in SAL (#wikimedia-operations) [2019-09-11T07:07:18Z] <marostegui> Stop MySQL on db1122 to reboot for a kernel upgrade T230785

Change 535839 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/puppet@production] mariadb: Promote db1122 as s2 primary master

https://gerrit.wikimedia.org/r/535839

Change 535842 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/dns@master] wmnet: Change s2 CNAME to db1122

https://gerrit.wikimedia.org/r/535842

Mentioned in SAL (#wikimedia-operations) [2019-09-17T04:11:41Z] <marostegui> Start s2 pre-switchover steps T230785

Mentioned in SAL (#wikimedia-operations) [2019-09-17T04:14:42Z] <marostegui@cumin1001> dbctl commit (dc=all): 'Set db1122 with weight 0 and depool it from API T230785', diff saved to https://phabricator.wikimedia.org/P9111 and previous config saved to /var/cache/conftool/dbconfig/20190917-041441-marostegui.json

Change 535839 merged by Marostegui:
[operations/puppet@production] mariadb: Promote db1122 as s2 primary master

https://gerrit.wikimedia.org/r/535839

Mentioned in SAL (#wikimedia-operations) [2019-09-17T05:00:14Z] <marostegui> Starting s2 failover from db1066 to db1122 - T230785

Mentioned in SAL (#wikimedia-operations) [2019-09-17T05:00:44Z] <marostegui@cumin1001> dbctl commit (dc=all): 'Set s2 as read-only for maintenance T230785', diff saved to https://phabricator.wikimedia.org/P9112 and previous config saved to /var/cache/conftool/dbconfig/20190917-050043-marostegui.json

Mentioned in SAL (#wikimedia-operations) [2019-09-17T05:01:34Z] <marostegui@cumin1001> dbctl commit (dc=all): 'Promote db1122 to s2 master and remove read-only from s2 T230785', diff saved to https://phabricator.wikimedia.org/P9113 and previous config saved to /var/cache/conftool/dbconfig/20190917-050133-marostegui.json

Change 535842 merged by Marostegui:
[operations/dns@master] wmnet: Change s2 CNAME to db1122

https://gerrit.wikimedia.org/r/535842

This was done.
Read-only start: 05:00:44 UTC
Read-only stop: 05:01:34 UTC

Total read-only time: 50 seconds.
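
For anyone wanting to double-check the read-only figure, here is a minimal sketch that derives it from the two dbctl SAL timestamps logged above (read-only set at 05:00:44Z, read-only removed at 05:01:34Z). The function and constant names are made up for illustration and are not part of dbctl or any other tooling used here.

```python
from datetime import datetime, timezone

# Timestamps copied from the two dbctl SAL entries in this task:
# "Set s2 as read-only for maintenance" and
# "Promote db1122 to s2 master and remove read-only from s2".
READ_ONLY_START = "2019-09-17T05:00:44Z"
READ_ONLY_STOP = "2019-09-17T05:01:34Z"


def parse_sal_timestamp(value: str) -> datetime:
    """Parse a SAL-style UTC timestamp such as 2019-09-17T05:00:44Z."""
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)


def read_only_seconds(start: str, stop: str) -> int:
    """Return the length of the read-only window in whole seconds."""
    delta = parse_sal_timestamp(stop) - parse_sal_timestamp(start)
    return int(delta.total_seconds())


if __name__ == "__main__":
    print(read_only_seconds(READ_ONLY_START, READ_ONLY_STOP))
```

This measures the time between the two config commits as logged, so it is an approximation of what clients actually experienced rather than an exact figure.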

Marostegui closed this task as Resolved. · Sep 17 2019, 5:16 AM