db1086, currently the s7 primary master, is on the list of hosts that might have a BBU crash at any time (T258386). We also need to update its kernel (T273280).
We therefore need to promote db1136 to primary master instead.
When: 23rd March, 06:00 AM UTC
Checklist:
[x] Double check db1136 has `report_host` enabled T271106
[x] Create a task to communicate the chosen date and send an announcement to the community: T276899
NEW master: db1136
OLD master: db1086
[x] Check configuration differences between new and old master
`pt-config-diff h=db1086.eqiad.wmnet,F=/root/.my.cnf h=db1136.eqiad.wmnet,F=/root/.my.cnf`
[x] Silence alerts on all hosts
[x] Set NEW master with weight 0 in s7
`dbctl instance db1136 edit`
`dbctl config commit -m "Set db1136 with weight 0 T274336"`
[x] Topology changes, connect everything to db1136
`db-switchover --timeout=15 --only-slave-move db1086.eqiad.wmnet db1136.eqiad.wmnet`
[x] Disable puppet @db1136 and @db1086
`puppet agent --disable "switchover to db1136"`
[x] Merge gerrit puppet change to promote db1136: https://gerrit.wikimedia.org/r/c/operations/puppet/+/673195/
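The `report_host` pre-check at the top of this checklist could be scripted roughly as below. This is only a sketch: `check_report_host` and the `MYSQL_CMD` override are hypothetical names, not part of the original procedure, and it assumes `mysql.py` passes the standard `mysql` client flags `-B` and `-N` through unchanged.

```shell
#!/bin/bash
# Hypothetical helper, not part of the original checklist.
# MYSQL_CMD is an assumed override hook so the function can be tried
# against a stub instead of a real host.
MYSQL_CMD="${MYSQL_CMD:-mysql.py}"

# Print the report_host value of a host; fail if it is empty or NULL.
check_report_host() {
    local host="$1" value
    # -B (batch) and -N (no column names) give a bare value on stdout.
    value="$("$MYSQL_CMD" -h"$host" -BN -e "SELECT @@report_host")" || return 2
    if [ -z "$value" ] || [ "$value" = "NULL" ]; then
        echo "report_host is NOT set on $host" >&2
        return 1
    fi
    echo "report_host on $host: $value"
}
```

Usage would be along the lines of `check_report_host db1136`, run from a host where `mysql.py` is available.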
**Failover:**
[x] Start the failover
`!log Starting s7 eqiad failover from db1086 to db1136 - T274336`
[x] Read only on s7
`dbctl --scope eqiad section s7 ro "Maintenance till 07:15 AM UTC" && dbctl config commit -m "Set s7 as read-only for maintenance T274336"`
[x] Check s7 is indeed read-only
[x] Run the switchover script from cumin1001:
`db-switchover --skip-slave-move db1086 db1136 ; echo db1086; mysql.py -hdb1086 -e "show slave status\G" ; echo db1136 ; mysql.py -hdb1136 -e "show slave status\G"`
[x] Promote db1136 as new master and remove read-only
`dbctl --scope eqiad section s7 set-master db1136 && dbctl --scope eqiad section s7 rw && dbctl config commit -m "Promote db1136 to s7 master and remove read-only from s7 T274336"`
[x] Restart puppet on old and new masters (for heartbeat): db1136 and db1086
`run-puppet-agent -e "switchover to db1136"`
[x] Give weight to db1086 in s7
`dbctl instance db1086 edit`
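The `show slave status\G` output printed after the switchover script is checked by eye in the steps above. A minimal sketch of automating that check follows. It assumes the old master should now replicate from the new one, and `replica_follows` is a hypothetical helper name. It only looks at the `Master_Host`, `Slave_IO_Running` and `Slave_SQL_Running` fields.

```shell
#!/bin/bash
# Hypothetical helper, not part of the original checklist.
# Reads "show slave status\G" output on stdin and succeeds only if both
# replication threads are running and Master_Host starts with the
# expected master name given as $1.
replica_follows() {
    local expected="$1"
    awk -v m="$expected" '
        $1 == "Master_Host:"       { host = $2 }
        $1 == "Slave_IO_Running:"  { io = $2 }
        $1 == "Slave_SQL_Running:" { sql = $2 }
        END { exit !(index(host, m) == 1 && io == "Yes" && sql == "Yes") }
    '
}
```

For example, `mysql.py -hdb1086 -e "show slave status\G" | replica_follows db1136` would return success once db1086 is replicating from db1136. Empty output, as expected on the new master itself, makes the helper return failure.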
**Clean up tasks:**
[x] change events for query killer:
```
events_coredb_master.sql on the new master db1136
events_coredb_slave.sql on the new slave db1086
```
[x] Update DNS: https://gerrit.wikimedia.org/r/673196
[x] Update candidate master dbctl notes and pick new candidate master: db1086
```
dbctl instance db1136 set-candidate-master --section s7 false
dbctl instance db1086 set-candidate-master --section s7 true
```
[x] Check tendril was updated
[x] Check zarcillo was updated
** Had to be done manually: https://phabricator.wikimedia.org/P13956
[ ] Update/resolve phabricator ticket about failover