db1083, currently the s1 primary master, is on the list of hosts that might have a BBU crash at any time (T258386); it also needs a kernel update (T273280).
We need to promote db1163 to s1 primary master instead.
When: 28th April at 05:00 AM UTC
Checklist:
[] Double check db1163 has `report_host` enabled T271106
[] Create a task to communicate the chosen date and send an announcement to the community: T279505
NEW master: db1163
OLD master: db1083
[] Check configuration differences between new and old master
`pt-config-diff h=db1083.eqiad.wmnet,F=/root/.my.cnf h=db1163.eqiad.wmnet,F=/root/.my.cnf`
[] Silence alerts on all hosts
[] Set NEW master with weight 0 in s1
`dbctl instance db1163 edit`
`dbctl config commit -m "Set db1163 with weight 0 T278214"`
[] Topology changes, connect everything to db1163
`db-switchover --timeout=15 --only-slave-move db1083.eqiad.wmnet db1163.eqiad.wmnet`
[] Disable puppet @db1163 and @db1083
`puppet agent --disable "switchover to db1163"`
[] Merge gerrit puppet change to promote db1163: TODO
**Failover:**
[] Start the failover
`!log Starting s1 eqiad failover from db1083 to db1163 - T278214`
[] Read only on s1
`dbctl --scope eqiad section s1 ro "Maintenance till 06:15 AM UTC" && dbctl config commit -m "Set s1 as read-only for maintenance T278214"`
[] Check s1 is indeed read-only
[] Run the switchover script from cumin1001:
`db-switchover --skip-slave-move db1083 db1163 ; echo db1083; mysql.py -hdb1083 -e "show slave status\G" ; echo db1163 ; mysql.py -hdb1163 -e "show slave status\G"`
[] Promote db1163 as new master and remove read-only
`dbctl --scope eqiad section s1 set-master db1163 && dbctl --scope eqiad section s1 rw && dbctl config commit -m "Promote db1163 to s1 master and remove read-only from s1 T278214"`
[] Restart puppet on old and new masters (for heartbeat): db1163 and db1083
`run-puppet-agent -e "switchover to db1163"`
[] Give weight to db1083 in s1
`dbctl instance db1083 edit`
`dbctl config commit -m "Give weight to db1083 in s1 T278214"`
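
The switchover step above prints `show slave status\G` for both hosts and leaves the reading to the operator. As a rough sketch only, that check could be scripted as below. The helper name `replication_ok` is hypothetical and not part of `db-switchover` or `mysql.py`; it only assumes the standard `Slave_IO_Running` and `Slave_SQL_Running` fields in the output.

```shell
#!/bin/sh
# Hypothetical helper (not an existing tool): reads `show slave status\G`
# output on stdin and succeeds only if both replication threads report "Yes".
# Empty input (a host with no replication configured) is treated as failure.
replication_ok() {
    awk -F': *' '
        { gsub(/^[ \t]+/, "", $1) }            # drop the \G left padding
        $1 == "Slave_IO_Running"  { io  = $2 }
        $1 == "Slave_SQL_Running" { sql = $2 }
        END { exit !(io == "Yes" && sql == "Yes") }
    '
}
```

After the switchover, db1083 should pass this check (it now replicates from db1163), while db1163 in eqiad is expected to return no slave status at all. For example: `mysql.py -hdb1083 -e "show slave status\G" | replication_ok && echo "db1083 is replicating"`.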
**Clean up tasks:**
[] Change events for query killer:
```
events_coredb_master.sql on the new master db1163
events_coredb_slave.sql on the new slave db1083
```
[] Update DNS: TODO
[] Update candidate master dbctl notes and pick new candidate master: db1083
```
dbctl instance db1163 set-candidate-master --section s1 false
dbctl instance db1083 set-candidate-master --section s1 true
```
[] Check tendril was updated
[] Check zarcillo was updated
** Had to be done manually: https://phabricator.wikimedia.org/P13956
[] Update/resolve phabricator ticket about failover