db1107 needs its kernel upgraded. Let's fail it over to a different host, then move db1107 to m3 to fail that one over too (the same needs to happen on m3 and m5; each will have its own task).
Floating host: db1183
When: Thursday 5th August 2021 at 08:00 AM UTC.
Databases on m2:
debmonitor iegreview mwaddlink mysql otrs recommendationapi scholarships sockpuppet xhgui
Failover process
OLD MASTER: db1107
NEW MASTER: db1183
- Check configuration differences between new and old master
$ pt-config-diff h=db1107.eqiad.wmnet,F=/root/.my.cnf h=db1183.eqiad.wmnet,F=/root/.my.cnf
- Silence alerts on all hosts
- Topology changes: move everything under db1183
db-switchover --timeout=15 --only-slave-move db1107.eqiad.wmnet db1183.eqiad.wmnet
- Disable puppet on db1107 and db1183
puppet agent --disable "switchover to db1183"
- Merge gerrit: https://gerrit.wikimedia.org/r/709673 and https://gerrit.wikimedia.org/r/c/operations/puppet/+/710216/
- Run puppet on dbproxy1013 and dbproxy1015 and check the config
puppet agent -tv && cat /etc/haproxy/conf.d/db-master.cfg
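When checking the config by eye, it can help to list only the servers that are not marked as standby. The following is a minimal sketch, not part of the original procedure. It assumes the file uses one `server <name> <address> [options]` line per host and that the standby carries haproxy's `backup` keyword; the helper name `active_servers` is hypothetical.

```shell
# Hypothetical helper: print the names of servers in an haproxy config
# (read from stdin) that are NOT flagged with the "backup" keyword.
# Assumes lines of the form: server <name> <address> [options...]
active_servers() {
    awk '$1 == "server" {
        backup = 0
        for (i = 4; i <= NF; i++) if ($i == "backup") backup = 1
        if (!backup) print $2
    }'
}
```

Run as `active_servers < /etc/haproxy/conf.d/db-master.cfg` on each proxy. If the assumptions hold, the output after the puppet run should name the new master (db1183).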
- Start the failover
!log Failover m2 from db1107 to db1183 - T287852
root@cumin1001:~/wmfmariadbpy/wmfmariadbpy# db-switchover --skip-slave-move db1107 db1183
- Reload haproxies
dbproxy1013: systemctl reload haproxy && echo "show stat" | socat /run/haproxy/haproxy.sock stdio
dbproxy1015: systemctl reload haproxy && echo "show stat" | socat /run/haproxy/haproxy.sock stdio
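The `show stat` output is a wide CSV, so the server state is easy to miss. As a sketch (the helper name `server_status` is hypothetical and not part of the original procedure), the filter below keeps only the per-server rows and prints the proxy name, server name, and status, relying on haproxy's documented CSV layout where field 1 is `pxname`, field 2 is `svname`, and field 18 is `status`.

```shell
# Hypothetical helper: summarise haproxy "show stat" CSV read from stdin.
# Skips the "# ..." header line and the FRONTEND/BACKEND aggregate rows,
# then prints "<pxname>/<svname> <status>" for each remaining server row.
server_status() {
    awk -F, '!/^#/ && NF >= 18 && $2 != "FRONTEND" && $2 != "BACKEND" {
        print $1 "/" $2, $18
    }'
}
```

Usage on a proxy would be `echo "show stat" | socat /run/haproxy/haproxy.sock stdio | server_status`.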
- Kill connections on the old master (db1107)
pt-kill --print --kill --victims all --match-all F=/dev/null,S=/run/mysqld/mysql.sock
- Restart puppet on old and new masters (for heartbeat): db1183 and db1107
puppet agent --enable && puppet agent -tv
- Check services affected (otrs, debmonitor)
DEBMONITOR and OTRS looking good
- Clean orchestrator heartbeat to remove the old master's entry, otherwise Orchestrator will show lag
- Close this ticket and create a ticket to failover m3
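The heartbeat cleanup step above does not spell out a command. The sketch below only prints SQL for review and does not run anything. It assumes the heartbeat rows live in a pt-heartbeat style table named `heartbeat.heartbeat` and that the `file` column (the binlog file name) begins with the name of the host that wrote the row; both are assumptions to verify against the actual table before deleting. The helper name `heartbeat_cleanup_sql` is hypothetical.

```shell
# Hypothetical helper: print (do not execute) SQL to inspect the heartbeat
# table and then delete rows written by the old master.
# Assumptions: table heartbeat.heartbeat exists, and its "file" column
# starts with the old master's host name.
heartbeat_cleanup_sql() {
    old_master=$1
    printf 'SELECT server_id, ts, file FROM heartbeat.heartbeat;\n'
    printf "DELETE FROM heartbeat.heartbeat WHERE file LIKE '%s%%';\n" "$old_master"
}
```

For example, `heartbeat_cleanup_sql db1107` prints the two statements. Run the `SELECT` first on the new master and confirm that only the old master's row matches before applying the `DELETE`.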