Running a data-consistency check on db2118 after upgrading to buster failed, badly: T288244#7311265.
Attempting to restore from a dump also ended badly, as the dump procedure assumes that you are not trying to restore the node that was primary at the time the dump was taken.
Procedure for attempt 2:
- Drop all wiki databases (and centralauth) from db2118
- Re-restore the same dump as before: time recover-dump --host db2118.codfw.wmnet --user root --password REDACTED --port 3306 /srv/backups/dumps/ongoing/kormat-dump.s7.2021-08-24--00-00-02
- Stop mariadb on db2118, make a copy of /srv/sqldata, then start it again. This means we won't need to redo the dump-restore step if a later step goes wrong.
- Run RESET SLAVE ALL on db2118 to clear any existing replication configuration
- Configure db2118 to replicate from db2121, using binlog file db2121-bin.002436 and position 130479946
- Start replication, and wait at least 30 minutes.
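
The last three steps can be sketched roughly as below. This is only an illustration, not the exact commands that were run: the primary's FQDN (db2121.codfw.wmnet), its port, the replication account name and the password placeholder are all assumptions. Only the binlog file and position come from the notes above. The script just prints the SQL so it can be reviewed before being fed to the client on db2118.

```shell
#!/bin/sh
# Print the SQL for the replication steps; nothing is run against a server here.
# Assumptions (not in the notes above): primary FQDN, primary port,
# replication account name, password placeholder.
PRIMARY_HOST="db2121.codfw.wmnet"   # assumption: same domain as db2118
PRIMARY_PORT=3306                   # assumption: same port as db2118
REPL_USER="repl_user"               # hypothetical account name
REPL_PASSWORD="REDACTED"            # placeholder, never commit a real one
BINLOG_FILE="db2121-bin.002436"     # from the procedure
BINLOG_POS=130479946                # from the procedure

replication_sql() {
    cat <<EOF
RESET SLAVE ALL;
CHANGE MASTER TO
  MASTER_HOST='${PRIMARY_HOST}',
  MASTER_PORT=${PRIMARY_PORT},
  MASTER_USER='${REPL_USER}',
  MASTER_PASSWORD='${REPL_PASSWORD}',
  MASTER_LOG_FILE='${BINLOG_FILE}',
  MASTER_LOG_POS=${BINLOG_POS};
START SLAVE;
EOF
}

replication_sql
```

Once the statements have been applied, SHOW SLAVE STATUS\G on db2118 should report Slave_IO_Running and Slave_SQL_Running as Yes, and Seconds_Behind_Master should trend towards 0 during the wait.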