There are some masters in eqiad that will eventually need to be decommissioned, as per T134476.
The affected masters are:
s2 - db1018
s4 - db1040
s5 - db1049
s6 - db1050
s7 - db1041
Some of them are easier than others, not because of the technical procedure (which is the same for all of them) but because of data consistency across the shard, etc.
We are currently working on checksumming all the shards and fixing as many inconsistencies as possible, but it is a slow process.
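As a rough illustration of what the checksumming step is looking for, here is a minimal Python sketch that compares per-table checksums between a master and a replica and reports the tables that differ. It is only a toy model: the function names `table_checksum` and `find_inconsistent_tables` and the in-memory row format are assumptions made for this example, and it does not represent the tooling actually used on these shards.

```python
import hashlib


def table_checksum(rows):
    """Return a hex digest over the rows of one table.

    Hypothetical helper: rows are expected in primary-key order so that
    both hosts hash the same data in the same sequence.
    """
    digest = hashlib.sha256()
    for row in rows:
        digest.update(repr(row).encode("utf-8"))
    return digest.hexdigest()


def find_inconsistent_tables(master_tables, replica_tables):
    """Return the sorted names of tables that differ between two hosts.

    Each argument maps a table name to its list of rows. A table counts
    as inconsistent if it is missing on one host or if its checksums
    do not match.
    """
    inconsistent = []
    for name in sorted(set(master_tables) | set(replica_tables)):
        if name not in master_tables or name not in replica_tables:
            inconsistent.append(name)
        elif table_checksum(master_tables[name]) != table_checksum(replica_tables[name]):
            inconsistent.append(name)
    return inconsistent
```

A replica for which this returns an empty list against the current master would correspond to the "checksummed" state used in the summary below; anything else would need fixing or recloning first.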
We also need to decide which server to promote to master and analyze its past history (mostly HW issues) to make sure we promote a reliable one.
There is an initial draft commit showing what the db-eqiad.php file would look like after all the decommissions and after moving servers around: https://gerrit.wikimedia.org/r/#/c/338996/
Proposed switchover summary:
- s2 - db1054 (looks good and checksummed)
- s4 - db1068 (crashed once, unknown data state, alternatives? recloning it from the master or large servers to be 100% sure?) - DONE 20th April
- s5 - db1063 (looks good, but needs cloning)
- s6 - db1061 (crashed once, almost finished checksumming, but overall OK)
- s7 - db1062 (looks good, unknown data state)