We want to make the eqiad etcd cluster the master again, partly in order to refine the switchover procedure.
At the moment, this procedure is very manual and involves various commits to puppet and the DNS. That's almost unavoidable if we want to avoid storing data about the etcd master *inside etcd*. I'd think twice before doing that.
The switchover is tentatively scheduled for May 31st at 09:00Z.
We expect etcd to remain fully available for reads during the switchover. There will be a short period in which neither cluster accepts writes, and write attempts fail with an "EtcdRootReadOnly" error.
The procedure will be as follows:
- Reduce the TTL of the .conftool SRV records (the ones used by confctl, currently pointing to codfw)
- Set both etcd clusters into read-only mode (eqiad currently already is)
- Wait for replication to catch up (should be more or less instantaneous)
- Stop the replica in eqiad via puppet
- Switch the SRV records for conftool to point to the eqiad cluster
- Set the replication index in codfw to the current eqiad etcd index, then start the replica in codfw via puppet. This step should be scripted.
- Set the eqiad cluster in read-write mode
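
To sanity-check the ordering of the steps above, here is a minimal sketch of a toy state model in Python. It does not talk to etcd, puppet or the DNS. The field names, the starting state (codfw writable, SRV records on codfw, replica running in eqiad) and the two safety rules are assumptions inferred from the list, not a description of the real tooling.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class State:
    """Toy view of the two clusters (all field names are hypothetical)."""
    eqiad_writable: bool
    codfw_writable: bool
    srv_target: str            # cluster the conftool SRV records point to
    replica_in: Optional[str]  # cluster running the replica, None if stopped


# Each step is (description, fields it changes). Steps that only wait or
# touch TTLs do not change the modelled state.
SWITCHOVER = [
    ("reduce SRV TTL", {}),
    ("set both clusters read-only",
     {"eqiad_writable": False, "codfw_writable": False}),
    ("wait for replication to catch up", {}),
    ("stop replica in eqiad", {"replica_in": None}),
    ("point SRV records to eqiad", {"srv_target": "eqiad"}),
    ("start replica in codfw", {"replica_in": "codfw"}),
    ("set eqiad read-write", {"eqiad_writable": True}),
]


def is_safe(s: State) -> bool:
    """Assumed rules: never two writable clusters, and the cluster that
    receives replicated data must itself be read-only."""
    both_writable = s.eqiad_writable and s.codfw_writable
    replica_writable = (
        (s.replica_in == "eqiad" and s.eqiad_writable)
        or (s.replica_in == "codfw" and s.codfw_writable)
    )
    return not both_writable and not replica_writable


def simulate(start: State):
    """Apply the steps in order and return [(step name, resulting state)]."""
    history = [("start", start)]
    state = start
    for name, changes in SWITCHOVER:
        state = replace(state, **changes)
        history.append((name, state))
    return history


START = State(eqiad_writable=False, codfw_writable=True,
              srv_target="codfw", replica_in="eqiad")
HISTORY = simulate(START)

# Steps after which no cluster accepts writes (the expected write outage).
NO_WRITES = [name for name, s in HISTORY
             if not (s.eqiad_writable or s.codfw_writable)]

for name, s in HISTORY:
    flag = "ok" if is_safe(s) else "UNSAFE"
    print(f"{flag:6} {name}: {s}")
```

In this model the write outage starts when both clusters are set read-only and ends only with the last step, which matches the expectation that reads stay available while writes are briefly rejected.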
Since we still don't have a generic spin-off of switchdc, I will just prepare a simple list of commands for each step.
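
As an illustration of how such a list of commands could be laid out, here is a small Python sketch of a checklist runner. It is not switchdc and not the actual command list: the `echo` commands are placeholders, and `run_checklist`, its parameters and `EXAMPLE_STEPS` are hypothetical names. By default it only prints the plan (dry run); commands are executed only when `dry_run=False` and the operator confirms each step.

```python
import shlex
import subprocess

# Placeholder steps: the real commands are not known here, so each step
# carries a harmless `echo` that merely names what would be done.
EXAMPLE_STEPS = [
    ("Set both etcd clusters read-only",
     ["echo placeholder-set-read-only"]),
    ("Switch the conftool SRV records to eqiad",
     ["echo placeholder-switch-srv"]),
    ("Set the eqiad cluster read-write",
     ["echo placeholder-set-read-write"]),
]


def run_checklist(steps, confirm, run=subprocess.run, dry_run=True):
    """Walk through the steps in order and return the completed titles.

    confirm(title) -> bool decides whether to go ahead with a step;
    returning False aborts the whole checklist at that point.
    run is called as run(argv, check=True), like subprocess.run.
    """
    done = []
    for number, (title, commands) in enumerate(steps, start=1):
        print(f"[{number}/{len(steps)}] {title}")
        for command in commands:
            print(f"    $ {command}")
        if not confirm(title):
            print("Aborted by operator.")
            break
        if not dry_run:
            for command in commands:
                # check=True stops the checklist on the first failing command.
                run(shlex.split(command), check=True)
        done.append(title)
    return done


# Dry run with automatic confirmation: prints the plan, executes nothing.
run_checklist(EXAMPLE_STEPS, confirm=lambda title: True)
```

For an interactive run, `confirm` could be something like `lambda title: input(f"Run '{title}'? [y/N] ").lower() == "y"`, together with `dry_run=False`.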