Docs: https://wikitech.wikimedia.org/wiki/GitLab/Failover
Checklist:
Preparations before downtime:
- Prepare the required Puppet changes (patch)
- Prepare the required DNS changes (patch)
- Apply gitlab-settings to gitlab1003 and gitlab1004 (merge request)
- Announce downtime some days ahead on the ops/releng lists or via a broadcast message (not needed for replicas)
- Run a failover backup on the source host one day in advance: sudo /srv/gitlab-backup/gitlab-backup.sh failover (we should double-check what the purpose of this is)
Scheduled downtime:
- Announce downtime in #wikimedia-gitlab
- Start the GitLab failover cookbook on the cumin host: cookbook sre.gitlab.failover --switch-from gitlab1004 --switch-to gitlab1003 -t T400121
- When prompted, merge the Puppet change prepared above
- When prompted, merge the DNS change prepared above
- Run authdns-update on the DNS master, following the DNS update instructions
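
To avoid mixing up the two hosts when typing the cookbook command from the checklist above, the invocation can be assembled and reviewed before it is run. The following is only a sketch: failover_cmd is a hypothetical helper (not part of the cookbook or of any existing tooling) that prints the command line rather than executing it, and rejects an identical source and target host. The cookbook name and flags are copied from the checklist.

```shell
#!/bin/sh
# Hypothetical helper (an assumption, not existing tooling): prints the
# failover cookbook invocation so it can be reviewed before being run
# on the cumin host. It does not execute anything.
failover_cmd() {
    from="$1"
    to="$2"
    task="$3"

    # All three arguments are required.
    if [ -z "$from" ] || [ -z "$to" ] || [ -z "$task" ]; then
        echo "usage: failover_cmd SWITCH_FROM SWITCH_TO TASK_ID" >&2
        return 2
    fi

    # Switching a host to itself is almost certainly a typo.
    if [ "$from" = "$to" ]; then
        echo "error: source and target host are both '$from'" >&2
        return 1
    fi

    # Flags as used in the checklist: --switch-from, --switch-to, -t.
    printf 'cookbook sre.gitlab.failover --switch-from %s --switch-to %s -t %s\n' \
        "$from" "$to" "$task"
}

# Example with the values from this checklist.
failover_cmd gitlab1004 gitlab1003 T400121
```

The printed line can then be pasted on the cumin host once the direction (from gitlab1004 to gitlab1003) has been confirmed.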