GitLab will also be switched from eqiad to codfw as part of the March 2023 Datacenter Switchover (T327920), one day before the actual switchover so that it does not block dependencies. This task tracks the failover of the GitLab production instance from eqiad (gitlab1004) to codfw (gitlab2002).
Docs: https://wikitech.wikimedia.org/wiki/GitLab/Failover
Last task: T307142#7971192 (checklist can be adapted for this year's failover)
Time: 10:00 am UTC 27th of February
Checklist:
**Preparations before downtime:**
[] check `gitlab1004` and `gitlab2002` use the same ssh host keys for `ssh-gitlab` daemon
[] prepare change to set `profile::gitlab::service_name: 'gitlab.wikimedia.org'` on `gitlab2002`
[] Prepare change to point DNS entry for `gitlab.wikimedia.org` to `gitlab2002` and `gitlab-replica-old.wikimedia.org` to `gitlab1004`
[] configure `gitlab2002` as `profile::gitlab::active_host`
[] apply [gitlab-settings](https://gitlab.wikimedia.org/repos/releng/gitlab-settings) to `gitlab1004` and `gitlab2002`
[] announce downtime some days ahead on ops/releng list/broadcast message
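
The host key check in the first item could be done roughly as below. This is a minimal sketch, not an existing tool: `normalize_host_keys` and `same_host_keys` are hypothetical helper names, and the input files are assumed to hold `ssh-keyscan` output captured from the address and port on which each host's `ssh-gitlab` daemon listens (not stated in this task).

```shell
# Hypothetical helper: reduce ssh-keyscan output ("host keytype key") to a
# sorted list of "keytype key" pairs, so hosts compare equal regardless of name.
normalize_host_keys() {
    grep -v '^#' | awk '{print $2, $3}' | sort -u
}

# Hypothetical helper: succeeds only if both files contain the same,
# non-empty set of host keys. An empty scan is treated as a failure.
same_host_keys() {
    a=$(normalize_host_keys < "$1")
    b=$(normalize_host_keys < "$2")
    [ -n "$a" ] && [ "$a" = "$b" ]
}
```

Usage, assuming the two scans were saved as `gitlab1004.keys` and `gitlab2002.keys`: `same_host_keys gitlab1004.keys gitlab2002.keys && echo "host keys match"`.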
**Scheduled downtime**:
[] Announce downtime in `#wikimedia-gitlab`
[] pause all GitLab Runners
[] downtime gitlab1004 `sudo cookbook sre.hosts.downtime -r "Running failover to gitlab2002 - T329931" -M 90`
[] stop puppet on `gitlab1004` with `sudo disable-puppet "Running failover to gitlab2002 - T329931"`
[] stop GitLab on `gitlab1004` with `gitlab-ctl stop nginx`
[] stop ssh-gitlab daemon on `gitlab1004` with `systemctl stop ssh-gitlab`
[] create **full** backup on `gitlab1004` with `/usr/bin/gitlab-backup create CRON=1 STRATEGY=copy GZIP_RSYNCABLE="true" GITLAB_BACKUP_MAX_CONCURRENCY="4" GITLAB_BACKUP_MAX_STORAGE_CONCURRENCY="1"`
[] sync backup, on `gitlab1004` run `/usr/bin/rsync -avp /srv/gitlab-backup/ rsync://gitlab2002.wikimedia.org/data-backup`
[] merge change to set `profile::gitlab::service_name: 'gitlab.wikimedia.org'` on `gitlab2002` and run puppet
[] trigger restore: on **`gitlab2002`**, run `sudo systemctl start backup-restore.service` (for logs, run `journalctl -f -u backup-restore.service`)
[] Merge change to point DNS entry for `gitlab.wikimedia.org` to `gitlab2002` and `gitlab-replica-old.wikimedia.org` to `gitlab1004`
[] verify installation
[] enable puppet on `gitlab1004` with `sudo run-puppet-agent -e "Running failover to gitlab2002 - T329931"`
[] start ssh-gitlab daemon on `gitlab2002` with `systemctl start ssh-gitlab`
[] unpause all GitLab Runners
[] announce end of downtime
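
The steps that run on `gitlab1004` during the downtime could be strung together as in the sketch below. It only reuses the commands listed in the checklist. The `run` wrapper, the `DRY_RUN` variable and the `gitlab1004_steps` function are hypothetical names added for illustration, and the script prints the commands by default rather than executing them. It assumes a root shell on `gitlab1004`, as the checklist commands without `sudo` do.

```shell
# Dry-run by default: print each command instead of executing it.
DRY_RUN="${DRY_RUN:-1}"
REASON="Running failover to gitlab2002 - T329931"

# Hypothetical wrapper: echo the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# gitlab1004-side steps in checklist order; stops at the first failure.
gitlab1004_steps() {
    run sudo disable-puppet "$REASON" &&
    run gitlab-ctl stop nginx &&
    run systemctl stop ssh-gitlab &&
    run /usr/bin/gitlab-backup create CRON=1 STRATEGY=copy GZIP_RSYNCABLE=true \
        GITLAB_BACKUP_MAX_CONCURRENCY=4 GITLAB_BACKUP_MAX_STORAGE_CONCURRENCY=1 &&
    run /usr/bin/rsync -avp /srv/gitlab-backup/ rsync://gitlab2002.wikimedia.org/data-backup
}
```

Calling `gitlab1004_steps` prints the five commands for review. Setting `DRY_RUN=0` before the call would execute them, which is only appropriate during the scheduled downtime.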