During the last GitLab upgrade, competing backup processes interfered with each other. The problem is getting worse because backups now take longer and run more often, so they are more likely to overlap.
The two problems to solve are:
- Backup and restore processes on the GitLab host interfere with each other (e.g., while the cookbook is creating a backup, a restore starts, restarts GitLab, and interrupts the backup)
- rsync --delete from the primary removes backups that are still in progress on the replica, because those files do not exist on the primary
The simplest option is to create a lockfile on disk that the backup and restore scripts and the cookbooks all check before they run. This does not solve the rsync --delete issue, though, so another option is to stop rsyncd on a replica while a backup is running there.
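
As a rough illustration of the lockfile idea, here is a minimal Python sketch. It rests on several assumptions that are not part of the proposal above: the lock path, the names backup_lock and LockHeld, and the use of an advisory flock on the file instead of a plain existence check. The flock variant is chosen here only because the kernel releases the lock when the holder exits, so a crashed backup cannot leave a stale lock behind. It works only if every backup and restore entry point takes the same lock.

```python
import contextlib
import fcntl
import os

# Hypothetical path; the real location would have to be agreed on by the
# backup script, the restore script, and the cookbooks.
LOCK_PATH = "/var/run/gitlab-backup.lock"


class LockHeld(Exception):
    """Raised when another backup or restore already holds the lock."""


@contextlib.contextmanager
def backup_lock(path=LOCK_PATH):
    """Hold an exclusive, non-blocking advisory lock on `path`."""
    # Create the lockfile if it is missing; its contents are never used.
    fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o644)
    try:
        try:
            # LOCK_NB makes this fail immediately instead of waiting.
            fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError as exc:
            raise LockHeld(f"{path} is held by another process") from exc
        try:
            yield
        finally:
            fcntl.flock(fd, fcntl.LOCK_UN)
    finally:
        # Closing the descriptor also drops the lock if it is still held.
        os.close(fd)
```

A backup or restore script would wrap its work in "with backup_lock():" and exit (or retry later) on LockHeld. This sketch does nothing about the rsync --delete problem; that still needs a separate measure such as stopping rsyncd on the replica for the duration of the backup.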