
Locking for gitlab backups
Closed, Resolved, Public

Description

During the last gitlab upgrade, competing backup processes interfered with each other. This is becoming a bigger problem because backups now take longer and run more frequently.

The two problems to solve are:

  1. Backup and restore processes on the gitlab host interfere with each other (e.g., while the cookbook is creating a backup, a restore runs, restarts gitlab, and interrupts the backup)
  2. rsync --delete from the primary will remove backups in progress on the replica

The simplest option is to create a lockfile on disk that the backup and restore scripts, and the cookbooks, check before running. This doesn't solve the rsync --delete issue, though. One option for that is to stop rsyncd on the replicas while a backup is running there.
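As a rough illustration of the lockfile idea, here is a minimal sketch in Python. It is not the code that was deployed: the lock path, the `backup_lock` helper, and the `LockHeld` exception are all hypothetical names, and it uses an advisory `flock` on the file rather than a bare existence check so that a crashed process cannot leave a stale lock behind.

```python
import errno
import fcntl
import os
from contextlib import contextmanager

# Hypothetical lock path; the task does not say where the real lockfile lives.
LOCK_PATH = "/var/lock/gitlab-backup.lock"


class LockHeld(Exception):
    """Raised when another backup or restore already holds the lock."""


@contextmanager
def backup_lock(path=LOCK_PATH):
    # Open (or create) the lockfile. The file is never deleted, so two
    # processes can never end up locking different inodes.
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        try:
            # Non-blocking exclusive lock: fail fast instead of queueing.
            fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except OSError as exc:
            if exc.errno in (errno.EAGAIN, errno.EACCES):
                raise LockHeld(f"{path} is held by another process") from exc
            raise
        # Record the holder's PID for debugging only; the flock is the lock.
        os.ftruncate(fd, 0)
        os.write(fd, f"{os.getpid()}\n".encode())
        yield
    finally:
        # Closing the descriptor releases the flock; the kernel does the
        # same if the process dies.
        os.close(fd)
```

A backup or restore script would wrap its work in `with backup_lock(): ...` and exit with a clear message on `LockHeld`. This only coordinates processes on one host that all take the same lock, so it does not address the rsync --delete problem.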

Event Timeline

eoghan changed the task status from Open to In Progress. Jun 13 2023, 11:50 AM
eoghan claimed this task.
eoghan triaged this task as Medium priority.

Change 951896 had a related patch set uploaded (by EoghanGaffney; author: EoghanGaffney):

[operations/puppet@production] gitlab: Fix paths for backup common functions

https://gerrit.wikimedia.org/r/951896

Change 951896 merged by EoghanGaffney:

[operations/puppet@production] gitlab: Fix paths for backup common functions

https://gerrit.wikimedia.org/r/951896

This is deployed and seems to be working fine.

The one thing we haven't solved yet is that rsync --delete from the primary to the replicas will remove backups in progress in /srv/gitlab-backup, but that can be solved later.