
SystemdUnitFailed - gitlab - backup-restore.service / zuul-scheduler on zuul1001
Closed, Resolved (Public)

Description

Common information

  • alertname: SystemdUnitFailed
  • prometheus: ops
  • severity: critical
  • source: prometheus
  • team: collaboration-services

Firing alerts



Event Timeline

Dzahn renamed this task from SystemdUnitFailed to SystemdUnitFailed - gitlab - backup-restore.service. Apr 10 2026, 2:50 PM

The backup-restore.service failure on gitlab is expected after the linked security upgrade.

zuul-scheduler.service on zuul1001: taking a look now.

The original issue was FileNotFoundError: [Errno 2] No such file or directory: '/var/ssh/zuul'

That was fixed by https://gerrit.wikimedia.org/r/c/operations/puppet/+/1260847

But we are not yet mounting that path into the Docker container.
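
A path that exists on the host is still invisible to a containerized process until it is mounted into the container, so the same FileNotFoundError can persist after the host-side fix. A minimal sketch of a check that could be run inside the container to confirm this; the helper name missing_paths is hypothetical and not part of zuul or the puppet change:

```python
from pathlib import Path


def missing_paths(required):
    """Return the entries of `required` that do not exist in the current filesystem view."""
    return [p for p in required if not Path(p).exists()]


# Example call, using the path from the error message above:
#   missing_paths(["/var/ssh/zuul"])
# An empty list means the path is visible to the process that runs the check.
```

Running the same check on the host and inside the container would show whether the path is missing everywhere or only inside the container.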

Dzahn renamed this task from SystemdUnitFailed - gitlab - backup-restore.service to SystemdUnitFailed - gitlab - backup-restore.service / zuul-scheduler on zuul1001. Apr 10 2026, 4:16 PM
Dzahn claimed this task.

Change #1270103 had a related patch set uploaded (by Dzahn; author: Dzahn):

[operations/puppet@production] zuul: mount /var/ssh/zuul for zuul-scheduler

https://gerrit.wikimedia.org/r/1270103

Change #1270103 merged by Dzahn:

[operations/puppet@production] zuul: mount /var/ssh/zuul for zuul-scheduler

https://gerrit.wikimedia.org/r/1270103

LSobanski triaged this task as Medium priority. Apr 13 2026, 3:54 PM
LSobanski moved this task from Incoming to Work in Progress on the collaboration-services board.

Created a new ed25519 key pair for the new zuul to connect to Gerrit (in the future).

[Ops] [puppet-private] (3334cf48f) (dzahn) add new ed25519 keypair for new zuul to connect to gerrit (T395938)

It lives under secrets/gerrit/zuul_gerrit_ed25519(.pub). It has NOT been added on the Gerrit side yet.

Change #1270577 had a related patch set uploaded (by Dzahn; author: Dzahn):

[labs/private@master] add fake keys for new zuul to connect to gerrit

https://gerrit.wikimedia.org/r/1270577

Change #1270577 merged by Dzahn:

[labs/private@master] add fake keys for new zuul to connect to gerrit

https://gerrit.wikimedia.org/r/1270577

Change #1270580 had a related patch set uploaded (by Dzahn; author: Dzahn):

[operations/puppet@production] zuul: make gerrit ssh key configurable in Hiera and add it

https://gerrit.wikimedia.org/r/1270580

Change #1270580 merged by Dzahn:

[operations/puppet@production] zuul: make gerrit ssh key configurable in Hiera and add it

https://gerrit.wikimedia.org/r/1270580

The previous error message is gone. The new SSH key exists. It simply has not been added on the Gerrit side, as intended. Therefore:

Apr 17 19:36:52 zuul1001 docker[2781962]: 2026-04-17 19:36:52,181 ERROR zuul.GerritConnection.ssh:   paramiko.ssh_exception.AuthenticationException: Authentication failed.
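
Once the public key is added on the Gerrit side, the authentication could be re-tested outside of zuul. A rough sketch, assuming Gerrit's default SSH port 29418 and the standard `gerrit version` SSH command; the key path, user name, and helper names below are placeholders, not values taken from the puppet change:

```python
import subprocess


def gerrit_ssh_check_cmd(key_path, user, host, port=29418):
    """Build an ssh command that offers only the given key and runs `gerrit version`."""
    return [
        "ssh",
        "-i", key_path,
        "-o", "IdentitiesOnly=yes",  # do not fall back to other keys or an agent
        "-o", "BatchMode=yes",       # fail instead of prompting for a password
        "-p", str(port),
        f"{user}@{host}",
        "gerrit", "version",
    ]


def key_is_accepted(cmd, timeout=15):
    """Run the command; a zero exit status means the command succeeded with that key."""
    result = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
    return result.returncode == 0


# Hypothetical usage; replace the key path and user with the deployed values:
#   cmd = gerrit_ssh_check_cmd("/path/to/zuul_gerrit_ed25519", "zuul", "gerrit.wikimedia.org")
#   key_is_accepted(cmd)
```

Until the key is registered for the account, this is expected to fail in the same way as the paramiko error above.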