
scap deploy --init fails on the spare deployment server due to a global scap lock being held
Closed, Resolved (Public)

Description

While setting up the scap configuration for integration/docroot (T256005), Puppet fails on the spare deployment server (deploy2001.codfw.wmnet):

/usr/bin/scap deploy --init fails on deploy2001
Error: Execution of '/usr/bin/scap deploy --init' returned 70:
 14:25:38 deploy failed: <LockFailedError> Failed to acquire lock "/var/lock/scap-global-lock"; owner is "root"; reason is "Not the active deployment server, use deploy1001.eqiad.wmnet

The lock file is created by Puppet:

modules/profile/manifests/mediawiki/deployment/server.pp
if $deploy_ensure == 'present' {
    # Lock the passive servers, leave untouched the active one.
    file { '/var/lock/scap-global-lock':
        ensure  => 'present',
        owner   => 'root',
        group   => 'root',
        content => "Not the active deployment server, use ${main_deployment_server}",
    }
}

The workaround is to rm the lock file (/var/lock/scap-global-lock).
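
A minimal sketch of that workaround, assuming it is run as root on the spare server. The helper name clear_scap_lock is hypothetical and not part of scap; the default path is the one from the error message, and the lock content is assumed to be the "reason" text that scap prints, since the two strings match in the output and the manifest above.

```shell
#!/bin/sh
# Hypothetical helper, not part of scap: print why the lock is held,
# then remove it. The default path comes from the error message above.
clear_scap_lock() {
    lock="${1:-/var/lock/scap-global-lock}"
    if [ ! -e "$lock" ]; then
        echo "no lock at $lock"
        return 0
    fi
    # The file content appears to be the "reason" that scap reports.
    echo "lock reason: $(cat "$lock")"
    rm -f -- "$lock"
}

# Usage on the spare server (needs root, since the file is owned by root):
#   clear_scap_lock && /usr/bin/scap deploy --init
```

Because the manifest declares the file with ensure => 'present', Puppet will presumably recreate the lock on a later run for as long as that resource applies to the server, so this only unblocks a single manual init.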

Event Timeline

hashar renamed this task from "scap deploy --init fails on the sapre deployment server due to a global scap lock being held" to "scap deploy --init fails on the spare deployment server due to a global scap lock being held". Jul 7 2020, 3:53 PM
hashar claimed this task.

The primary server was not fully provisioned and no deployment had occurred yet. The suspicion is that Puppet tried to init the repository on deploy2001 because the directory was not present (it had never been deployed to), which led to the scap global lock error when running scap deploy --init.

I guess it is a race condition: Puppet ran on the spare deployment server before it had fully completed on the primary.

It eventually settles down after the first successful Puppet run/deployment.