scap fails deploying integration/docroot
Closed, Resolved · Public

Description

Scap is no longer able to deploy integration/docroot and fails with:

ERR
target.contint2002.wikimedia.org.deploy-local: deploy-local failed: <RuntimeError> {}

And the ssh.job yields a warning:

['/usr/bin/scap', 'deploy-local', '-v', '--repo', 'integration/docroot', '-g', 'default', 'rollback', '--refresh-config']
(ran as deploy-ci-docroot@contint2002.wikimedia.org)
returned [70]: Registering scripts in directory '/srv/deployment/integration/docroot-cache/revs/eee90e66f005e683a407f22a30f3b624d3ca8aa2/scap/scripts'
Unhandled error:
deploy-local failed: <RuntimeError> {}

That output is from a rollback. When doing a deploy, the error was:

['/usr/bin/scap', 'deploy-local', '-v', '--repo', 'integration/docroot', '-g', 'default', 'fetch', '--refresh-config'] (ran as deploy-ci-docroot@doc2002.codfw.wmnet) returned [70]: Registering scripts in directory '/srv/deployment/integration/docroot-cache/revs/eee90e66f005e683a407f22a30f3b624d3ca8aa2/scap/scripts'
Fetch from: http://deploy1002.eqiad.wmnet/integration/docroot/.git
Running ['git', 'rev-parse', '--is-inside-work-tree'] with {'cwd': '/srv/deployment/integration/docroot-cache/cache', 'stdout': -1, 'stderr': -1, 'text': True, 'stdin': -3}
Running ['git', 'rev-parse', '--is-inside-work-tree'] with {'cwd': '/srv/deployment/integration/docroot-cache/cache', 'stdout': -1, 'stderr': -1, 'text': True, 'stdin': -3}
Running ['git', 'rev-parse', '--is-inside-work-tree'] with {'cwd': '/srv/deployment/integration/docroot-cache/cache', 'stdout': -1, 'stderr': -1, 'text': True, 'stdin': -3}
Running ['git', 'remote', 'set-url', 'origin', 'http://deploy1002.eqiad.wmnet/integration/docroot/.git'] with {'cwd': '/srv/deployment/integration/docroot-cache/cache', 'stdout': -1, 'stderr': -1, 'text': True, 'stdin': -3}
Running ['git', 'fetch', '--tags', '--jobs', '1', '--no-recurse-submodules'] with {'cwd': '/srv/deployment/integration/docroot-cache/cache', 'stdout': -1, 'stderr': -1, 'text': True, 'stdin': -3}
Running ['git', 'config', 'lfs.url', 'https://gerrit.wikimedia.org/r/integration/docroot.git/info/lfs'] with {'cwd': '/srv/deployment/integration/docroot-cache/cache', 'stdout': -1, 'stderr': -1, 'text': True, 'stdin': -3}
Running ['git', 'rev-parse', '--is-inside-work-tree'] with {'cwd': '/srv/deployment/integration/docroot-cache/revs/eee90e66f005e683a407f22a30f3b624d3ca8aa2', 'stdout': -1, 'stderr': -1, 'text': True, 'stdin': -3}
Unhandled error:
{"name": "deploy-local", "msg": "%s failed: <%s> %s", "args": ["deploy-local", "FileNotFoundError", {}], "levelno": 40, "filename": "cli.py", "exc_text": null, "lineno": 419, "funcName": "_handle_exception", "created": 1717683491.0494113, "msecs": 49.41129684448242, "relativeCreated": 585.4141712188721}

And even before that:

deploy-local failed: <FileNotFoundError> [Errno 2] No such file or directory: '/srv/deployment/integration/docroot-cache/revs/eee90e66f005e683a407f22a30f3b624d3ca8aa2'

Thus I guess the local cache is borked somehow?

Details

Related Changes in GitLab:
Title | Reference | Author | Source Branch | Dest Branch
Revert "git.py: Make is_dir() work for submodule directories" | repos/releng/scap!342 | dancy | master-I3a6881810cf8c29e233fd6db04667d6363dead5d | master

Event Timeline

To reproduce from the deployment server:

cd /srv/deployment/integration/docroot
scap deploy --no-log-message --verbose --limit doc1003.eqiad.wmnet

A copy-paste of the warning lines, excluding the verbose ssh lines:

15:01:28 ['/usr/bin/scap', 'deploy-local', '-v', '--repo', 'integration/docroot', '-g', 'default', 'fetch', '--refresh-config'] (ran as deploy-ci-docroot@doc1003.eqiad.wmnet) returned [70]: OpenSSH_7.9p1 Debian-10+deb10u4, OpenSSL 1.1.1n  15 Mar 2022
debug1: Sending command: /usr/bin/scap deploy-local -v --repo integration/docroot -g default fetch --refresh-config
Registering scripts in directory '/srv/deployment/integration/docroot-cache/revs/eee90e66f005e683a407f22a30f3b624d3ca8aa2/scap/scripts'
Fetch from: http://deploy1002.eqiad.wmnet/integration/docroot/.git
Running ['git', 'rev-parse', '--is-inside-work-tree'] with {'cwd': '/srv/deployment/integration/docroot-cache/cache', 'stdout': -1, 'stderr': -1, 'text': True, 'stdin': -3}
Running ['git', 'rev-parse', '--is-inside-work-tree'] with {'cwd': '/srv/deployment/integration/docroot-cache/cache', 'stdout': -1, 'stderr': -1, 'text': True, 'stdin': -3}
Running ['git', 'rev-parse', '--is-inside-work-tree'] with {'cwd': '/srv/deployment/integration/docroot-cache/cache', 'stdout': -1, 'stderr': -1, 'text': True, 'stdin': -3}
Running ['git', 'remote', 'set-url', 'origin', 'http://deploy1002.eqiad.wmnet/integration/docroot/.git'] with {'cwd': '/srv/deployment/integration/docroot-cache/cache', 'stdout': -1, 'stderr': -1, 'text': True, 'stdin': -3}
Running ['git', 'fetch', '--tags', '--jobs', '1', '--no-recurse-submodules'] with {'cwd': '/srv/deployment/integration/docroot-cache/cache', 'stdout': -1, 'stderr': -1, 'text': True, 'stdin': -3}
Running ['git', 'config', 'lfs.url', 'https://gerrit.wikimedia.org/r/integration/docroot.git/info/lfs'] with {'cwd': '/srv/deployment/integration/docroot-cache/cache', 'stdout': -1, 'stderr': -1, 'text': True, 'stdin': -3}
Running ['git', 'rev-parse', '--is-inside-work-tree'] with {'cwd': '/srv/deployment/integration/docroot-cache/revs/eee90e66f005e683a407f22a30f3b624d3ca8aa2', 'stdout': -1, 'stderr': -1, 'text': True, 'stdin': -3}
Unhandled error:
deploy-local failed: <FileNotFoundError> {}

debug1: Exit status 70

That last directory, /srv/deployment/integration/docroot-cache/revs/eee90e66f005e683a407f22a30f3b624d3ca8aa2, does not exist on doc1003. That is the sha1 of the revision I am trying to deploy, though.

The commands:

In /srv/deployment/integration/docroot-cache/cache:

git rev-parse --is-inside-work-tree
git rev-parse --is-inside-work-tree
git rev-parse --is-inside-work-tree
git remote set-url origin http://deploy1002.eqiad.wmnet/integration/docroot/.git
git fetch --tags --jobs 1 --no-recurse-submodules
git config lfs.url https://gerrit.wikimedia.org/r/integration/docroot.git/info/lfs

Then, in /srv/deployment/integration/docroot-cache/revs/eee90e66f005e683a407f22a30f3b624d3ca8aa2, it tries to run git rev-parse --is-inside-work-tree, which fails since the directory does not exist.

That is caused by f8ace7db909382e106a52bf4473d303f946670fa by @dancy, which changes the way the directory is detected:

Previously, the Python conditionals started with os.path.isdir(git_path), which would return early when the directory does not exist (that is the case in the error above).

The new code runs git rev-parse --is-inside-work-tree and catches FailedCommand, but it lets FileNotFoundError bubble up, which causes this issue.
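To illustrate the failure mode described above, here is a minimal standalone sketch. It is not scap's actual code: the function names are hypothetical, and subprocess.CalledProcessError stands in for scap's FailedCommand. It relies on the fact that Python raises FileNotFoundError before git ever runs when the cwd passed to subprocess does not exist.

```python
import os
import subprocess

# The same probe that appears in the debug log above.
GIT_CHECK = ["git", "rev-parse", "--is-inside-work-tree"]


def is_work_tree_unguarded(path):
    """Hypothetical helper mirroring the regression: only a failing git
    command is handled, so a missing cwd raises FileNotFoundError."""
    try:
        result = subprocess.run(
            GIT_CHECK,
            cwd=path,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            text=True,
            check=True,
        )
    except subprocess.CalledProcessError:
        # git ran but exited non-zero (e.g. not a repository).
        return False
    return result.stdout.strip() == "true"


def is_work_tree_guarded(path):
    """Hypothetical helper restoring the early return: check that the
    directory exists before asking git about it."""
    if not os.path.isdir(path):
        return False
    try:
        return is_work_tree_unguarded(path)
    except FileNotFoundError:
        # Assumption: a missing git binary is also treated as "not a work tree".
        return False
```

Calling is_work_tree_unguarded() on a revs directory that has not been created yet raises FileNotFoundError, matching the error in the task description, while is_work_tree_guarded() simply returns False.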

Also, from the debug log above, git rev-parse --is-inside-work-tree is run three times in a row in the cache directory, which sounds suboptimal.

@hashar Sorry about the bug. I will fix or revert right away.

dancy claimed this task.
dancy triaged this task as Unbreak Now! priority.