Error: Execution of '/usr/bin/deploy-local --repo ores/deploy -D log_json:False' returned 70: http://deployment-tin.deployment-prep.eqiad.wmflabs/ores/deploy/.git
Error: Execution of '/usr/bin/deploy-local --repo analytics/aqs/deploy -D log_json:False' returned 70
Status | Assigned | Task |
---|---|---|
Resolved | EddieGP | T132259 Deployment-prep hosts with puppet errors (tracking) |
Resolved | None | T116206 Set up AQS in Beta |
Resolved | Ladsgroup | T132267 deployment-((sca|aqs)01|ores-web) puppet failures due to scap3 errors |
Event Timeline
This is likely because .git/DEPLOY_HEAD does not exist for these repos on deployment-tin
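A quick way to confirm this on deployment-tin might look like the sketch below. The helper name `has_deploy_head` is hypothetical, and the `/srv/deployment` base directory is an assumption taken from the paths mentioned in this task.

```shell
#!/bin/sh
# Hypothetical helper: succeed if <base_dir>/<repo>/.git/DEPLOY_HEAD exists.
# base_dir defaults to /srv/deployment (assumed layout on deployment-tin).
has_deploy_head() {
    repo="$1"
    base_dir="${2:-/srv/deployment}"
    [ -e "$base_dir/$repo/.git/DEPLOY_HEAD" ]
}

# Report the state of the two repos named in the errors above.
for repo in ores/deploy analytics/aqs/deploy; do
    if has_deploy_head "$repo"; then
        echo "$repo: DEPLOY_HEAD present"
    else
        echo "$repo: DEPLOY_HEAD missing"
    fi
done
```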
aqs deployment failures
It looks like the scap puppet provider is attempting to deploy analytics/aqs/deploy from deployment-tin; however there is no /srv/deployment/analytics/aqs/deploy on deployment-tin.
However, I believe some recently merged puppet changes should be creating this repo on deployment-tin. This needs more investigation.
ores deployment failures
I'm not clear on why the ores deploy is failing. I've been working on deployment-sca01. One problem is that the puppet provider assumes that the deployed repository on the target box will be under /srv/deployment: https://github.com/wikimedia/operations-puppet/blob/production/modules/scap/lib/puppet/provider/package/scap3.rb#L35
Since ores is deploying to /srv/ores, the deploy may be succeeding when run as the deploy-service user:
deploy-service@deployment-sca01:~$ /usr/bin/deploy-local --repo ores/deploy -D log_json:False
14:55:44 INFO - Starting new HTTP connection (1): deployment-tin.deployment-prep.eqiad.wmflabs
http://deployment-tin.deployment-prep.eqiad.wmflabs/ores/deploy/.git
From http://deployment-tin.deployment-prep.eqiad.wmflabs/ores/deploy/
 * [new branch] master -> origin/master
 * [new branch] prod -> origin/prod
 * [new branch] scap_again -> origin/scap_again
14:55:44 Revision directory already exists (use --force to override)
14:55:44 Starting new HTTP connection (1): deployment-tin.deployment-prep.eqiad.wmflabs
14:55:44 /srv/ores/deploy-cache/revs/19beee5882382ed2ea92492e1db77b94a5bf2751 is already live (use --force to override)
deploy-service@deployment-sca01:~$ echo $?
0
deploy-service@deployment-sca01:~$ ls -l /srv/ores/
total 8
lrwxrwxrwx 1 deploy-service deploy-service 58 Apr 3 03:06 deploy -> deploy-cache/revs/19beee5882382ed2ea92492e1db77b94a5bf2751
-rwxrwxr-x 1 deploy-service deploy-service 0 Apr 1 03:16 deploy.2016-04-01T03:23:00.175110
drwxrwxr-x 4 deploy-service deploy-service 4096 Apr 3 03:07 deploy-cache
drwxrwxr-x 6 deploy-service deploy-service 4096 Apr 3 03:06 venv
But puppet querying to see if ores/deploy is installed will fail: https://github.com/wikimedia/operations-puppet/blob/production/modules/scap/lib/puppet/provider/package/scap3.rb#L94-L105
deploy-service@deployment-sca01:~$ git -C /srv/ores/deploy tag --points-at HEAD
scap/sync/2016-04-03/0001
deploy-service@deployment-sca01:~$ git -C /srv/deployment/ores/deploy tag --points-at HEAD
fatal: Cannot change to '/srv/deployment/ores/deploy': No such file or directory
deploy-service@deployment-sca01:~$ echo $?
128
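To compare both locations in one go, something like the following sketch could be used. It is not the provider's actual logic, only a shell approximation of the two commands above; the function names are hypothetical, and it assumes a deployed checkout carries a `scap/sync/...` tag pointing at HEAD, as seen in the first command's output.

```shell
#!/bin/sh
# Hypothetical check: succeed if a scap/sync/* tag points at HEAD in the
# given directory. A missing directory or non-git path counts as failure.
has_sync_tag() {
    dir="$1"
    git -C "$dir" tag --points-at HEAD 2>/dev/null | grep -q '^scap/sync/'
}

# Print a one-line verdict for each directory passed in.
check_dirs() {
    for dir in "$@"; do
        if has_sync_tag "$dir"; then
            echo "$dir: deployed"
        else
            echo "$dir: not deployed (or not a git checkout)"
        fi
    done
}

# The directory ores actually uses vs. the one the provider expects.
check_dirs /srv/ores/deploy /srv/deployment/ores/deploy
```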
Given that puppet fails with an exit code of 70, it seems likely that scap itself is failing (70 is the exception exit code). However, since I'm able to run the command that puppet is nominally running, as the user that is nominally running it, I'm not sure what could be wrong here. @mmodell @dduvall, do you have any thoughts on this one?
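For anyone following along, here is a small sketch that maps the exit codes seen in this task to what they were taken to mean here. The function name `explain_rc` is hypothetical, and the meanings are only those discussed in this thread, not an exhaustive list.

```shell
#!/bin/sh
# Hypothetical helper: translate an exit code seen in this task into words.
explain_rc() {
    case "$1" in
        0)   echo "success" ;;
        70)  echo "unhandled exception in scap (per this thread)" ;;
        128) echo "git fatal error (e.g. directory does not exist)" ;;
        *)   echo "unrecognized exit code $1" ;;
    esac
}

# Usage (not run here, since it needs a scap target host):
#   /usr/bin/deploy-local --repo ores/deploy -D log_json:False
#   explain_rc $?
```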
The first thing that should likely be done for ORES is to change git_deploy_dir in scap.cfg to /srv/deployment/, since that's what the puppet provider expects. It is unclear whether the mismatch is causing this particular error, but it will certainly cause an error.
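A rough way to spot this mismatch is sketched below. It assumes scap.cfg is an INI-style file with a `git_deploy_dir: <path>` (or `=`) line; the function names and the config path in the usage comment are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the value of the first "key: value" or
# "key = value" line for key $2 in file $1.
cfg_value() {
    sed -n "s/^[[:space:]]*$2[[:space:]]*[:=][[:space:]]*//p" "$1" | head -n 1
}

# Compare git_deploy_dir against the directory the puppet provider expects
# (/srv/deployment by default). Trailing slashes are ignored.
check_git_deploy_dir() {
    cfg="$1"
    expected="${2:-/srv/deployment}"
    expected="${expected%/}"
    actual=$(cfg_value "$cfg" git_deploy_dir)
    actual="${actual%/}"
    if [ "$actual" = "$expected" ]; then
        echo "ok: git_deploy_dir is $actual"
    else
        echo "mismatch: git_deploy_dir is '$actual', expected '$expected'"
        return 1
    fi
}

# Usage (the path to scap.cfg is an assumption):
#   check_git_deploy_dir /path/to/ores/deploy/scap/scap.cfg
```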
@thcipriani I was dealing with this issue while I was working on using scap, and I thought I had solved it. Here's what I've got so far: 70 means an unhandled error. Running the command as the deploy-service user won't reproduce the issue: it works that way, but not with puppet (puppet actually logs in as the deploy-service user). With some modifications to scap itself (which I told you about), I was able to get more signal from puppet runs. It's a known issue with git clone (the deploy fails when it tries to do the git clone). It happens because puppet (the puppet master, or puppet agent if that's how you're running it) can't change directory, so if you run the puppet agent from another directory, say "/srv" or even simply "/", it should work. I just did that and it worked like a charm. So, long story short, you should run puppet from somewhere other than the user's home directory:
ladsgroup@deployment-sca01:/srv$ sudo puppet agent -tv
Info: Retrieving pluginfacts
Info: Retrieving plugin
Info: Loading facts
Info: Caching catalog for deployment-sca01.deployment-prep.eqiad.wmflabs
Info: Applying configuration version '1460479826'
Error: Could not set home on user[citoid]: Execution of '/usr/sbin/usermod -d /nonexistent citoid' returned 8: usermod: user citoid is currently used by process 455
Error: /Stage[main]/Citoid/Service::Node[citoid]/User[citoid]/home: change from /home/citoid to /nonexistent failed: Could not set home on user[citoid]: Execution of '/usr/sbin/usermod -d /nonexistent citoid' returned 8: usermod: user citoid is currently used by process 455
Notice: /Stage[main]/Citoid/Service::Node[citoid]/File[/var/log/citoid]: Dependency User[citoid] has failures: true
Warning: /Stage[main]/Citoid/Service::Node[citoid]/File[/var/log/citoid]: Skipping because of failed dependencies
Error: Could not set home on user[graphoid]: Execution of '/usr/sbin/usermod -d /nonexistent graphoid' returned 8: usermod: user graphoid is currently used by process 457
Error: /Stage[main]/Graphoid/Service::Node[graphoid]/User[graphoid]/home: change from /home/graphoid to /nonexistent failed: Could not set home on user[graphoid]: Execution of '/usr/sbin/usermod -d /nonexistent graphoid' returned 8: usermod: user graphoid is currently used by process 457
Notice: /Stage[main]/Graphoid/Service::Node[graphoid]/File[/var/log/graphoid]: Dependency User[graphoid] has failures: true
Warning: /Stage[main]/Graphoid/Service::Node[graphoid]/File[/var/log/graphoid]: Skipping because of failed dependencies
Notice: /Stage[main]/Citoid/Service::Node[citoid]/Base::Service_unit[citoid]/Service[citoid]: Dependency User[citoid] has failures: true
Warning: /Stage[main]/Citoid/Service::Node[citoid]/Base::Service_unit[citoid]/Service[citoid]: Skipping because of failed dependencies
Notice: /Stage[main]/Graphoid/Service::Node[graphoid]/Base::Service_unit[graphoid]/Service[graphoid]: Dependency User[graphoid] has failures: true
Warning: /Stage[main]/Graphoid/Service::Node[graphoid]/Base::Service_unit[graphoid]/Service[graphoid]: Skipping because of failed dependencies
Notice: /Stage[main]/Ores::Scapdeploy/Scap::Target[ores/deploy]/Package[ores/deploy]/ensure: created
Notice: Finished catalog run in 65.30 seconds
I hope this is helpful.
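The workaround of running puppet from a different directory can be wrapped in a small helper so the caller's own working directory is left alone. This is only a sketch; `run_from` is a hypothetical name, and the subshell is a design choice to keep the `cd` from leaking into the calling shell.

```shell
#!/bin/sh
# Hypothetical helper: run a command with its working directory set to $1,
# inside a subshell so the calling shell's directory does not change.
run_from() {
    dir="$1"
    shift
    ( cd "$dir" && "$@" )
}

# Usage, mirroring the run above (not executed here, needs sudo and puppet):
#   run_from / sudo puppet agent -tv
```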
Change 282992 had a related patch set uploaded (by Ladsgroup):
scap: A basic workaround for the git clone issue
I cherry-picked this patch into the beta puppetmaster. So the issue on ORES should be resolved by now.
Change 282992 merged by Alexandros Kosiaris:
scap: A basic workaround for the git clone issue
Puppet status:
host | detail |
---|---|
deployment-sca01 | Success |
deployment-aqs01 | deployment repo missing from deployment-tin: /srv/deployment/analytics/aqs/deploy does not exist. |
I tried setting up the AQS deploy repository to fix aqs01 but it's missing .git/DEPLOY_HEAD?
That file is created by deploy --init (or by actually running a full deploy). It's still a manual step at the moment.
Change 285535 had a related patch set uploaded (by Alex Monk):
Fix up deployment-prep scap config