Puppet Trebuchet provider compares refname with commit sha1 and does NOT refresh the git repo!
Closed, ResolvedPublic

Description

Spotted on the beta cluster instance deployment-jobrunner01.eqiad.wmflabs (T76999):

Notice: /Stage[main]/Mediawiki::Jobrunner/Package[jobrunner]/ensure:
 ensure changed '5c927f9091f446452b9fd7bcb69614c7a7fe6eff'
 to 'origin/srv/deployment/jobrunner/jobrunner'

Info: /Stage[main]/Mediawiki::Jobrunner/Package[jobrunner]: Scheduling refresh of Service[jobrunner]
Notice: /Stage[main]/Mediawiki::Jobrunner/Service[jobrunner]: Triggered 'refresh' from 1 events

On the instance the git repo for jobrunner is under /srv/deployment/jobrunner/jobrunner and:

$ git rev-parse HEAD
5c927f9091f446452b9fd7bcb69614c7a7fe6eff

It seems that the Trebuchet Ruby provider should resolve the reference origin/srv/deployment/jobrunner/jobrunner to a sha1 before comparing it with the local HEAD.
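
The comparison suggested above can be sketched in shell as follows. This is only an illustration, not the Trebuchet provider's code (which is Ruby): the function names same_commit, resolve_commit and needs_update are made up here, and the sketch assumes a git recent enough to support -C.

```shell
# Hypothetical helpers for illustration; not taken from Trebuchet.

# same_commit <sha1> <sha1>: succeed only if both are non-empty and identical.
same_commit() {
    [ -n "$1" ] && [ "$1" = "$2" ]
}

# resolve_commit <repo-dir> <ref>: print the full commit sha1 that <ref> points at.
# The "^{commit}" suffix peels tags down to a commit; unknown refs make it fail.
resolve_commit() {
    git -C "$1" rev-parse --verify --quiet "${2}^{commit}"
}

# needs_update <repo-dir> <target-ref>
# Returns 0 when HEAD and <target-ref> are different commits, 1 when they are
# the same commit, and 2 when either side cannot be resolved.
needs_update() {
    local_sha=$(resolve_commit "$1" HEAD) || return 2
    target_sha=$(resolve_commit "$1" "$2") || return 2
    ! same_commit "$local_sha" "$target_sha"
}
```

Comparing the raw strings from the Puppet notice (a sha1 against a ref name) can never match, which is why the resource is reported as changed on every run; resolving both sides first avoids that.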

Additionally, I found that the repository is NOT up to date:

$ git rev-parse origin/HEAD
13852d0b77bd5fadaecb45fd3f0fc5556bea8407

$ git log --decorate --oneline HEAD^..origin/HEAD
13852d0 (origin/master, origin/HEAD) Add logging to track down OOM
         ^^^^^^^^^^^^^^^^^^^^^^^^^^
23acfaf Removed obsolete TODO comment
316593f Use LLEN instead of lSize (a binding alias)
d426235 Recover better from aggregator server network partitions
ff2be2e Merge "Various documentation updates"
3e29c22 Removed extra "tries" increment in the redis job queue
d2b0ba7 Various documentation updates
0b6fcc5 Timestamp debug log messages, too.
794bd88 Update path reference for /srv/mediawiki
f5fde2a Avoid a few notices if proc_open() fails
19c880d Lower the delay when no jobs are available
795baf3 Configuration rewrite, part 2
5c927f9 (HEAD, master) trim() the stderr output too
         ^^^^^^^^^^^^
$

Event Timeline

hashar raised the priority of this task from to Needs Triage.
hashar updated the task description.
hashar added projects: acl*sre-team, Puppet.
hashar changed Security from none to None.
hashar added subscribers: Aklapper, hashar, ori.
fgiunchedi triaged this task as High priority. Dec 23 2014, 11:53 AM
fgiunchedi added a subscriber: fgiunchedi.

Who is the appropriate assignee for this? I have no idea.

hashar assigned this task to ori. Jan 7 2015, 3:02 PM

who is the appropriate assignee for this? I have no idea.

Ori wrote the Puppet Ruby provider for Trebuchet. Ori, would you mind investigating further, please?

ori added a comment. Jan 15 2015, 3:49 AM

Can't debug this at the moment because Puppet is broken on the relevant host:

Error: Could not retrieve catalog from remote server: Error 400 on SERVER:
Duplicate declaration: Sudo::Group[ops] is already declared in file
/etc/puppet/manifests/role/labs.pp:12; cannot redeclare at
/etc/puppet/modules/admin/manifests/group.pp:39 on node i-0000022e.eqiad.wmflabs
chasemp lowered the priority of this task from High to Medium. Mar 11 2015, 8:57 PM
chasemp edited projects, added Cloud-Services; removed acl*sre-team.
ori added a comment. Jul 23 2015, 1:20 AM

@hashar is this still an issue?

hashar closed this task as Resolved. Jul 24 2015, 1:24 PM

I ran Puppet with --debug on deployment-jobrunner01.deployment-prep.eqiad.wmflabs and inserted, below each git command, the sha1 it reports (the --> lines).

Debug: Executing '/usr/bin/salt-call --log-level=quiet --out=json --local grains.get deployment_target'
Debug: Executing '/usr/bin/git --git-dir /srv/deployment/jobrunner/jobrunner/.git rev-parse HEAD'
                 --> 9e42f5b0ae056533b8ca27a67ed98b9219c67dbc

Debug: Executing '/usr/bin/salt-call --log-level=quiet --out=json --local grains.get trebuchet_master'
Debug: Executing '/usr/bin/git --git-dir /srv/deployment/jobrunner/jobrunner/.git ls-remote origin --tags refs/tags/jobrunner/jobrunner-sync-20150529-071029'
                 --> 9e42f5b0ae056533b8ca27a67ed98b9219c67dbc	refs/tags/jobrunner/jobrunner-sync-20150529-071029

Debug: Executing '/usr/bin/git --git-dir /srv/deployment/jobrunner/jobrunner/.git rev-parse HEAD'
                 --> 9e42f5b0ae056533b8ca27a67ed98b9219c67dbc
Debug: Executing '/usr/bin/salt-call --log-level=quiet --out=json --local service.status salt-minion'

And it no longer refreshes the service \O/ It seems the comparison is now smarter: it compares the sha1 of HEAD with the sha1 of the tag on the deployment server.

So yeah, fixed somehow in Trebuchet. Kudos!
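
For reference, the check visible in the debug output (sha1 of HEAD against the sha1 that ls-remote returns for the sync tag) could look roughly like this in shell. The function names are hypothetical, and the sketch assumes a lightweight tag, as in the output above where the tag sha1 equals the commit sha1.

```shell
# Hypothetical helpers for illustration; not taken from Trebuchet.

# ls_remote_sha: read `git ls-remote` output on stdin and print the sha1
# from its first line. Each line has the form "<sha1><TAB><refname>".
ls_remote_sha() {
    head -n 1 | cut -f 1
}

# head_matches_tag <git-dir> <tag-name>
# Succeeds when the local HEAD sha1 equals the sha1 that the remote "origin"
# advertises for refs/tags/<tag-name>.
head_matches_tag() {
    git_dir=$1
    tag=$2
    head_sha=$(git --git-dir "$git_dir" rev-parse HEAD) || return 2
    tag_sha=$(git --git-dir "$git_dir" ls-remote origin --tags "refs/tags/$tag" | ls_remote_sha)
    # An empty result means the remote does not know the tag.
    [ -n "$tag_sha" ] && [ "$head_sha" = "$tag_sha" ]
}
```

On this instance it would be called as head_matches_tag /srv/deployment/jobrunner/jobrunner/.git jobrunner/jobrunner-sync-20150529-071029.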


Looking at a simplified graph of the git repo (--simplify-by-decoration):

git log --simplify-by-decoration  --oneline --decorate --graph HEAD master origin/master origin/HEAD jobrunner/jobrunner-sync-20150529-071029
* 9e42f5b (HEAD, tag: jobrunner/jobrunner-sync-20150529-071029) Added PeriodicScriptParamsIterator class to avoid OOMs
* 025ad8d (tag: jobrunner/jobrunner-sync-20150527-210055, tag: jobrunner/jobrunner-start-20150527-210155, tag: jobrunner/jobrunner-start-20150527-210049, origin/master, origin/HEAD) Revert "
* fa62a0e (tag: jobrunner/jobrunner-sync-20150204-145353, master) Make setting port number in statsd config optional
* 5c927f9 (tag: jobrunner/jobrunner-sync-20140808-143026, tag: jobrunner/jobrunner-start-20150204-145331) trim() the stderr output too
* 949ebce (tag: jobrunner/jobrunner-start-20140808-143005) Merge "Added UDP stat calls to match the MediaWiki Redis queue class"
* 3f98e18 Initial commit

Neither origin/master and origin/HEAD nor the local master branch have been updated. The repository is on the proper tag, albeit in detached HEAD mode: * (detached from jobrunner/jobrunner-sync-20150529-071029)

Probably not an issue, though it could potentially confuse people.
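
A quick way to tell whether a checkout is in this detached state is git symbolic-ref, which exits with status 1 when HEAD is not a symbolic ref to a branch. A minimal sketch, with is_detached being a made-up name:

```shell
# is_detached <repo-dir>: succeed when HEAD points directly at a commit
# instead of at a branch (hypothetical helper name).
is_detached() {
    status=0
    # With -q, symbolic-ref stays silent and exits 1 on a detached HEAD;
    # other failures (for example, not a repository) use a different status.
    git -C "$1" symbolic-ref -q HEAD >/dev/null 2>&1 || status=$?
    [ "$status" -eq 1 ]
}
```

For example, is_detached /srv/deployment/jobrunner/jobrunner && echo "detached HEAD" would flag the state described above.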


Closing this task since HEAD / deployment tag is now properly handled and consistent.