Puppet on stat1003 keeps failing for git errors
Closed, Resolved, Public

Description

Puppet on stat1003 keeps failing:

Notice: /Stage[main]/Statistics::User/User[stats]/groups: groups changed '' to 'wikidev'
Notice: /Stage[main]/Geowiki::Job::Limn/Git::Clone[geowiki-data-public]/Exec[git_pull_geowiki-data-public]/returns: Permission denied (publickey).
Notice: /Stage[main]/Geowiki::Job::Limn/Git::Clone[geowiki-data-public]/Exec[git_pull_geowiki-data-public]/returns: fatal: Could not read from remote repository.
Notice: /Stage[main]/Geowiki::Job::Limn/Git::Clone[geowiki-data-public]/Exec[git_pull_geowiki-data-public]/returns:
Notice: /Stage[main]/Geowiki::Job::Limn/Git::Clone[geowiki-data-public]/Exec[git_pull_geowiki-data-public]/returns: Please make sure you have the correct access rights
Notice: /Stage[main]/Geowiki::Job::Limn/Git::Clone[geowiki-data-public]/Exec[git_pull_geowiki-data-public]/returns: and the repository exists.
Error: /usr/bin/git  pull --quiet returned 1 instead of one of [0]
Error: /Stage[main]/Geowiki::Job::Limn/Git::Clone[geowiki-data-public]/Exec[git_pull_geowiki-data-public]/returns: change from notrun to 0 failed: /usr/bin/git  pull --quiet returned 1 instead of one of [0]
Notice: /Stage[main]/Geowiki::Job::Limn/Cron[geowiki-process-db-to-limn]: Dependency Exec[git_pull_geowiki-data-public] has failures: true
Warning: /Stage[main]/Geowiki::Job::Limn/Cron[geowiki-process-db-to-limn]: Skipping because of failed dependencies

Also, commits to the geowiki-data-public repository stopped a few days ago (probably when @Ottomata changed the LDAP email?):

https://github.com/wikimedia/analytics-geowiki-data-public/commits/master

I have no idea if the two things are connected, but puppet needs to run of course :)

Last but not least, I am seeing a lot of CPU temperature alarms on this machine (it has been running hot for days).

Event Timeline

> Last but not least, I am seeing a lot of CPU temperature alarms on this machine (it has been running hot for days).

You should open an ops-eqiad ticket to have thermal paste applied.

I think you're right, @elukey. Right now the remote it's trying to pull from is configured as ssh://gerrit.wikimedia.org:29418/analytics/geowiki/data-public.git, which I didn't even know worked (I thought you always had to specify a user, like milimetric@gerrit...). So I guess if it was running under Andrew's user name, it was working regardless of the ssh key problem?

@Milimetric I think that it is running with the "stats" username:

git::clone { 'geowiki-data-public':
    ensure    => 'latest',
    directory => $::geowiki::params::public_data_path,
    origin    => 'ssh://gerrit.wikimedia.org:29418/analytics/geowiki/data-public.git',
    owner     => $::geowiki::params::user,
    group     => $::geowiki::params::user,
}
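
On the question of which account the pull uses: when an ssh:// URL carries no user@ part, OpenSSH falls back to the local user name of the invoking process, unless a User directive in ssh_config overrides it. A minimal sketch of that rule follows. The function name effective_ssh_user is hypothetical, and the sketch deliberately ignores ssh_config.

```shell
# Hypothetical helper: print the user that ssh would log in as for an ssh:// URL.
# Assumption: no "User" directive in ssh_config overrides the default.
effective_ssh_user() {
    url=$1
    rest=${url#ssh://}          # drop the scheme
    authority=${rest%%/*}       # keep only [user@]host[:port]
    case $authority in
        *@*) printf '%s\n' "${authority%%@*}" ;;  # explicit user in the URL
        *)   id -un ;;                            # default: local user name
    esac
}

# The origin from the git::clone resource has no user@ part,
# so this prints whichever local user runs the script.
effective_ssh_user 'ssh://gerrit.wikimedia.org:29418/analytics/geowiki/data-public.git'
```

Run as the stats user, the last line would print stats, which is consistent with the pull authenticating as that account rather than as any individual's.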

I didn't find any recent changes in the private git repo, the puppet class, or gerrit.

elukey@terbium:~$ ldaplist -l passwd stats

Shows two public keys, meanwhile:

elukey@stat1003:~$ sudo -u stats ssh -vv -p 29418 gerrit.wikimedia.org

tries to use keys under /var/lib/stats/.ssh

But I am not able to figure out which public key Stats uses in gerrit:

https://gerrit.wikimedia.org/r/#/admin/groups/uuid-a89e365b66e1bf7c442e7a32bb18887658b0d198,members

The Stats gerrit password is in pwstore; I am now comparing the one used on stat1003 with the one registered in gerrit.
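
One way to compare the keys on the host with a registered key is by fingerprint, using ssh-keygen -l. A minimal sketch, assuming the public keys sit in *.pub files; the function name list_key_fingerprints is hypothetical, and the directory in the usage comment is the one mentioned in this thread.

```shell
# Hypothetical helper: print the fingerprint of every public key in a directory.
list_key_fingerprints() {
    dir=$1
    for pub in "$dir"/*.pub; do
        [ -e "$pub" ] || continue     # the glob matched nothing
        ssh-keygen -l -f "$pub"       # prints bits, fingerprint, comment, key type
    done
}

# Usage (directory taken from the thread):
#   list_key_fingerprints /var/lib/stats/.ssh
```

Saving the key registered in gerrit to a file and running ssh-keygen -l -f on it gives a fingerprint that can be checked against this output.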

Milimetric triaged this task as Unbreak Now! priority. Apr 12 2016, 4:13 PM

Added the public key from /var/lib/stats/.ssh to the Stats user in gerrit; puppet runs fine now!

@Ottomata: can you double check that everything is correct?

elukey lowered the priority of this task from Unbreak Now! to Medium. Apr 13 2016, 10:24 AM
elukey moved this task from In Progress to In Code Review on the Analytics-Kanban board.

The problem was that the stats user's git config email did not match what was in gerrit, and the user did not have Forge Committer identity rights. This was fixed on Friday.
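
A mismatch like the one described above can be spotted by comparing the committer email git would use in the checkout with the address registered in gerrit. A minimal sketch; the function name committer_email_matches is hypothetical, and the path and address in the usage comment are placeholders, not values from this task.

```shell
# Hypothetical helper: succeed only if the committer email configured for a
# checkout equals the email expected to be registered in gerrit.
committer_email_matches() {
    repo=$1
    expected=$2
    # Effective value (local, global and system config); empty if unset.
    actual=$(git -C "$repo" config user.email) || actual=''
    if [ "$actual" = "$expected" ]; then
        echo "ok: $actual"
    else
        echo "mismatch: git has '$actual', expected '$expected'"
        return 1
    fi
}

# Usage (placeholder path and address):
#   committer_email_matches /path/to/geowiki-data-public stats@example.org
```

Without the Forge Committer right, gerrit rejects pushes whose committer email is not registered to the pushing account, so a check like this has to pass for the job's commits to go through.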