
'fatal: unable to look up current user in the passwd file: no such user
Closed, Resolved · Public

Description

https://integration.wikimedia.org/ci/job/mediawiki-extensions-hhvm/62240/console

A build on integration-slave-trusty-1004 fails, apparently because it could not look up the jenkins-deploy user:

00:00:03.858 git.exc.GitCommandError: 'git remote update origin' returned with exit code 1
00:00:03.858 stderr: 'fatal: unable to look up current user in the passwd file: no such user

I cannot ssh to the instance; it prompts for a password.

Event Timeline

Restricted Application added subscribers: Zppix, Aklapper.

Mentioned in SAL [2016-05-13T10:20:49Z] <hashar> Put integration-slave-trusty-1004 offline. Ssh/passwd is borked T135217

The instance is at least reachable via salt, though it shows up twice:

salt -v 'integration-slave-trusty-1004*' test.ping
Executing job with jid 20160513102134079049
-------------------------------------------

integration-slave-trusty-1004.integration.eqiad.wmflabs:
    True
integration-slave-trusty-1004.integration.eqiad.wmflabs:
    True

$ id jenkins-deploy
id: jenkins-deploy: no such user

And from syslog:

May 13 10:25:01 integration-slave-trusty-1004 nslcd[24482]: [2dd275] <group/member="root"> ldap_start_tls_s() failed (uri=ldap://undef:389): Can't contact LDAP server
May 13 10:25:01 integration-slave-trusty-1004 nslcd[24482]: [2dd275] <group/member="root"> failed to bind to LDAP server ldap://undef:389: Can't contact LDAP server
May 13 10:25:01 integration-slave-trusty-1004 nslcd[24482]: [2dd275] <group/member="root"> ldap_start_tls_s() failed (uri=ldap://undef:389): Can't contact LDAP server
May 13 10:25:01 integration-slave-trusty-1004 nslcd[24482]: [2dd275] <group/member="root"> failed to bind to LDAP server ldap://undef:389: Can't contact LDAP server
May 13 10:25:01 integration-slave-trusty-1004 nslcd[24482]: [2dd275] <group/member="root"> no available LDAP server found: Can't contact LDAP server
May 13 10:25:01 integration-slave-trusty-1004 nslcd[24482]: [2dd275] <group/member="root"> no available LDAP server found: Server is unavailable

This is because /etc/ldap.conf has an invalid URI:

uri             ldap://undef:389 ldap://undef:389
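To check whether other config files or instances carry the same placeholder, something like the following could be used. This is only a sketch: `find_undef_ldap` is a hypothetical helper name, not an existing tool, and the paths in the usage comment are simply the three files edited in this task.

```shell
#!/bin/sh
# Hypothetical helper: print every given file whose LDAP URI still points at
# the "undef" placeholder host. Returns 0 if at least one such file was found,
# 1 otherwise. Files that do not exist are skipped silently.
find_undef_ldap() {
    found=1
    for f in "$@"; do
        [ -f "$f" ] || continue
        if grep -q 'ldap://undef' "$f"; then
            printf '%s\n' "$f"
            found=0
        fi
    done
    return "$found"
}

# Example call (paths taken from this task):
#   find_undef_ldap /etc/ldap.conf /etc/ldap/ldap.conf /etc/nslcd.conf
```

It only reads the files, so it should be safe to run through `salt ... cmd.run` before deciding whether a fix is needed.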

salt -v 'integration-slave-trusty-1004*' cmd.run 'sed -i -e "s/undef/ldap-labs.eqiad.wikimedia.org/g" /etc/ldap.conf'
salt -v 'integration-slave-trusty-1004*' cmd.run 'sed -i -e "s/undef/ldap-labs.eqiad.wikimedia.org/g" /etc/ldap/ldap.conf'
salt -v 'integration-slave-trusty-1004*' cmd.run 'sed -i -e "s/undef/ldap-labs.eqiad.wikimedia.org/g" /etc/nslcd.conf'
salt -v 'integration-slave-trusty-1004*' cmd.run 'service nslcd restart'

I ran the same sed substitutions and nslcd restart on integration-puppetmaster.
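Since `sed -i` rewrites the file in place with no way back, a slightly more cautious variant of the substitution above keeps a backup and verifies the result. This is a sketch under assumptions: `fix_ldap_host` is a hypothetical helper name, and the default host is the one used in the sed commands in this task.

```shell
#!/bin/sh
# Hypothetical helper: replace the "undef" placeholder host in one config file,
# keeping the untouched original next to it as FILE.bak.
# Usage: fix_ldap_host FILE [HOST]
fix_ldap_host() {
    file=$1
    host=${2:-ldap-labs.eqiad.wikimedia.org}
    # -i.bak edits in place and saves the previous content as FILE.bak.
    sed -i.bak -e "s/undef/$host/g" "$file" || return 1
    # Report failure if the placeholder is still present after the edit.
    if grep -q 'undef' "$file"; then
        return 1
    fi
    return 0
}

# Example call (path taken from this task):
#   fix_ldap_host /etc/nslcd.conf && service nslcd restart
```

Once nslcd has been restarted, re-running `id jenkins-deploy` is a reasonable way to confirm that user lookups work again.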

hashar claimed this task.

The Puppet agent then managed to run on integration-puppetmaster, which fixed /etc/ldap.yaml as well.

All of this was caused by a transient labs issue yesterday, which Andrew announced on the labs list.