create-dbusers service failing on labstore1004
Closed, Resolved · Public

Description

On labstore1004 the create-dbusers service fails intermittently, and Icinga alerts on the resulting degraded systemd state.

I found this error in the logs:

Nov 22 10:49:00 labstore1004 create-dbusers[27604]: No entry found for user uid=tools.admin,ou=people,ou=servicegroups,dc=wikimedia,dc=org in project tools
Nov 22 10:49:01 labstore1004 create-dbusers[27604]: Traceback (most recent call last):
Nov 22 10:49:01 labstore1004 create-dbusers[27604]: File "/usr/local/sbin/create-dbusers", line 308, in <module>
Nov 22 10:49:01 labstore1004 create-dbusers[27604]: users = User.from_ldap_users(conn, args.project)
Nov 22 10:49:01 labstore1004 create-dbusers[27604]: File "/usr/local/sbin/create-dbusers", line 123, in from_ldap_users
Nov 22 10:49:01 labstore1004 create-dbusers[27604]: attributes=['uid', 'uidNumber']
Nov 22 10:49:01 labstore1004 create-dbusers[27604]: File "/usr/lib/python3/dist-packages/ldap3/core/connection.py", line 614, in search
Nov 22 10:49:01 labstore1004 create-dbusers[27604]: response = self.post_send_search(self.send('searchRequest', request, controls))
Nov 22 10:49:01 labstore1004 create-dbusers[27604]: File "/usr/lib/python3/dist-packages/ldap3/strategy/sync.py", line 142, in post_send_search
Nov 22 10:49:01 labstore1004 create-dbusers[27604]: responses, result = self.get_response(message_id)
Nov 22 10:49:01 labstore1004 create-dbusers[27604]: File "/usr/lib/python3/dist-packages/ldap3/strategy/base.py", line 315, in get_response
Nov 22 10:49:01 labstore1004 create-dbusers[27604]: raise LDAPSessionTerminatedByServerError(self.connection.last_error)
Nov 22 10:49:01 labstore1004 create-dbusers[27604]: ldap3.core.exceptions.LDAPSessionTerminatedByServerError: session terminated by server
Nov 22 10:49:01 labstore1004 systemd[1]: create-dbusers.service: main process exited, code=exited, status=1/FAILURE
Nov 22 10:49:01 labstore1004 systemd[1]: Unit create-dbusers.service entered failed state.

Puppet restarted the service after a while. The systemd unit itself is set to Restart=no, so systemd does not restart it after a failure.

Volans created this task. Nov 22 2016, 11:06 AM
Restricted Application added a subscriber: Aklapper. Nov 22 2016, 11:06 AM

I believe this happens periodically because the script connects to both LDAP servers (to round-robin queries) to determine whether new users exist, and those servers currently restart at random because of a memory leak (T130593). We could have systemd be smarter about it.
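
Besides handling this in systemd, the script itself could tolerate a server restart by retrying the failed call. The following is only a minimal sketch of that idea, not the fix applied to create-dbusers: `retry_on`, `SessionTerminated` and `make_flaky_search` are hypothetical names, and `SessionTerminated` stands in for the `ldap3.core.exceptions.LDAPSessionTerminatedByServerError` seen in the traceback so that the example runs without an LDAP server.

```python
import time


def retry_on(exc_types, func, attempts=3, delay=1.0, sleep=time.sleep):
    """Call func(); if it raises one of exc_types, wait and try again.

    Re-raises the last exception once `attempts` calls have failed.
    """
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except exc_types:
            if attempt == attempts:
                raise
            # Linear backoff: delay, 2 * delay, ...
            sleep(delay * attempt)


# Hypothetical stand-in for ldap3's LDAPSessionTerminatedByServerError.
class SessionTerminated(Exception):
    pass


def make_flaky_search(failures):
    """Return a fake search that fails `failures` times, then succeeds."""
    state = {"calls": 0}

    def search():
        state["calls"] += 1
        if state["calls"] <= failures:
            raise SessionTerminated("session terminated by server")
        return ["uid=tools.admin"]

    return search, state


if __name__ == "__main__":
    search, state = make_flaky_search(failures=2)
    users = retry_on(SessionTerminated, search, attempts=3, delay=0.1)
    print(users, "after", state["calls"], "calls")
```

In the real script, the function passed to `retry_on` would presumably also need to re-open the LDAP connection before searching again, since the terminated session cannot be reused.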

chasemp closed this task as Resolved. Jan 3 2018, 9:22 PM
chasemp claimed this task.

Haven't seen this for a long time now.