SystemdUnitFailed: grafana-ldap-users-sync.service on grafana1002:9100
Open, Needs Triage (Public)

Description

Traceback (most recent call last):
  File "/usr/local/bin/grafana-ldap-users-sync", line 330, in <module>
    sys.exit(main())
             ^^^^^^
  File "/usr/local/bin/grafana-ldap-users-sync", line 314, in main
    syncer.sync_ldap_users(ldap_uids, role)
  File "/usr/local/bin/grafana-ldap-users-sync", line 180, in sync_ldap_users
    meta = self.ldap.uid_meta(user)
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/bin/grafana-ldap-users-sync", line 68, in uid_meta
    return self.normalize_metadata(result[0][1])
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/bin/grafana-ldap-users-sync", line 85, in normalize_metadata
    raise ValueError(f"Invalid {key}: cannot be empty or None")
ValueError: Invalid mail: cannot be empty or None

Event Timeline

tappof updated the task description.
tappof changed the edit policy from "All Users" to "acl*security (Project)".
tappof added a parent task: Restricted Task.
tappof edited projects, added Observability-Alerting; removed Security.

The bug that was eventually recognized and fixed in T339917 ("wikitech logins set the email address every time") blanked a lot of LDAP mail attributes.

bd808@tools-bastion-14.tools.eqiad1:~$ ldap '(&(objectClass=posixAccount)(!(pwdPolicySubentry=cn=disabled,ou=ppolicies,dc=wikimedia,dc=org))(!(mail=*)))' -b 'ou=people,dc=wikimedia,dc=org' dn | grep dn: | wc -l
277

That search finds 277 Developer accounts that are not disabled ((!(pwdPolicySubentry=cn=disabled,ou=ppolicies,dc=wikimedia,dc=org))) and have no mail attribute set ((!(mail=*))).
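
To make the structure of that filter easier to read, here is a small Python sketch that rebuilds it from its three clauses. The helper names (ldap_and, ldap_not) are hypothetical and only do string composition; they are not part of any LDAP library or of the sync script.

```python
# Hypothetical helpers: plain string composition of LDAP filter syntax (RFC 4515).
def ldap_and(*clauses: str) -> str:
    """Combine clauses so that all of them must match."""
    return "(&" + "".join(clauses) + ")"


def ldap_not(clause: str) -> str:
    """Negate a single clause."""
    return "(!" + clause + ")"


IS_POSIX_ACCOUNT = "(objectClass=posixAccount)"
IS_DISABLED = "(pwdPolicySubentry=cn=disabled,ou=ppolicies,dc=wikimedia,dc=org)"
HAS_MAIL = "(mail=*)"  # presence test: true when any mail value exists

# posixAccount AND NOT disabled AND NOT (mail present)
enabled_without_mail = ldap_and(
    IS_POSIX_ACCOUNT,
    ldap_not(IS_DISABLED),
    ldap_not(HAS_MAIL),
)
print(enabled_without_mail)
```

The printed string is the same filter passed to the ldap command above.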

Change #1243062 had a related patch set uploaded (by Tiziano Fogli; author: Tiziano Fogli):

[operations/puppet@production] ldap_users_sync.py: format code

https://gerrit.wikimedia.org/r/1243062

Change #1243063 had a related patch set uploaded (by Tiziano Fogli; author: Tiziano Fogli):

[operations/puppet@production] ldap_users_sync.py: add non-blocking errors handling

https://gerrit.wikimedia.org/r/1243063

hnowlan moved this task from Inbox to Radar on the SRE Observability board.
hnowlan moved this task from Radar to FY2025/2026-Q3 on the SRE Observability board.

Change #1243063 abandoned by Tiziano Fogli:

[operations/puppet@production] ldap_users_sync.py: add non-blocking errors handling

Reason:

Exploring different approaches

https://gerrit.wikimedia.org/r/1243063

Change #1247076 had a related patch set uploaded (by Tiziano Fogli; author: Tiziano Fogli):

[operations/puppet@production] grafana/ldap_users_sync: delete a user if it has invalid metadata

https://gerrit.wikimedia.org/r/1247076

Thank you @bd808.
After a discussion with the infra-foundations team (@MoritzMuehlenhoff), we’re going to apply a patch that removes users with invalid metadata from the Grafana DB.

What is still unclear to me is this: are we experiencing a new instance of the bug described in T339917 ("wikitech logins set the email address every time"), or are the 277 developer accounts that are enabled but have an empty mail attribute actually fine?
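
For illustration, a minimal sketch of the approach described in this comment. It is not the code from the merged patch: the function names, the required keys, the dict standing in for LDAP, and the set standing in for the Grafana DB are all assumptions. The point is to catch the ValueError per user, so that one blank mail attribute marks that user for removal instead of aborting the whole run.

```python
from typing import Dict, Optional, Set, Tuple

# Assumption: illustrative list of attributes that must be non-empty.
REQUIRED_KEYS = ("mail", "cn")


def normalize_metadata(raw: Dict[str, Optional[str]]) -> Dict[str, str]:
    """Return cleaned metadata; raise ValueError if a required key is blank."""
    meta = {}
    for key in REQUIRED_KEYS:
        value = (raw.get(key) or "").strip()
        if not value:
            raise ValueError(f"Invalid {key}: cannot be empty or None")
        meta[key] = value
    return meta


def sync_users(
    ldap_entries: Dict[str, Dict[str, Optional[str]]],
    grafana_users: Set[str],
) -> Tuple[Set[str], Set[str]]:
    """Return (valid uids, uids to delete from Grafana)."""
    valid: Set[str] = set()
    to_delete: Set[str] = set()
    for uid, raw in ldap_entries.items():
        try:
            normalize_metadata(raw)
        except ValueError:
            # Invalid metadata no longer stops the run; the user is only
            # scheduled for deletion if it already exists in Grafana.
            if uid in grafana_users:
                to_delete.add(uid)
            continue
        valid.add(uid)
    return valid, to_delete


entries = {
    "alice": {"mail": "alice@example.org", "cn": "Alice"},
    "bob": {"mail": None, "cn": "Bob"},
    "carol": {"mail": "  ", "cn": "Carol"},
}
valid, to_delete = sync_users(entries, grafana_users={"alice", "bob"})
```

In this example alice is kept, bob is marked for deletion because he already exists in Grafana, and carol is skipped because she was never created there.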

Change #1247523 had a related patch set uploaded (by Tiziano Fogli; author: Tiziano Fogli):

[operations/puppet@production] grafana::ldap_sync: disable systemd timer (TMP)

https://gerrit.wikimedia.org/r/1247523

Change #1243062 merged by Tiziano Fogli:

[operations/puppet@production] ldap_users_sync.py: format code

https://gerrit.wikimedia.org/r/1243062

Change #1247076 merged by Tiziano Fogli:

[operations/puppet@production] grafana/ldap_users_sync: delete a user if it has invalid metadata

https://gerrit.wikimedia.org/r/1247076

Change #1247523 merged by Tiziano Fogli:

[operations/puppet@production] grafana::ldap_sync: disable systemd timer (TMP)

https://gerrit.wikimedia.org/r/1247523

Change #1247532 had a related patch set uploaded (by Tiziano Fogli; author: Tiziano Fogli):

[operations/puppet@production] grafana/ldap_users_sync.py: add missing parameter to sync_ldap_users()

https://gerrit.wikimedia.org/r/1247532

Change #1247532 merged by Tiziano Fogli:

[operations/puppet@production] grafana/ldap_users_sync.py: add missing parameter to sync_ldap_users()

https://gerrit.wikimedia.org/r/1247532

What is still unclear to me is this: are we experiencing a new instance of the bug described in T339917 ("wikitech logins set the email address every time"), or are the 277 developer accounts that are enabled but have an empty mail attribute actually fine?

I believe this is a different scenario, arising from the use of the block function in bitu, which also removes the email address as an option. As I understand it, this is a relatively new pattern, so we are only now seeing it occur in this way.