
Auth extremely slow on clouddumps100[12]
Closed, Resolved · Public

Description

It takes around 25 seconds to log in to the new clouddumps servers. There's a similar delay on the first sudo after logging in.

Suspects:

  • ipv6 -> ldap
  • some new bad interaction with hdfs
  • something weird about nfs, which we hacked on this host to get along with hdfs?
  • kerberos

Event Timeline

Restricted Application added a subscriber: Aklapper.
Aug 24 14:52:46 clouddumps1001 dbus-daemon[1329]: [system] Failed to activate service 'org.freedesktop.login1': timed out (service_start_timeout=25000ms)
Aug 24 14:52:46 clouddumps1001 sshd[703570]: pam_systemd(sshd:session): Failed to create session: Failed to activate service 'org.freedesktop.login1': timed out (service_start_timeout=25000ms)
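
The 25-second delay lines up with the D-Bus activation timeout in the two log lines above: service_start_timeout=25000ms is 25 s, which suggests the login hangs until the activation of org.freedesktop.login1 (systemd-logind) is given up on. A minimal sketch that pulls the service name and timeout out of those lines; the regex and the helper name activation_timeouts are illustrative, not part of any existing tooling:

```python
import re

# The two journal lines quoted in this task, verbatim.
LOG_LINES = [
    "Aug 24 14:52:46 clouddumps1001 dbus-daemon[1329]: [system] Failed to activate service 'org.freedesktop.login1': timed out (service_start_timeout=25000ms)",
    "Aug 24 14:52:46 clouddumps1001 sshd[703570]: pam_systemd(sshd:session): Failed to create session: Failed to activate service 'org.freedesktop.login1': timed out (service_start_timeout=25000ms)",
]

# Matches the "Failed to activate service ... timed out" message and captures
# the D-Bus service name and the timeout in milliseconds.
TIMEOUT_RE = re.compile(
    r"Failed to activate service '([^']+)': timed out "
    r"\(service_start_timeout=(\d+)ms\)"
)


def activation_timeouts(lines):
    """Return (service, timeout_seconds) for each activation timeout found."""
    found = []
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match:
            found.append((match.group(1), int(match.group(2)) / 1000))
    return found


for service, seconds in activation_timeouts(LOG_LINES):
    print(f"{service}: activation timed out after {seconds:g} s")
```

Both lines report the same service and the same 25 s timeout, so the suspects listed in the description (ldap, kerberos) are less likely than a systemd-logind that fails to start.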

+ Moritz because I think he had a patch in the works. If not, let me know and I can likely figure it out :)

Change 826806 had a related patch set uploaded (by Muehlenhoff; author: Muehlenhoff):

[operations/puppet@production] Starting with Bullseye the systemd unit for systemd-logind uses ProtectSystem=strict, which doesn't work with HDFS and results in a failing systemd-logind service.

https://gerrit.wikimedia.org/r/826806

Can you give https://gerrit.wikimedia.org/r/c/operations/puppet/+/826806/ a shot on clouddumps? It should address this. Note that you'll need to reboot the clouddumps hosts afterwards, since systemd-logind is tricky to restart.
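
For readers without Gerrit access: the patch itself is not quoted in this task, so the following is only a sketch of what a systemd drop-in that keeps /mnt out of systemd-logind's sandbox could look like. The file name and the choice of InaccessiblePaths= are assumptions; the merged change may use a different directive or path.

```ini
# Hypothetical drop-in path:
#   /etc/systemd/system/systemd-logind.service.d/exclude-mnt.conf
[Service]
# Assumption: hide /mnt (where the HDFS mount lives) from the unit's mount
# namespace so that ProtectSystem=strict does not have to touch it.
# The leading "-" tells systemd to ignore the entry if the path is missing.
InaccessiblePaths=-/mnt
```

After a systemctl daemon-reload, systemctl show systemd-logind -p InaccessiblePaths should show whether such a drop-in was picked up; the service itself still needs the reboot mentioned above to come up with the new settings.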

Change 826806 merged by Andrew Bogott:

[operations/puppet@production] Exclude /mnt from systemd-logind restrictions on Bullseye and later

https://gerrit.wikimedia.org/r/826806

All better! Thanks Moritz.