Page MenuHomePhabricator

Toolforge Stretch - Increased LDAP utilization
Closed, DuplicatePublic

Description

Toolforge Stretch seems to be making more LDAP requests than Toolforge Trusty.

https://phabricator.wikimedia.org/T217280#5008363

Email announcement about Toolforge Trusty deprecation (Feb 7): https://lists.wikimedia.org/pipermail/cloud/2019-February/000533.html

And LDAP restarts:

Event Timeline

GTirloni triaged this task as High priority.Mar 7 2019, 3:10 PM
GTirloni created this task.
Restricted Application added a subscriber: Aklapper. · View Herald TranscriptMar 7 2019, 3:10 PM
GTirloni updated the task description. (Show Details)Mar 7 2019, 3:13 PM
bd808 added a subscriber: hashar.Mar 7 2019, 9:47 PM

In T204681#4603457 @hashar tracked down some great history on getgrent and its lack of interaction with nscd. Looking in the source of son of grid engine:

$ git grep -l getgrent
source/3rdparty/qtcsh/tw.init.c
source/libs/uti2/util.c
source/scripts/nonreentrant.sh
source/utilbin/authuser.c
  • source/3rdparty/qtcsh/tw.init.c
    • tw_grpname_next group iterator used by tw_collect_items in the custom tcsh implementation
  • source/libs/uti2/util.c
    • provides an implementation of getgrouplist based on getgrent for systems which do not supply the former
  • source/scripts/nonreentrant.sh
    • grep patterns for finding non-reentrant function use including getgrent
  • source/utilbin/authuser.c
    • function juti_getgrouplist uses getgrent to build up a list of groups when authenticating a user via the authuser binary

None of these jumps out at me as a smoking gun, but I only spent about 20 minutes looking.

I'm comparing the ldap/pam behavior of test stretch, jessie and trusty VMs. So far I don't see any significant difference.

Of course within toolforge this could be many different things

I'm checking tools-sgeexec-0914 (which according to T217280#5008363 is the worst host) with my hammer, so:

root@tools-sgeexec-0914:~# strace -p 694 -p 726 -p 727 -p 728 -p 729 -p 730 -s 1024 -e getsockopt -t
strace: Process 694 attached
[...]
strace: Process 730 attached
[pid   728] 00:45:04 getsockopt(4, SOL_SOCKET, SO_PEERCRED, {pid=543, uid=0, gid=0}, [12]) = 0
[pid   726] 00:45:11 getsockopt(4, SOL_SOCKET, SO_PEERCRED, {pid=697, uid=600, gid=600}, [12]) = 0
[pid   729] 00:45:11 getsockopt(4, SOL_SOCKET, SO_PEERCRED, {pid=11136, uid=600, gid=600}, [12]) = 0
[pid   728] 00:45:16 getsockopt(4, SOL_SOCKET, SO_PEERCRED, {pid=11136, uid=600, gid=600}, [12]) = 0
[pid   729] 00:45:18 getsockopt(4, SOL_SOCKET, SO_PEERCRED, {pid=543, uid=0, gid=0}, [12]) = 0
[pid   729] 00:45:18 getsockopt(4, SOL_SOCKET, SO_PEERCRED, {pid=543, uid=0, gid=0}, [12]) = 0
[pid   729] 00:45:18 getsockopt(4, SOL_SOCKET, SO_PEERCRED, {pid=543, uid=0, gid=0}, [12]) = 0
[pid   729] 00:45:18 getsockopt(4, SOL_SOCKET, SO_PEERCRED, {pid=543, uid=0, gid=0}, [12]) = 0
[pid   729] 00:45:57 getsockopt(4, SOL_SOCKET, SO_PEERCRED, {pid=543, uid=0, gid=0}, [12]) = 0
[pid   729] 00:46:01 getsockopt(4, SOL_SOCKET, SO_PEERCRED, {pid=543, uid=0, gid=0}, [12]) = 0
[pid   728] 00:46:04 getsockopt(4, SOL_SOCKET, SO_PEERCRED, {pid=543, uid=0, gid=0}, [12]) = 0
[pid   730] 00:46:12 getsockopt(10, SOL_SOCKET, SO_PEERCRED, {pid=543, uid=0, gid=0}, [12]) = 0
[pid   729] 00:46:19 getsockopt(4, SOL_SOCKET, SO_PEERCRED, {pid=543, uid=0, gid=0}, [12]) = 0
[pid   729] 00:47:04 getsockopt(4, SOL_SOCKET, SO_PEERCRED, {pid=543, uid=0, gid=0}, [12]) = 0
^Cstrace: Process 694 detached
[...]
strace: Process 730 detached
root@tools-sgeexec-0914:~# ps u 543
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root       543  0.0  0.0 1536700 4340 ?        Ssl  Feb25   0:49 /usr/sbin/nscd

nscd accounts for the vast majority of requests to nslcd.

nscd does not do getsockopt, and which thread is connecting seems quite random. It's not as simple to figure out who is abusing nscd.

One sample request in nslcd looks like (stracing all threads of nscd):

1[pid 543] 00:52:01 epoll_wait(4, [{EPOLLRDNORM, {u32=14, u64=14}}], 100, 29976) = 1
2[pid 543] 00:52:04 accept4(14, NULL, NULL, SOCK_NONBLOCK) = 16
3[pid 543] 00:52:04 epoll_ctl(4, EPOLL_CTL_ADD, 16, {EPOLLRDNORM, {u32=16, u64=16}}) = 0
4[pid 543] 00:52:04 epoll_wait(4, [{EPOLLRDNORM, {u32=16, u64=16}}], 100, 29964) = 1
5[pid 543] 00:52:04 epoll_ctl(4, EPOLL_CTL_DEL, 16, NULL) = 0
6[pid 543] 00:52:04 futex(0x557b952db844, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x557b952db840, FUTEX_OP_SET<<28|0<<12|FUTEX_OP_CMP_GT<<24|0x1 <unfinished ...>
7[pid 14963] 00:52:04 <... futex resumed> ) = 0
8[pid 543] 00:52:04 <... futex resumed> ) = 1
9[pid 14963] 00:52:04 futex(0x557b952db800, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
10[pid 543] 00:52:04 epoll_wait(4, <unfinished ...>
11[pid 14963] 00:52:04 <... futex resumed> ) = 0
12[pid 14963] 00:52:04 read(16, "\2\0\0\0\v\0\0\0\7\0\0\0", 12) = 12
13[pid 14963] 00:52:04 read(16, "passwd\0", 7) = 7
14[pid 14963] 00:52:04 sendmsg(16, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="passwd\0", iov_len=7}, {iov_base="h\23/\0\0\0\0\0", iov_len=8}], msg_iovlen=2, msg_control=[{cmsg_len=20, cmsg_level=SOL_SOCKET, cmsg_type=SCM_RIGHTS, cmsg_data=[7]}], msg_controllen=20, msg_flags=0}, MSG_NOSIGNAL) = 15
15[pid 14963] 00:52:04 close(16 <unfinished ...>
16[pid 543] 00:52:04 <... epoll_wait resumed> [{EPOLLRDNORM, {u32=14, u64=14}}], 100, 29976) = 1
17[pid 14963] 00:52:04 <... close resumed> ) = 0
18[pid 543] 00:52:04 accept4(14, NULL, NULL, SOCK_NONBLOCK <unfinished ...>
19[pid 14963] 00:52:04 futex(0x557b952db844, FUTEX_WAIT_PRIVATE, 560895, NULL <unfinished ...>
20[pid 543] 00:52:04 <... accept4 resumed> ) = 16
21[pid 543] 00:52:04 epoll_ctl(4, EPOLL_CTL_ADD, 16, {EPOLLRDNORM, {u32=16, u64=16}}) = 0
22[pid 543] 00:52:04 epoll_wait(4, [{EPOLLRDNORM, {u32=16, u64=16}}], 100, 29964) = 1
23[pid 543] 00:52:04 epoll_ctl(4, EPOLL_CTL_DEL, 16, NULL) = 0
24[pid 543] 00:52:04 futex(0x557b952db844, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x557b952db840, FUTEX_OP_SET<<28|0<<12|FUTEX_OP_CMP_GT<<24|0x1 <unfinished ...>
25[pid 22727] 00:52:04 <... futex resumed> ) = 0
26[pid 543] 00:52:04 <... futex resumed> ) = 1
27[pid 22727] 00:52:04 futex(0x557b952db800, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
28[pid 543] 00:52:04 epoll_wait(4, <unfinished ...>
29[pid 22727] 00:52:04 <... futex resumed> ) = 0
30[pid 22727] 00:52:04 read(16, "\2\0\0\0\f\0\0\0\6\0\0\0", 12) = 12
31[pid 22727] 00:52:04 read(16, "group\0", 6) = 6
32[pid 22727] 00:52:04 sendmsg(16, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="group\0", iov_len=6}, {iov_base="h\23/\0\0\0\0\0", iov_len=8}], msg_iovlen=2, msg_control=[{cmsg_len=20, cmsg_level=SOL_SOCKET, cmsg_type=SCM_RIGHTS, cmsg_data=[9]}], msg_controllen=20, msg_flags=0}, MSG_NOSIGNAL) = 14
33[pid 22727] 00:52:04 close(16) = 0
34[pid 22727] 00:52:04 futex(0x557b952db844, FUTEX_WAIT_PRIVATE, 560897, NULL <unfinished ...>
35[pid 543] 00:52:04 <... epoll_wait resumed> [{EPOLLRDNORM, {u32=14, u64=14}}], 100, 29976) = 1
36[pid 543] 00:52:04 accept4(14, NULL, NULL, SOCK_NONBLOCK) = 16
37[pid 543] 00:52:04 epoll_ctl(4, EPOLL_CTL_ADD, 16, {EPOLLRDNORM, {u32=16, u64=16}}) = 0
38[pid 543] 00:52:04 epoll_wait(4, [{EPOLLRDNORM, {u32=16, u64=16}}], 100, 29964) = 1
39[pid 543] 00:52:04 epoll_ctl(4, EPOLL_CTL_DEL, 16, NULL) = 0
40[pid 543] 00:52:04 futex(0x557b952db844, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x557b952db840, FUTEX_OP_SET<<28|0<<12|FUTEX_OP_CMP_GT<<24|0x1) = 1
41[pid 571] 00:52:04 <... futex resumed> ) = 0
42[pid 571] 00:52:04 futex(0x557b952db800, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
43[pid 543] 00:52:04 epoll_wait(4, <unfinished ...>
44[pid 571] 00:52:04 <... futex resumed> ) = 0
45[pid 571] 00:52:04 read(16, "\2\0\0\0\r\0\0\0\6\0\0\0", 12) = 12
46[pid 571] 00:52:04 read(16, "hosts\0", 6) = 6
47[pid 571] 00:52:04 sendmsg(16, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="hosts\0", iov_len=6}, {iov_base="\330k\17\0\0\0\0\0", iov_len=8}], msg_iovlen=2, msg_control=[{cmsg_len=20, cmsg_level=SOL_SOCKET, cmsg_type=SCM_RIGHTS, cmsg_data=[11]}], msg_controllen=20, msg_flags=0}, MSG_NOSIGNAL) = 14
48[pid 571] 00:52:04 close(16) = 0
49[pid 571] 00:52:04 futex(0x557b952db844, FUTEX_WAIT_PRIVATE, 560899, NULL <unfinished ...>
50[pid 543] 00:52:04 <... epoll_wait resumed> [{EPOLLRDNORM, {u32=14, u64=14}}], 100, 29976) = 1
51[pid 543] 00:52:04 accept4(14, NULL, NULL, SOCK_NONBLOCK) = 16
52[pid 543] 00:52:04 epoll_ctl(4, EPOLL_CTL_ADD, 16, {EPOLLRDNORM, {u32=16, u64=16}}) = 0
53[pid 543] 00:52:04 epoll_wait(4, [{EPOLLRDNORM, {u32=16, u64=16}}], 100, 29964) = 1
54[pid 543] 00:52:04 epoll_ctl(4, EPOLL_CTL_DEL, 16, NULL) = 0
55[pid 543] 00:52:04 futex(0x557b952db844, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x557b952db840, FUTEX_OP_SET<<28|0<<12|FUTEX_OP_CMP_GT<<24|0x1 <unfinished ...>
56[pid 570] 00:52:04 <... futex resumed> ) = 0
57[pid 543] 00:52:04 <... futex resumed> ) = 1
58[pid 570] 00:52:04 futex(0x557b952db800, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
59[pid 543] 00:52:04 epoll_wait(4, <unfinished ...>
60[pid 570] 00:52:04 <... futex resumed> ) = 0
61[pid 570] 00:52:04 read(16, "\2\0\0\0\17\0\0\0\7\0\0\0", 12) = 12
62[pid 570] 00:52:04 read(16, "puppet\0", 7) = 7
63[pid 570] 00:52:04 open("/proc/sys/kernel/ngroups_max", O_RDONLY) = 17
64[pid 570] 00:52:04 read(17, "65536\n", 31) = 6
65[pid 570] 00:52:04 close(17) = 0
66[pid 570] 00:52:04 open("/etc/group", O_RDONLY|O_CLOEXEC) = 17
67[pid 570] 00:52:04 lseek(17, 0, SEEK_CUR) = 0
68[pid 570] 00:52:04 fstat(17, {st_mode=S_IFREG|0644, st_size=836, ...}) = 0
69[pid 570] 00:52:04 read(17, "root:x:0:\ndaemon:x:1:\nbin:x:2:\nsys:x:3:\nadm:x:4:\ntty:x:5:\ndisk:x:6:\nlp:x:7:\nmail:x:8:\nnews:x:9:\nuucp:x:10:\nman:x:12:\nproxy:x:13:\nkmem:x:15:\ndialout:x:20:\nfax:x:21:\nvoice:x:22:\ncdrom:x:24:\nfloppy:x:25:\ntape:x:26:\nsudo:x:27:\naudio:x:29:\ndip:x:30:\nwww-data:x:33:\nbackup:x:34:\noperator:x:37:\nlist:x:38:\nirc:x:39:\nsrc:x:40:\ngnats:x:41:\nshadow:x:42:\nutmp:x:43:\nvideo:x:44:\nsasl:x:45:\nplugdev:x:46:\nstaff:x:50:\ngames:x:60:\nusers:x:100:\nnogroup:x:65534:\nsystemd-journal:x:101:\nsystemd-timesync:x:102:\nsystemd-network:x:103:\nsystemd-resolve:x:104:\nsystemd-bus-proxy:x:105:\ninput:x:106:\ncrontab:x:107:\nnetdev:x:108:\nnagios:x:109:\nDebian-exim:x:110:\n_lldpd:x:111:\nprometheus:x:112:\nssl-cert:x:114:\nmessagebus:x:115:\nnslcd:x:116:\nssh:x:117:\ndiamond:x:118:\npuppet:x:119:\nprometheus-node-exporter:x:53939:\nssh-key-ldap-lookup:x:499:\nclushuser:x:498:\n", 4096) = 836
70[pid 570] 00:52:04 lseek(17, 0, SEEK_CUR) = 836
71[pid 570] 00:52:04 lseek(17, 0, SEEK_CUR) = 836
72[pid 570] 00:52:04 lseek(17, 0, SEEK_CUR) = 836
73[pid 570] 00:52:04 lseek(17, 0, SEEK_CUR) = 836
74[pid 570] 00:52:04 lseek(17, 0, SEEK_CUR) = 836
75[pid 570] 00:52:04 lseek(17, 0, SEEK_CUR) = 836
76[pid 570] 00:52:04 lseek(17, 0, SEEK_CUR) = 836
77[pid 570] 00:52:04 lseek(17, 0, SEEK_CUR) = 836
78[pid 570] 00:52:04 lseek(17, 0, SEEK_CUR) = 836
79[pid 570] 00:52:04 lseek(17, 0, SEEK_CUR) = 836
80[pid 570] 00:52:04 lseek(17, 0, SEEK_CUR) = 836
81[pid 570] 00:52:04 lseek(17, 0, SEEK_CUR) = 836
82[pid 570] 00:52:04 lseek(17, 0, SEEK_CUR) = 836
83[pid 570] 00:52:04 lseek(17, 0, SEEK_CUR) = 836
84[pid 570] 00:52:04 lseek(17, 0, SEEK_CUR) = 836
85[pid 570] 00:52:04 lseek(17, 0, SEEK_CUR) = 836
86[pid 570] 00:52:04 lseek(17, 0, SEEK_CUR) = 836
87[pid 570] 00:52:04 lseek(17, 0, SEEK_CUR) = 836
88[pid 570] 00:52:04 lseek(17, 0, SEEK_CUR) = 836
89[pid 570] 00:52:04 lseek(17, 0, SEEK_CUR) = 836
90[pid 570] 00:52:04 lseek(17, 0, SEEK_CUR) = 836
91[pid 570] 00:52:04 lseek(17, 0, SEEK_CUR) = 836
92[pid 570] 00:52:04 lseek(17, 0, SEEK_CUR) = 836
93[pid 570] 00:52:04 lseek(17, 0, SEEK_CUR) = 836
94[pid 570] 00:52:04 lseek(17, 0, SEEK_CUR) = 836
95[pid 570] 00:52:04 lseek(17, 0, SEEK_CUR) = 836
96[pid 570] 00:52:04 lseek(17, 0, SEEK_CUR) = 836
97[pid 570] 00:52:04 lseek(17, 0, SEEK_CUR) = 836
98[pid 570] 00:52:04 lseek(17, 0, SEEK_CUR) = 836
99[pid 570] 00:52:04 lseek(17, 0, SEEK_CUR) = 836
100[pid 570] 00:52:04 lseek(17, 0, SEEK_CUR) = 836
101[pid 570] 00:52:04 lseek(17, 0, SEEK_CUR) = 836
102[pid 570] 00:52:04 lseek(17, 0, SEEK_CUR) = 836
103[pid 570] 00:52:04 lseek(17, 0, SEEK_CUR) = 836
104[pid 570] 00:52:04 lseek(17, 0, SEEK_CUR) = 836
105[pid 570] 00:52:04 lseek(17, 0, SEEK_CUR) = 836
106[pid 570] 00:52:04 lseek(17, 0, SEEK_CUR) = 836
107[pid 570] 00:52:04 lseek(17, 0, SEEK_CUR) = 836
108[pid 570] 00:52:04 lseek(17, 0, SEEK_CUR) = 836
109[pid 570] 00:52:04 lseek(17, 0, SEEK_CUR) = 836
110[pid 570] 00:52:04 lseek(17, 0, SEEK_CUR) = 836
111[pid 570] 00:52:04 lseek(17, 0, SEEK_CUR) = 836
112[pid 570] 00:52:04 lseek(17, 0, SEEK_CUR) = 836
113[pid 570] 00:52:04 lseek(17, 0, SEEK_CUR) = 836
114[pid 570] 00:52:04 lseek(17, 0, SEEK_CUR) = 836
115[pid 570] 00:52:04 lseek(17, 0, SEEK_CUR) = 836
116[pid 570] 00:52:04 lseek(17, 0, SEEK_CUR) = 836
117[pid 570] 00:52:04 lseek(17, 0, SEEK_CUR) = 836
118[pid 570] 00:52:04 lseek(17, 0, SEEK_CUR) = 836
119[pid 570] 00:52:04 lseek(17, 0, SEEK_CUR) = 836
120[pid 570] 00:52:04 lseek(17, 0, SEEK_CUR) = 836
121[pid 570] 00:52:04 lseek(17, 0, SEEK_CUR) = 836
122[pid 570] 00:52:04 lseek(17, 0, SEEK_CUR) = 836
123[pid 570] 00:52:04 lseek(17, 0, SEEK_CUR) = 836
124[pid 570] 00:52:04 lseek(17, 0, SEEK_CUR) = 836
125[pid 570] 00:52:04 lseek(17, 0, SEEK_CUR) = 836
126[pid 570] 00:52:04 lseek(17, 0, SEEK_CUR) = 836
127[pid 570] 00:52:04 lseek(17, 0, SEEK_CUR) = 836
128[pid 570] 00:52:04 lseek(17, 0, SEEK_CUR) = 836
129[pid 570] 00:52:04 lseek(17, 0, SEEK_CUR) = 836
130[pid 570] 00:52:04 read(17, "", 4096) = 0
131[pid 570] 00:52:04 close(17) = 0
132[pid 570] 00:52:04 socket(AF_UNIX, SOCK_STREAM, 0) = 17
133[pid 570] 00:52:04 fcntl(17, F_GETFL) = 0x2 (flags O_RDWR)
134[pid 570] 00:52:04 fcntl(17, F_SETFD, FD_CLOEXEC|0x2) = 0
135[pid 570] 00:52:04 connect(17, {sa_family=AF_UNIX, sun_path="/var/run/nslcd/socket"}, 23) = 0
136[pid 570] 00:52:04 poll([{fd=17, events=POLLOUT}], 1, 10000) = 1 ([{fd=17, revents=POLLOUT}])
137[pid 570] 00:52:04 sendto(17, "\0\0\0\2\0\4\0\6\0\0\0\6puppet", 18, MSG_NOSIGNAL, NULL, 0) = 18
138[pid 570] 00:52:04 poll([{fd=17, events=POLLIN}], 1, 60000) = 1 ([{fd=17, revents=POLLIN|POLLHUP}])
139[pid 570] 00:52:04 read(17, "\0\0\0\2\0\4\0\6\0\0\0\2", 1024) = 12
140[pid 570] 00:52:04 close(17) = 0
141[pid 570] 00:52:04 sendto(16, "\2\0\0\0\0\0\0\0\0\0\0\0", 12, MSG_NOSIGNAL, NULL, 0) = 12
142[pid 570] 00:52:04 msync(0x7f3c9bc95000, 359, MS_ASYNC) = 0
143[pid 570] 00:52:04 msync(0x7f3c9bc92000, 8196, MS_ASYNC) = 0
144[pid 570] 00:52:04 close(16) = 0
145[pid 570] 00:52:04 futex(0x557b952db844, FUTEX_WAIT_PRIVATE, 560901, NULL <unfinished ...>
146[pid 543] 00:52:04 <... epoll_wait resumed> [{EPOLLRDNORM, {u32=14, u64=14}}], 100, 29976) = 1
147[pid 543] 00:52:04 accept4(14, NULL, NULL, SOCK_NONBLOCK) = 16
148[pid 543] 00:52:04 epoll_ctl(4, EPOLL_CTL_ADD, 16, {EPOLLRDNORM, {u32=16, u64=16}}) = 0
149[pid 543] 00:52:04 epoll_wait(4, [{EPOLLRDNORM, {u32=16, u64=16}}], 100, 29964) = 1
150[pid 543] 00:52:04 epoll_ctl(4, EPOLL_CTL_DEL, 16, NULL) = 0
151[pid 543] 00:52:04 futex(0x557b952db844, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x557b952db840, FUTEX_OP_SET<<28|0<<12|FUTEX_OP_CMP_GT<<24|0x1) = 1
152[pid 569] 00:52:04 <... futex resumed> ) = 0
153[pid 543] 00:52:04 epoll_wait(4, <unfinished ...>
154[pid 569] 00:52:04 futex(0x557b952db800, FUTEX_WAKE_PRIVATE, 1) = 0
155[pid 569] 00:52:04 read(16, "\2\0\0\0\v\0\0\0\7\0\0\0", 12) = 12
156[pid 569] 00:52:04 read(16, "passwd\0", 7) = 7
157[pid 569] 00:52:04 sendmsg(16, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="passwd\0", iov_len=7}, {iov_base="h\23/\0\0\0\0\0", iov_len=8}], msg_iovlen=2, msg_control=[{cmsg_len=20, cmsg_level=SOL_SOCKET, cmsg_type=SCM_RIGHTS, cmsg_data=[7]}], msg_controllen=20, msg_flags=0}, MSG_NOSIGNAL) = 15
158[pid 543] 00:52:04 <... epoll_wait resumed> [{EPOLLRDNORM, {u32=14, u64=14}}], 100, 29976) = 1
159[pid 569] 00:52:04 close(16 <unfinished ...>
160[pid 543] 00:52:04 accept4(14, NULL, NULL, SOCK_NONBLOCK <unfinished ...>
161[pid 569] 00:52:04 <... close resumed> ) = 0
162[pid 543] 00:52:04 <... accept4 resumed> ) = 16
163[pid 569] 00:52:04 futex(0x557b952db844, FUTEX_WAIT_PRIVATE, 560903, NULL <unfinished ...>
164[pid 543] 00:52:04 epoll_ctl(4, EPOLL_CTL_ADD, 16, {EPOLLRDNORM, {u32=16, u64=16}}) = 0
165[pid 543] 00:52:04 epoll_wait(4, [{EPOLLRDNORM, {u32=16, u64=16}}], 100, 29964) = 1
166[pid 543] 00:52:04 epoll_ctl(4, EPOLL_CTL_DEL, 16, NULL) = 0
167[pid 543] 00:52:04 futex(0x557b952db844, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x557b952db840, FUTEX_OP_SET<<28|0<<12|FUTEX_OP_CMP_GT<<24|0x1 <unfinished ...>
168[pid 568] 00:52:04 <... futex resumed> ) = 0
169[pid 543] 00:52:04 <... futex resumed> ) = 1
170[pid 568] 00:52:04 futex(0x557b952db800, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
171[pid 543] 00:52:04 epoll_wait(4, <unfinished ...>
172[pid 568] 00:52:04 <... futex resumed> ) = 0
173[pid 568] 00:52:04 read(16, "\2\0\0\0\f\0\0\0\6\0\0\0", 12) = 12
174[pid 568] 00:52:04 read(16, "group\0", 6) = 6
175[pid 568] 00:52:04 sendmsg(16, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="group\0", iov_len=6}, {iov_base="h\23/\0\0\0\0\0", iov_len=8}], msg_iovlen=2, msg_control=[{cmsg_len=20, cmsg_level=SOL_SOCKET, cmsg_type=SCM_RIGHTS, cmsg_data=[9]}], msg_controllen=20, msg_flags=0}, MSG_NOSIGNAL) = 14
176[pid 568] 00:52:04 close(16) = 0
177[pid 568] 00:52:04 futex(0x557b952db844, FUTEX_WAIT_PRIVATE, 560905, NULL <unfinished ...>
178[pid 543] 00:52:04 <... epoll_wait resumed> [{EPOLLRDNORM, {u32=14, u64=14}}], 100, 29976) = 1
179[pid 543] 00:52:04 accept4(14, NULL, NULL, SOCK_NONBLOCK) = 16
180[pid 543] 00:52:04 epoll_ctl(4, EPOLL_CTL_ADD, 16, {EPOLLRDNORM, {u32=16, u64=16}}) = 0
181[pid 543] 00:52:04 epoll_wait(4, [{EPOLLRDNORM, {u32=16, u64=16}}], 100, 29964) = 1
182[pid 543] 00:52:04 epoll_ctl(4, EPOLL_CTL_DEL, 16, NULL) = 0
183[pid 543] 00:52:04 futex(0x557b952db844, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x557b952db840, FUTEX_OP_SET<<28|0<<12|FUTEX_OP_CMP_GT<<24|0x1 <unfinished ...>
184[pid 564] 00:52:04 <... futex resumed> ) = 0
185[pid 543] 00:52:04 <... futex resumed> ) = 1
186[pid 564] 00:52:04 futex(0x557b952db800, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
187[pid 543] 00:52:04 epoll_wait(4, <unfinished ...>
188[pid 564] 00:52:04 <... futex resumed> ) = 0
189[pid 564] 00:52:04 read(16, "\2\0\0\0\r\0\0\0\6\0\0\0", 12) = 12
190[pid 564] 00:52:04 read(16, "hosts\0", 6) = 6
191[pid 564] 00:52:04 sendmsg(16, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="hosts\0", iov_len=6}, {iov_base="\330k\17\0\0\0\0\0", iov_len=8}], msg_iovlen=2, msg_control=[{cmsg_len=20, cmsg_level=SOL_SOCKET, cmsg_type=SCM_RIGHTS, cmsg_data=[11]}], msg_controllen=20, msg_flags=0}, MSG_NOSIGNAL) = 14
192[pid 564] 00:52:04 close(16) = 0
193[pid 564] 00:52:04 futex(0x557b952db844, FUTEX_WAIT_PRIVATE, 560907, NULL <unfinished ...>
194[pid 561] 00:52:04 <... futex resumed> ) = -1 ETIMEDOUT (Connection timed out)
195[pid 561] 00:52:04 futex(0x557b952db248, FUTEX_WAKE_PRIVATE, 1) = 0
196[pid 561] 00:52:04 futex(0x557b952db21c, FUTEX_WAIT_BITSET_PRIVATE, 92299, {tv_sec=869296, tv_nsec=513510400}, 0xffffffff <unfinished ...>
197[pid 543] 00:52:07 <... epoll_wait resumed> [{EPOLLRDNORM, {u32=14, u64=14}}], 100, 29976) = 1

These group requests to nscd accounts for a minority of requests to nscd.

root@tools-sgeexec-0914:~# strace -s 1024 -p 543 -p 560 -p 561 -p 562 -p 563 -p 564 -p 568 -p 569 -p 570 -p 571 -p 22727 -p 14963 -t 2>&1 | grep nslcd | while read t; do TID=$(echo "$t" | awk '{print $2}' | tr -d ']'); kill -STOP $TID; lsof +E -aUp 543; kill -CONT $TID; break; done
COMMAND     PID USER   FD   TYPE             DEVICE SIZE/OFF     NODE NAME
systemd       1 root   49u  unix 0xffff88c271f91000      0t0     1649 /run/systemd/journal/dev-log type=DGRAM ->INO=14432 543,nscd,5u
systemd-j   228 root    6u  unix 0xffff88c271f91000      0t0     1649 /run/systemd/journal/dev-log type=DGRAM ->INO=14432 543,nscd,5u
nscd        543 root    5u  unix 0xffff88c275681400      0t0    14432 type=DGRAM ->INO=1649 228,systemd-j,6u 1,systemd,49u
nscd        543 root   14u  unix 0xffff88c2735eb800      0t0    14453 /var/run/nscd/socket type=STREAM
nscd        543 root   16u  unix 0xffff88c244448800      0t0 14111659 /var/run/nscd/socket type=STREAM ->INO=14122005 17658,sudo,10u
nscd        543 root   17u  unix 0xffff88c1a902bc00      0t0 14120443 type=STREAM
sudo      17658 root   10u  unix 0xffff88c224571c00      0t0 14122005 type=STREAM ->INO=14111659 543,nscd,16u
^C
root@tools-sgeexec-0914:~# strace -s 1024 -p 543 -p 560 -p 561 -p 562 -p 563 -p 564 -p 568 -p 569 -p 570 -p 571 -p 22727 -p 14963 -t 2>&1 | grep nslcd | while read t; do TID=$(echo "$t" | awk '{print $2}' | tr -d ']'); kill -STOP $TID; lsof +E -aUp 543; kill -CONT $TID; break; done
COMMAND   PID USER   FD   TYPE             DEVICE SIZE/OFF     NODE NAME
systemd     1 root   49u  unix 0xffff88c271f91000      0t0     1649 /run/systemd/journal/dev-log type=DGRAM ->INO=14432 543,nscd,5u
systemd-j 228 root    6u  unix 0xffff88c271f91000      0t0     1649 /run/systemd/journal/dev-log type=DGRAM ->INO=14432 543,nscd,5u
nscd      543 root    5u  unix 0xffff88c275681400      0t0    14432 type=DGRAM ->INO=1649 228,systemd-j,6u 1,systemd,49u
nscd      543 root   14u  unix 0xffff88c2735eb800      0t0    14453 /var/run/nscd/socket type=STREAM
nscd      543 root   16u  unix 0xffff88c27433ac00      0t0 14128549 type=STREAM
^C

Both times point to journald. Now stracing journald to see if that is indeed asking for groups all the time.

Now stracing journald to see if that is indeed asking for groups all the time.

Doesn't seem like it.

One can also enable logging for nscd/nslcd:

  1. touch /var/log/nscd.log /var/log/nslcd.log
  2. chown nslcd:nslcd /var/log/nslcd.log

Then:

/etc/nscd.conf
logfile /var/log/nscd.log
debug-level 4
/etc/nslcd.conf
# Default is: syslog info
log /var/log/nslcd.log debug

And restart the daemons ;)

Andrew added a comment.EditedMar 8 2019, 8:38 PM

I've been watching ldap traffic (with tcpdump) while starting and stopping jobs on particular grid nodes. The Stretch grid produces about 50-60 times as many ldap requests when a job starts as the old grid.

I'm testing this by scheduling a 5-second-sleep job on a particular node with e.g.

qsub -q task@tools-exec-1430.tools.eqiad.wmflabs -e ./error.log -o ./output.log -l h_vmem=500M ./trivial.sh

Andrew added a comment.Mar 8 2019, 9:45 PM

name=/etc/nslcd.conf

Default is: syslog info

log /var/log/nslcd.log debug

That turns out to only be supported in 0.9.x which is on Stretch but not on Trusty. On older systems the only option I can find is to run nslcd -d on the commandline

It seems moderately possible that we need to install the 'deref' overlay on openldap to catch up with the new expectations of nslcd. That's only barely supported, but one brave soul seems to have made it work: https://www.openldap.org/lists/openldap-technical/201401/msg00025.html

Here's the kind of thing nslcd is doing on stretch:

https://people.wikimedia.org/~andrew/nslcdgroups.log

bd808 added a comment.Mar 9 2019, 2:07 AM

It seems moderately possible that we need to install the 'deref' overlay on openldap to catch up with the new expectations of nslcd. That's only barely supported, but one brave soul seems to have made it work: https://www.openldap.org/lists/openldap-technical/201401/msg00025.html

That brave soul is the maintainer of nss-pam-ldapd which is the most likely cause of any change we are seeing in LDAP requests between Trusty and Stretch. :)

Andrew added a comment.EditedMar 9 2019, 9:44 PM
root@serpens:/etc/ldap# grep deref slapd.conf
moduleload  deref
overlay deref

root@serpens:/etc/ldap# slapd -V
@(#) $OpenLDAP: slapd  (May 23 2018 04:25:19) $
	Debian OpenLDAP Maintainers <pkg-openldap-devel@lists.alioth.debian.org>

Nice version number there :/

It seems the deref overlay is present already. There might be more needed to enable it but I can't really see what.

As a side note, for the near future I think we should consider using sssd rather than nslcd/nscd. https://en.wikipedia.org/wiki/System_Security_Services_Daemon

As a side note, for the near future I think we should consider using sssd rather than nslcd/nscd. https://en.wikipedia.org/wiki/System_Security_Services_Daemon

+1
Hopefully that doesn't turn out to be the fix for this, though. 😁

GTirloni removed a subscriber: GTirloni.Mar 21 2019, 9:06 PM
bd808 added a comment.Aug 8 2019, 4:06 AM

As a side note, for the near future I think we should consider using sssd rather than nslcd/nscd. https://en.wikipedia.org/wiki/System_Security_Services_Daemon

+1
Hopefully that doesn't turn out to be the fix for this, though. 😁

Turns out that this was a big part of the fix.