This has happened on lithium a couple of times, and this morning on wezen. Icinga usually reports it as the rsyslog service in UNKNOWN status with the description "Service timeout". The pattern is the following:
top -H shows:
 PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
6925 root      20   0  706056  70056   4892 R  99.9  0.4   2441:03 in:imtcp
strace -p 6925 shows:
recvfrom(706, 0x7fc477691650, 31, 0, 0, 0) = -1 EAGAIN (Resource temporarily unavailable)
recvfrom(706, 0x7fc477691650, 31, 0, 0, 0) = -1 EAGAIN (Resource temporarily unavailable)
recvfrom(706, 0x7fc477691650, 31, 0, 0, 0) = -1 EAGAIN (Resource temporarily unavailable)
recvfrom(706, 0x7fc477691650, 31, 0, 0, 0) = -1 EAGAIN (Resource temporarily unavailable)
recvfrom(706, 0x7fc477691650, 31, 0, 0, 0) = -1 EAGAIN (Resource temporarily unavailable)
recvfrom(706, 0x7fc477691650, 31, 0, 0, 0) = -1 EAGAIN (Resource temporarily unavailable)
[...]
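To quantify the spin rather than eyeball it, the strace output can be summarised per file descriptor. The following is only a sketch: `count_eagain` is a hypothetical helper name, and it assumes strace lines in the unprefixed form shown above (lines carrying a `[pid N]` prefix, as produced with `-f`, are ignored).

```shell
#!/bin/sh
# Hypothetical helper: read strace output on stdin and count, per file
# descriptor, the recvfrom() calls that failed with EAGAIN.
# Only handles lines that start with "recvfrom(<fd>," as in the trace above.
count_eagain() {
    awk '
        /^recvfrom\([0-9]+,/ && /EAGAIN/ {
            fd = $1                      # first field, e.g. "recvfrom(706,"
            sub(/^recvfrom\(/, "", fd)   # strip the call name and paren
            sub(/,$/, "", fd)            # strip the trailing comma
            count[fd]++
        }
        END {
            for (fd in count) printf "fd %s: %d EAGAIN\n", fd, count[fd]
        }
    '
}

# Possible usage (as root; strace writes its trace to stderr):
#   timeout 5 strace -p 6925 -e trace=recvfrom 2>&1 | count_eagain
```

A very large count on a single descriptor within a few seconds is consistent with the thread polling a non-blocking socket in a tight loop instead of waiting for it to become readable.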
Then from /proc/:
elukey@wezen:~$ sudo file /proc/6925/fd/706
/proc/6925/fd/706: broken symbolic link to socket:[118444005]
And finally:
elukey@wezen:~$ sudo lsof | grep 118444005
rsyslogd 6920 root 706u IPv4 118444005 0t0 TCP
in:imuxso 6920 6923 root 706u IPv4 118444005 0t0 TCP wezen.codfw.wmnet:syslog-tls->tegmen.wikimedia.org:44892 (ESTABLISHED)
in:imklog 6920 6924 root 706u IPv4 118444005 0t0 TCP wezen.codfw.wmnet:syslog-tls->tegmen.wikimedia.org:44892 (ESTABLISHED)
in:imtcp 6920 6925 root 706u IPv4 118444005 0t0 TCP wezen.codfw.wmnet:syslog-tls->tegmen.wikimedia.org:44892 (ESTABLISHED)
in:imudp 6920 6926 root 706u IPv4 118444005 0t0 TCP wezen.codfw.wmnet:syslog-tls->tegmen.wikimedia.org:44892 (ESTABLISHED)
rs:main 6920 6927 root 706u IPv4 118444005 0t0 TCP wezen.codfw.wmnet:syslog-tls->tegmen.wikimedia.org:44892 (ESTABLISHED)
in:imtcp 6920 6928 root 706u IPv4 118444005 0t0 TCP wezen.codfw.wmnet:syslog-tls->tegmen.wikimedia.org:44892 (ESTABLISHED)
in:imtcp 6920 6929 root 706u IPv4 118444005 0t0 TCP wezen.codfw.wmnet:syslog-tls->tegmen.wikimedia.org:44892 (ESTABLISHED)
in:imtcp 6920 6930 root 706u IPv4 118444005 0t0 TCP wezen.codfw.wmnet:syslog-tls->tegmen.wikimedia.org:44892 (ESTABLISHED)
in:imtcp 6920 6931 root 706u IPv4 118444005 0t0 TCP wezen.codfw.wmnet:syslog-tls->tegmen.wikimedia.org:44892 (ESTABLISHED)
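The two manual steps above (reading the fd link under /proc, then grepping lsof for the socket inode) could be scripted roughly as follows. This is a sketch, not a tested runbook: `socket_inode` and `fd_socket_inode` are hypothetical helper names, and the PID and fd values in the usage comment are simply the ones from this occurrence.

```shell
#!/bin/sh
# Hypothetical helper: extract the inode number from a /proc/<pid>/fd/<fd>
# link target such as "socket:[118444005]". Prints nothing and returns 1
# when the target is not a socket.
socket_inode() {
    case "$1" in
        'socket:['*']')
            inode=${1#'socket:['}   # drop the leading "socket:["
            inode=${inode%']'}      # drop the trailing "]"
            printf '%s\n' "$inode"
            ;;
        *)
            return 1
            ;;
    esac
}

# Hypothetical helper: resolve the socket inode behind file descriptor $2
# of process (or thread) $1. Reading /proc/<pid>/fd usually requires root.
fd_socket_inode() {
    target=$(readlink "/proc/$1/fd/$2") || return 1
    socket_inode "$target"
}

# Possible usage, run as root:
#   inode=$(fd_socket_inode 6925 706) && lsof -n -P | grep -w "$inode"
```

Using `grep -w` on the inode avoids matching longer numbers that merely contain it, and `lsof -n -P` skips host and port name lookups, so the output shows raw addresses and port numbers instead of names like `syslog-tls`.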
gdb shows (I couldn't find debug symbols to install):
(gdb) thread apply all bt

Thread 1 (Thread 0x7fc483fff700 (LWP 6925)):
#0  0x00007fc48dd3cb17 in ?? () from /usr/lib/x86_64-linux-gnu/rsyslog/lmnetstrms.so
#1  0x00007fc48db36ff3 in ?? () from /usr/lib/x86_64-linux-gnu/rsyslog/lmtcpsrv.so
#2  0x00007fc48db372c7 in ?? () from /usr/lib/x86_64-linux-gnu/rsyslog/lmtcpsrv.so
#3  0x00005608e3302970 in ?? ()
#4  0x00007fc48ff67064 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#5  0x00007fc48f07862d in clone () from /lib/x86_64-linux-gnu/libc.so.6
Restarting rsyslog usually fixes the problem. This looks very similar to https://github.com/rsyslog/rsyslog/issues/318.
elukey@wezen:~$ dpkg -l | grep rsyslog
ii  rsyslog         8.23.0-2~bpo8+1  amd64  reliable system and kernel logging daemon
ii  rsyslog-gnutls  8.23.0-2~bpo8+1  amd64  TLS protocol support for rsyslog