On March 8 at 00:30:32 UTC, the pdfrender service on scb1003 received a SIGTERM (from systemd) and shut down gracefully. This was part of a normal deploy and is documented in SAL at https://tools.wmflabs.org/sal/production?d=2017-03-08.
However, after being started normally again, the process is not listening on its TCP port, and icinga has alerted on it:
connect to address 10.64.32.153 and port 5252: Connection refused
The processes seem to be running fine according to systemctl status pdfrender
* pdfrender.service - "pdfrender service"
   Loaded: loaded (/lib/systemd/system/pdfrender.service; enabled)
   Active: active (running) since Wed 2017-03-08 00:30:33 UTC; 8h ago
 Main PID: 30726 (firejail)
   CGroup: /system.slice/pdfrender.service
           |-30726 /usr/bin/firejail --profile=/etc/firejail/pdfrender.profile /usr/bin/nodejs /srv/deployment/electron-render/deploy/src...
           |-30728 /usr/bin/python /usr/bin/xpra start :427 --no-daemon
           |-30731 Xorg-for-Xpra-:427 -dpi 96 -noreset -nolisten tcp +extension GLX +extension RANDR +extension RENDER -logfile /home/pdf...
           |-30805 /usr/bin/firejail --profile=/etc/firejail/pdfrender.profile /usr/bin/nodejs /srv/deployment/electron-render/deploy/src...
           |-30806 /usr/bin/firejail --profile=/etc/firejail/pdfrender.profile /usr/bin/nodejs /srv/deployment/electron-render/deploy/src...
           |-30813 /usr/bin/nodejs /srv/deployment/electron-render/deploy/src/bin/electron-render-service.js
           |-30819 /srv/deployment/electron-render/deploy-cache/revs/5ec56146bd70cd14c62a42cfd5a9d1ae4e58c14d/node_modules/electron-prebu...
           |-30852 /srv/deployment/electron-render/deploy-cache/revs/5ec56146bd70cd14c62a42cfd5a9d1ae4e58c14d/node_modules/electron-prebu...
           `-30882 /srv/deployment/electron-render/deploy-cache/revs/5ec56146bd70cd14c62a42cfd5a9d1ae4e58c14d/node_modules/electron-prebu...
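The mismatch between systemd's view (active) and the socket state (connection refused) can be checked by hand by probing the port directly. The following is a minimal sketch and was not part of the original investigation: `port_open` and `check_pdfrender` are hypothetical helper names, the default address and port are the ones from the icinga alert, and it assumes bash (for its `/dev/tcp` pseudo-device) and coreutils `timeout` are available on the machine running the check.

```shell
# Hypothetical helper: succeeds (exit 0) only if a TCP connection
# to host:port can be established within 3 seconds.
port_open() {
    local host=$1 port=$2
    # /dev/tcp is a bash feature, so bash is invoked explicitly;
    # timeout guards against ports that silently drop packets.
    timeout 3 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null
}

# Hypothetical wrapper: report the socket state in one line.
# Defaults are the address and port from the icinga alert.
check_pdfrender() {
    local host=${1:-10.64.32.153} port=${2:-5252}
    if port_open "$host" "$port"; then
        echo "port ${port} on ${host}: open"
    else
        echo "port ${port} on ${host}: refused or unreachable"
        return 1
    fi
}
```

Running `check_pdfrender` next to `systemctl status pdfrender` makes it easy to spot a unit that is active but not serving.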
journal has some logs (maybe helpful, maybe not). They are in reverse chronological order btw (journalctl -ru output)
Mar 08 00:30:35 scb1003 pdfrender[30726]: AssertionError: display is not set!
Mar 08 00:30:35 scb1003 pdfrender[30726]: File "xpra/x11/bindings/core_bindings.pyx", line 57, in xpra.x11.bindings.core_bindings.X11CoreBin
Mar 08 00:30:35 scb1003 pdfrender[30726]: X11Window = X11WindowBindings()
Mar 08 00:30:35 scb1003 pdfrender[30726]: File "/usr/lib/python2.7/dist-packages/xpra/x11/gtk_x11/prop.py", line 25, in <module>
Mar 08 00:30:35 scb1003 pdfrender[30726]: from xpra.x11.gtk_x11.prop import prop_get, prop_set
Mar 08 00:30:35 scb1003 pdfrender[30726]: File "/usr/lib/python2.7/dist-packages/xpra/client/gtk_base/gtk_client_window_base.py", line 33, i
Mar 08 00:30:35 scb1003 pdfrender[30726]: from xpra.client.gtk_base.gtk_client_window_base import GTKClientWindowBase, HAS_X11_BINDINGS
Mar 08 00:30:35 scb1003 pdfrender[30726]: File "/usr/lib/python2.7/dist-packages/xpra/client/gtk2/gtk2_window_base.py", line 15, in <module>
Mar 08 00:30:35 scb1003 pdfrender[30726]: from xpra.client.gtk2.gtk2_window_base import GTK2WindowBase
Mar 08 00:30:35 scb1003 pdfrender[30726]: File "/usr/lib/python2.7/dist-packages/xpra/client/gtk2/client_window.py", line 9, in <module>
Mar 08 00:30:35 scb1003 pdfrender[30726]: from xpra.client.gtk2.client_window import ClientWindow
Mar 08 00:30:35 scb1003 pdfrender[30726]: File "/usr/lib/python2.7/dist-packages/xpra/client/gtk2/border_client_window.py", line 10, in <mod
Mar 08 00:30:35 scb1003 pdfrender[30726]: from xpra.client.gtk2.border_client_window import BorderClientWindow
Mar 08 00:30:35 scb1003 pdfrender[30726]: File "/usr/lib/python2.7/dist-packages/xpra/client/gtk2/client.py", line 37, in <module>
Mar 08 00:30:35 scb1003 pdfrender[30726]: toolkit_module = __import__(client_module, globals(), locals(), ['XpraClient'])
Mar 08 00:30:35 scb1003 pdfrender[30726]: File "/usr/lib/python2.7/dist-packages/xpra/scripts/main.py", line 1174, in make_client
Mar 08 00:30:35 scb1003 pdfrender[30726]: app = make_client(error_cb, opts)
Mar 08 00:30:35 scb1003 pdfrender[30726]: File "/usr/lib/python2.7/dist-packages/xpra/scripts/main.py", line 1111, in run_client
Mar 08 00:30:35 scb1003 pdfrender[30726]: return run_client(error_cb, options, args, mode)
Mar 08 00:30:35 scb1003 pdfrender[30726]: File "/usr/lib/python2.7/dist-packages/xpra/scripts/main.py", line 761, in run_mode
Mar 08 00:30:35 scb1003 pdfrender[30726]: return run_mode(script_file, err, options, args, mode, defaults)
Mar 08 00:30:35 scb1003 pdfrender[30726]: File "/usr/lib/python2.7/dist-packages/xpra/scripts/main.py", line 103, in main
Mar 08 00:30:35 scb1003 pdfrender[30726]: Traceback (most recent call last):
Mar 08 00:30:35 scb1003 pdfrender[30726]: xpra main error:
Mar 08 00:30:35 scb1003 pdfrender[30726]: 2017-03-08 00:30:35,085 failed load posix keyboard bindings: display is not set!
Mar 08 00:30:34 scb1003 pdfrender[30726]: from xpra.x11.gtk_x11 import gdk_display_source
Mar 08 00:30:34 scb1003 pdfrender[30726]: /usr/lib/python2.7/dist-packages/xpra/client/gtk2/__init__.py:7: GtkWarning: IA__gdk_screen_get_ro
Mar 08 00:30:34 scb1003 pdfrender[30726]: warnings.warn(str(e), _gtk.Warning)
Mar 08 00:30:34 scb1003 pdfrender[30726]: [199B blob data]
Mar 08 00:30:34 scb1003 pdfrender[30726]: Warning: a protocol list is present, the new list "unix,inet,inet6" will not be installed
Mar 08 00:30:34 scb1003 pdfrender[30726]: Reading profile /etc/firejail/disable-passwdmgr.inc
Mar 08 00:30:34 scb1003 pdfrender[30726]: Reading profile /etc/firejail/disable-programs.inc
Mar 08 00:30:34 scb1003 pdfrender[30726]: Reading profile /etc/firejail/disable-common.inc
Mar 08 00:30:34 scb1003 pdfrender[30726]: Reading profile /etc/firejail/default.profile
Mar 08 00:30:34 scb1003 pdfrender[30726]: Reading profile /etc/firejail/pdfrender.profile
Mar 08 00:30:34 scb1003 pdfrender[30726]: 2017-03-08 00:30:34,264 xpra is ready.
Mar 08 00:30:34 scb1003 pdfrender[30726]: 2017-03-08 00:30:34,251 running with pid 30728
Mar 08 00:30:34 scb1003 pdfrender[30726]: 2017-03-08 00:30:34,251 xpra server version 0.14.10 (r7983)
Mar 08 00:30:34 scb1003 pdfrender[30726]: 2017-03-08 00:30:34,246 cannot load dbus helper: No module named dbus
Mar 08 00:30:34 scb1003 pdfrender[30726]: 2017-03-08 00:30:34,168 server uuid is 8102737322954f3d80ea573401b3777e
Mar 08 00:30:33 scb1003 pdfrender[30726]: (==) Using system config directory "/usr/share/X11/xorg.conf.d"
Mar 08 00:30:33 scb1003 pdfrender[30726]: (++) Using config file: "/etc/xpra/xorg.conf"
Mar 08 00:30:33 scb1003 pdfrender[30726]: (++) Log file: "/home/pdfrender/.xpra/Xorg.:427.log", Time: Wed Mar 8 00:30:33 2017
Mar 08 00:30:33 scb1003 pdfrender[30726]: (WW) warning, (EE) error, (NI) not implemented, (??) unknown.
Mar 08 00:30:33 scb1003 pdfrender[30726]: (++) from command line, (!!) notice, (II) informational,
Mar 08 00:30:33 scb1003 pdfrender[30726]: Markers: (--) probed, (**) from config file, (==) default setting,
Mar 08 00:30:33 scb1003 pdfrender[30726]: to make sure that you have the latest version.
Mar 08 00:30:33 scb1003 pdfrender[30726]: Before reporting problems, check http://wiki.x.org
Mar 08 00:30:33 scb1003 pdfrender[30726]: Current version of pixman: 0.32.6
Mar 08 00:30:33 scb1003 pdfrender[30726]: xorg-server 2:1.16.4-1 (http://www.debian.org/support)
Mar 08 00:30:33 scb1003 pdfrender[30726]: Build Date: 11 February 2015 12:32:02AM
Mar 08 00:30:33 scb1003 pdfrender[30726]: Kernel command line: BOOT_IMAGE=/boot/vmlinuz-4.4.0-3-amd64 root=UUID=0a595999-dbe0-4882-b1f7-2040
Mar 08 00:30:33 scb1003 pdfrender[30726]: Current Operating System: Linux scb1003 4.4.0-3-amd64 #1 SMP Debian 4.4.2-3+wmf7 (2016-11-04) x86_
Mar 08 00:30:33 scb1003 pdfrender[30726]: Build Operating System: Linux 3.16.0-4-amd64 x86_64 Debian
Mar 08 00:30:33 scb1003 pdfrender[30726]: X Protocol Version 11, Revision 0
Mar 08 00:30:33 scb1003 pdfrender[30726]: Release Date: 2014-12-20
Mar 08 00:30:33 scb1003 pdfrender[30726]: X.Org X Server 1.16.4
Mar 08 00:30:33 scb1003 pdfrender[30726]: Warning: a protocol list is present, the new list "unix,inet,inet6" will not be installed
Mar 08 00:30:33 scb1003 pdfrender[30726]: Reading profile /etc/firejail/disable-passwdmgr.inc
Mar 08 00:30:33 scb1003 pdfrender[30726]: Reading profile /etc/firejail/disable-programs.inc
Mar 08 00:30:33 scb1003 pdfrender[30726]: Reading profile /etc/firejail/disable-common.inc
Mar 08 00:30:33 scb1003 pdfrender[30726]: Reading profile /etc/firejail/default.profile
Mar 08 00:30:33 scb1003 pdfrender[30726]: Reading profile /etc/firejail/pdfrender.profile
Mar 08 00:30:33 scb1003 systemd[1]: Started "pdfrender service".
Mar 08 00:30:33 scb1003 systemd[1]: Starting "pdfrender service"...
Mar 08 00:30:32 scb1003 pdfrender[26738]: (EE) Server terminated successfully (0). Closing log file.
Mar 08 00:30:32 scb1003 pdfrender[26738]: Xpra: Fatal IO error 2 (No such file or directory) on X server :711.
Mar 08 00:30:32 scb1003 pdfrender[26738]: 2017-03-08 00:30:32,835 killing xvfb with pid 26742
Mar 08 00:30:32 scb1003 pdfrender[26738]: 2017-03-08 00:30:32,835 removing socket /home/pdfrender/.xpra/scb1003-711
Mar 08 00:30:32 scb1003 pdfrender[26738]: got deadly signal SIGTERM, exiting
Mar 08 00:30:32 scb1003 pdfrender[26738]: 2017-03-08 00:30:32,835 got signal SIGTERM, exiting
Mar 08 00:30:32 scb1003 pdfrender[26738]: 2017-03-08 00:30:32,835
Mar 08 00:30:32 scb1003 pdfrender[26738]: Parent is shutting down, bye...
Mar 08 00:30:32 scb1003 pdfrender[26738]: Child received signal 15, shutting down the sandbox...
Mar 08 00:30:32 scb1003 pdfrender[26738]: Parent received signal 15, shutting down the child process...
Mar 08 00:30:32 scb1003 pdfrender[26738]: Parent pid 26821, child pid 26822
Mar 08 00:30:32 scb1003 systemd[1]: Stopping "pdfrender service"...
It's not clear why the process has not opened its TCP port. After some debugging, @Giuseppe and I think this will probably go away upon a restart, but we've decided to open this Phab task so the issue is known and can be debugged further if needed.
Restart command
To restart all instances in eqiad, something like this works:
for i in 1 2 3 4;do echo $i; ssh scb100$i.eqiad.wmnet "sudo service pdfrender restart"; done
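A slightly more defensive variant of the loop above is sketched here, in case it is useful to preview the commands first or to see which host failed. This is only an illustration: the function names `pdfrender_hosts` and `restart_pdfrender` and the `DRY_RUN` variable are made up for this example, and the host list is simply the four hosts from the one-liner.

```shell
# Hypothetical helper: print the eqiad hosts used in the one-liner above.
pdfrender_hosts() {
    for i in 1 2 3 4; do
        echo "scb100${i}.eqiad.wmnet"
    done
}

# Hypothetical wrapper: restart pdfrender on every host.
# With DRY_RUN=1 the ssh commands are only printed, not executed.
restart_pdfrender() {
    for host in $(pdfrender_hosts); do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "ssh ${host} sudo service pdfrender restart"
        else
            # Keep going on failure, but say which host had the problem.
            ssh "$host" "sudo service pdfrender restart" \
                || echo "restart failed on ${host}" >&2
        fi
    done
}
```

For example, `DRY_RUN=1 restart_pdfrender` prints the four ssh commands without touching any host.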