Debugging on IRC:
<akosiaris> so this time around it's /etc/passwd that's locked
<akosiaris> not /etc/passwd+
<akosiaris> https://github.com/netblue30/firejail/issues/559
<akosiaris> ?
<akosiaris> moritzm: ^ might be this ?
<moritzm> akosiaris: good catch, sca are the only trustys with 3.13
<akosiaris> ok that was it
<akosiaris> well kind of
<moritzm> and there's been the firejail update to 0.9.40 afer Madhu joined (for the bug in exitcode handling)
<akosiaris> now I get Error: /usr/bin/gpasswd ops -M ... returned 1 instead of one of  
<akosiaris> rename("/etc/group+", "/etc/group") = -1 EBUSY (Device or resource busy)
<akosiaris> so that's the one I had met last time
I've done a

service zotero stop
puppet agent -t -v
service zotero stop
/usr/bin/gpasswd ops -M filippo,jgreen,bblack,andrew,faidon,rush,oblivian,laner,yuvipanda,dzahn,akosiaris,springle,mark,ariel,cmjohnson,otto,robh,tstarling,ori,midom,jmm,jynus,aaron,ema,elukey,gehel,volans,madhuvishy,marostegui
puppet agent -t -v

dance to get the user applied on sca1001, sca1002 and sca2001. I've left sca2002 as is so we can debug this further. sca2002 has been ACKed in icinga by @jcrespo.
The locking limitation is fixed in Linux 3.18:
The sca cluster is currently the only cluster with long-running firejail processes using a kernel < 3.18; trusty uses 3.13. scb and the image scalers are running on jessie with Linux 4.4.
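To check whether a given host is still on an affected kernel, something along these lines could be run on it. This is only a sketch: `version_lt` is a hypothetical helper name, not an existing tool, and it assumes GNU `sort` with `-V` (version sort) is available, as it is on trusty and jessie.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if version $1 is strictly older than version $2.
# Relies on GNU sort -V to order the two version strings.
version_lt() {
    [ "$1" != "$2" ] &&
        [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# Keep only the leading major.minor part of the kernel release,
# e.g. "3.13.0-100-generic" becomes "3.13".
running="$(uname -r | cut -d. -f1,2)"

if version_lt "$running" 3.18; then
    echo "kernel $running predates 3.18: locking limitation applies"
else
    echo "kernel $running is 3.18 or newer: locking limitation is fixed"
fi
```

This only compares the upstream major.minor version; it does not account for distribution kernels that might carry a backport of the fix.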
(There's one corner case where this also applies to the standard app servers: the Score extension for creating sheet music has a code path which triggers a scaling operation using imagemagick, and that conversion is also guarded by firejail. However, such invocations are fairly rare to begin with (IIRC about 100 per week for the entire cluster) and this only hits when making changes to privileged files. Since the mw* clusters are being reimaged to jessie anyway, I'll ignore this.)
Alex mentioned that he intends to migrate the sca cluster to jessie in the foreseeable future anyway, so my suggestion is to move the sca* cluster to the current HWE kernel: Ubuntu provides officially supported backports of the xenial/16.04 kernel to trusty. This solves the problem without making sca-specific tweaks to the firejail config or downgrading to an older firejail version (which is affected by the exitcode passthrough bug anyway).