
wmf-auto-restart fails on certain legacy services
Closed, Resolved · Public


While running wmf-auto-restart --dry-run against all running services on pinkunicorn, I noticed that the script fails for certain legacy services whose units come from systemd-sysv-generator:

$ sudo wmf-auto-restart --dry-run -s mcelog.service
Traceback (most recent call last):
  File "/usr/local/sbin/wmf-auto-restart", line 142, in <module>
  File "/usr/local/sbin/wmf-auto-restart", line 138, in main
    return check_restart(args.servicename, args.dryrun)
  File "/usr/local/sbin/wmf-auto-restart", line 59, in check_restart
    pid_query = subprocess.check_output(["/bin/pidof", service_name], universal_newlines=True)
  File "/usr/lib/python3.4/", line 620, in check_output
    raise CalledProcessError(retcode, process.args, output=output)
subprocess.CalledProcessError: Command '['/bin/pidof', 'mcelog.service']' returned non-zero exit status 1

For some reason systemd-sysv-generator sets GuessMainPID=no for these units. That is unfortunate, since the setting (which defaults to yes) would probably help here.
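The traceback above comes from subprocess.check_output raising CalledProcessError when pidof exits with status 1, which is what pidof does when no process matches the given name. A minimal sketch of a lookup that tolerates this case follows. It is not the actual wmf-auto-restart code: the function names `parse_pids` and `find_pids` and the `pidof_path` parameter are made up for illustration.

```python
import subprocess


def parse_pids(pidof_output):
    """Turn pidof's whitespace-separated output into a list of integer PIDs."""
    return [int(pid) for pid in pidof_output.split()]


def find_pids(process_name, pidof_path="/bin/pidof"):
    """Return the PIDs pidof reports for process_name, or [] if none match.

    Hypothetical helper: the name and signature are assumptions,
    not taken from wmf-auto-restart.
    """
    try:
        output = subprocess.check_output(
            [pidof_path, process_name], universal_newlines=True
        )
    except subprocess.CalledProcessError:
        # pidof exits non-zero when no process matches the name,
        # e.g. when a unit name such as 'mcelog.service' is passed.
        return []
    return parse_pids(output)
```

With a helper like this, a caller could report "no running process found for mcelog.service" instead of crashing with a traceback.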

Event Timeline

ema created this task. Dec 18 2018, 3:06 PM
Restricted Application added a subscriber: Aklapper. Dec 18 2018, 3:06 PM
ema triaged this task as Medium priority. Dec 18 2018, 3:06 PM

Change 480520 had a related patch set uploaded (by Muehlenhoff; owner: Muehlenhoff):
[operations/puppet@production] Clarify expected format of service name in wmf-auto-restart

jbond added a subscriber: jbond. Mar 13 2019, 2:57 PM

Was this resolved with moritz's patch? If not, I came across a similar issue and created the following patch

Ah, I completely forgot to merge it, will do that later on.

(This task is about a wrong service name passed to wmf-auto-restart)
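To illustrate the kind of mismatch described here: in the failing invocation, the unit name mcelog.service was handed to pidof, which matches process names rather than unit names. A minimal sketch of normalizing the argument is shown below, assuming the script wants the bare name without the `.service` suffix. That expectation is an assumption rather than something this task confirms, and `normalize_service_name` is a hypothetical helper that is not part of wmf-auto-restart.

```python
def normalize_service_name(name):
    """Strip a trailing '.service' so the name can be used as a process name.

    Hypothetical helper: assumes the bare name (e.g. 'mcelog') is the
    expected format, which the task does not spell out.
    """
    suffix = ".service"
    if name.endswith(suffix):
        return name[: -len(suffix)]
    return name
```

For example, `normalize_service_name("mcelog.service")` yields `"mcelog"`, and a name that already lacks the suffix is returned unchanged.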

jbond added a comment. Mar 13 2019, 3:02 PM

didn't notice it wasn't merged :)

Change 480520 merged by Muehlenhoff:
[operations/puppet@production] Clarify expected format of service name in wmf-auto-restart

MoritzMuehlenhoff closed this task as Resolved. Mar 15 2019, 10:07 AM
MoritzMuehlenhoff claimed this task.

Fix has been deployed.

Mentioned in SAL (#wikimedia-operations) [2019-04-26T17:50:08Z] <mutante> analytics1052 - reported broken systemd state in Icinga - service mcelog was in state failed - systemctl start mcelog - (T212219 ?)

Mentioned in SAL (#wikimedia-operations) [2019-04-30T02:23:29Z] <mutante> analytics1050 - systemctl start mclog ... it was failed like recently on analytics1052 (T212219 ?)