
Monitor mailman3 runner processes
Closed, Resolved, Public

Description

While investigating T282348, I noticed that the bounces runner had died. Since we already monitor the in/virgin/bounces queue lengths, we should also monitor the runner processes themselves.

When healthy, the bounces runner should appear in the process list as: /usr/bin/python3 /usr/lib/mailman3/bin/runner -C /etc/mailman3/mailman.cfg --runner=bounces:0:1
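As a rough illustration of what "monitoring the runner" could mean, the sketch below parses command lines of the form shown above and reports which expected runners are absent. It is not the check that was deployed: the function names, the `sample` data, and the expected list (taken from the in/virgin/bounces queues mentioned in the description) are assumptions made for the example.

```python
import re

# Matches the --runner=NAME:SLICE:COUNT argument seen in the task description.
RUNNER_ARG = re.compile(r"--runner=(?P<name>[a-z]+):\d+:\d+")
RUNNER_BIN = "/usr/lib/mailman3/bin/runner"


def runner_names(cmdlines):
    """Return the set of runner names found in an iterable of command lines."""
    names = set()
    for cmdline in cmdlines:
        if RUNNER_BIN not in cmdline:
            continue  # not a mailman3 runner process
        match = RUNNER_ARG.search(cmdline)
        if match:
            names.add(match.group("name"))
    return names


def missing_runners(cmdlines, expected):
    """Return the expected runner names that have no matching process, sorted."""
    return sorted(set(expected) - runner_names(cmdlines))


# Hypothetical process list in which the bounces runner has died.
prefix = "/usr/bin/python3 /usr/lib/mailman3/bin/runner -C /etc/mailman3/mailman.cfg "
sample = [prefix + "--runner=in:0:1", prefix + "--runner=virgin:0:1"]

print(missing_runners(sample, ["in", "virgin", "bounces"]))  # prints ['bounces']
```

A per-name check like this would identify which runner died, whereas a plain process count only shows that one is missing.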

Event Timeline

Change 687741 had a related patch set uploaded (by Legoktm; author: Legoktm):

[operations/puppet@production] mailman3: Monitor runners

https://gerrit.wikimedia.org/r/687741

Change 687741 merged by Legoktm:

[operations/puppet@production] mailman3: Monitor runners

https://gerrit.wikimedia.org/r/687741

Legoktm claimed this task.
14:45:23 <+icinga-wm> PROBLEM - mailman3_runners on lists1001 is CRITICAL: PROCS CRITICAL: 13 processes with UID = 38 (list), regex args /usr/lib/mailman3/bin/runner https://wikitech.wikimedia.org/wiki/Mailman/Monitoring
14:48:47 <+icinga-wm> RECOVERY - mailman3_runners on lists1001 is OK: PROCS OK: 14 processes with UID = 38 (list), regex args /usr/lib/mailman3/bin/runner https://wikitech.wikimedia.org/wiki/Mailman/Monitoring
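The two alerts above suggest a process-count check: 13 matching processes was CRITICAL and 14 was OK. The following is a minimal sketch of that logic, assuming a minimum threshold of 14 inferred only from these two log lines. The real check's thresholds are not stated here and may well be a range. The function name `check_runners` and its input format are hypothetical.

```python
import re


def check_runners(processes, uid=38, pattern=r"/usr/lib/mailman3/bin/runner", expected=14):
    """Count processes owned by `uid` whose arguments match `pattern`.

    `processes` is an iterable of (uid, args) pairs. `expected` is an assumed
    minimum inferred from the alert log, not a documented setting.
    Returns a (status, count) tuple.
    """
    regex = re.compile(pattern)
    count = sum(1 for p_uid, args in processes if p_uid == uid and regex.search(args))
    status = "OK" if count >= expected else "CRITICAL"
    return status, count


# Hypothetical snapshots: one runner missing, then all runners present.
runner = "/usr/bin/python3 /usr/lib/mailman3/bin/runner -C /etc/mailman3/mailman.cfg"
print(check_runners([(38, runner)] * 13))  # prints ('CRITICAL', 13)
print(check_runners([(38, runner)] * 14))  # prints ('OK', 14)
```

Filtering on both the UID and the argument regex mirrors the wording of the alert text ("processes with UID = 38 (list), regex args ..."), so unrelated processes owned by other users are not counted.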