I found some alarms this morning in the #wikimedia-releng IRC channel (times are in UTC):
02:16:23 <icinga-wm> PROBLEM - PHD should be supervising processes on phab1001 is CRITICAL: PROCS CRITICAL: 2 processes with UID = 497 (phd) https://wikitech.wikimedia.org/wiki/Phabricator
02:18:45 <icinga-wm> RECOVERY - PHD should be supervising processes on phab1001 is OK: PROCS OK: 4 processes with UID = 497 (phd) https://wikitech.wikimedia.org/wiki/Phabricator
02:39:37 <icinga-wm> PROBLEM - PHD should be supervising processes on phab1001 is CRITICAL: PROCS CRITICAL: 2 processes with UID = 497 (phd) https://wikitech.wikimedia.org/wiki/Phabricator
02:46:41 <icinga-wm> RECOVERY - PHD should be supervising processes on phab1001 is OK: PROCS OK: 11 processes with UID = 497 (phd) https://wikitech.wikimedia.org/wiki/Phabricator
04:46:41 <icinga-wm> PROBLEM - PHD should be supervising processes on phab1001 is CRITICAL: PROCS CRITICAL: 2 processes with UID = 497 (phd) https://wikitech.wikimedia.org/wiki/Phabricator
04:49:03 <icinga-wm> RECOVERY - PHD should be supervising processes on phab1001 is OK: PROCS OK: 4 processes with UID = 497 (phd) https://wikitech.wikimedia.org/wiki/Phabricator
I am tempted to rule out the monitoring command, which has been the same since 2015:
/usr/lib/nagios/plugins/check_procs -c 3:150 -u phd
It reports a critical state when the phd user owns fewer than 3 processes (or more than 150, the upper bound of the range).
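To sanity-check the threshold against the counts in the alerts above, here is a minimal sketch of how I understand the `-c 3:150` range to be evaluated: critical when the count falls outside 3..150 inclusive. The function name `is_critical` is made up for illustration and this is not the plugin's actual code.

```python
def is_critical(count: int, crit_range: str = "3:150") -> bool:
    """Return True when count falls outside the inclusive min:max range.

    Hypothetical helper mirroring the assumed semantics of
    `check_procs -c 3:150`; only the simple "min:max" form is handled.
    """
    low, high = (int(part) for part in crit_range.split(":"))
    return count < low or count > high


# Process counts reported by icinga-wm in the log above.
for count in (2, 4, 11):
    state = "CRITICAL" if is_critical(count) else "OK"
    print(f"{count} processes -> {state}")
```

Under that reading, the 2-process samples alert while 4 and 11 do not, which matches the log, so the check itself looks consistent and the question is why phd dropped to 2 processes.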