Quoth @CDanis:
> Several times a month, someone asks on Slack for help with runaway processes on a (sic) stat hosts. Usually, the system will be heavily overcommitted on RAM and stuck in a livelock spin cycle...
To improve the user experience when this happens, we added [[ https://github.com/facebookincubator/oomd | oomd ]] to the stat hosts [[ https://gerrit.wikimedia.org/r/c/operations/puppet/+/1203548 | via this Puppet change ]].
Specifically, the hope is that `oomd` will take care of killing processes automatically so that users do not need to interrupt their work to ping SREs when the hosts are under memory contention.
Creating this ticket to:
[] Attempt to trigger oomd on the stat hosts
[] Record results
[] Follow up as necessary (**If it works:** communicate the change to stat host users. **If it doesn't:** decide whether to tweak the oomd settings, or leave them as they are until we've reimaged onto a newer Debian release and can use the newer and more popular `systemd-oomd` for the same purpose).
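
For the first checkbox, one possible way to generate memory pressure is a small script that allocates memory in steps up to a hard cap. The following is only a sketch, not an agreed test procedure: `consume_memory` and its parameters are hypothetical names made up for illustration, and whether oomd reacts at all depends on the thresholds and cgroups in the deployed oomd configuration, which this ticket does not describe.

```python
import time

MIB = 1024 * 1024
PAGE = 4096  # assumed page size; only used to touch each page once


def consume_memory(limit_mib, chunk_mib=64, pause_s=0.0):
    """Allocate and touch memory in steps until limit_mib is held.

    Hypothetical helper for illustration. Returns the list of buffers
    so the caller keeps them alive; dropping the list frees the memory.
    """
    chunks = []
    allocated = 0
    while allocated < limit_mib:
        size = min(chunk_mib, limit_mib - allocated)
        buf = bytearray(size * MIB)
        # Write one byte per page so the memory is actually resident,
        # rather than lazily mapped zero pages.
        for offset in range(0, len(buf), PAGE):
            buf[offset] = 1
        chunks.append(buf)
        allocated += size
        # Pausing between steps makes the ramp-up gradual and leaves
        # time to observe the host before the next allocation.
        time.sleep(pause_s)
    return chunks
```

A run might call `consume_memory(limit_mib=N, pause_s=1.0)` with `N` chosen deliberately for the host in question, ideally at a quiet time and after warning other users, then hold the returned list while checking whether oomd killed the process. The hard cap is there so the test stops on its own if oomd does not intervene.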