As in https://phabricator.wikimedia.org/T402594, the WikiWho service went down today and returned 500 errors for every request. The cause appears to be the root disk filling up again. I truncated some large logs and did a soft reboot of the server, after which the service started working again.
I'm not sure what is taking up so much of the 20G root disk. The files I truncated were the largest logs I spotted: the celery default worker logs. Even so, only 2.3G is free on /, so I must be missing some other large source of disk usage.
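
One way to narrow down the missing usage might be to total file sizes per top-level directory of /, staying on the root filesystem only. The following is a rough sketch, not something that has been run on the WikiWho host; the function names `dir_sizes` and `largest` are made up for illustration. It sums apparent file sizes (`st_size`), so its totals can differ from what `df` reports, and hard-linked files are counted once per link.

```python
import os


def dir_sizes(root="/"):
    """Map each immediate child directory of root to the total size in bytes
    of the files beneath it. Files directly in root are grouped under "."."""
    root_dev = os.lstat(root).st_dev
    totals = {}
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune subdirectories on other filesystems (e.g. /proc or separate
        # mounts) so that only the root disk is measured.
        kept = []
        for d in dirnames:
            try:
                if os.lstat(os.path.join(dirpath, d)).st_dev == root_dev:
                    kept.append(d)
            except OSError:
                pass  # vanished or unreadable; skip it
        dirnames[:] = kept

        rel = os.path.relpath(dirpath, root)
        top = "." if rel == "." else rel.split(os.sep)[0]
        for name in filenames:
            try:
                size = os.lstat(os.path.join(dirpath, name)).st_size
            except OSError:
                continue  # file disappeared or is not accessible
            totals[top] = totals.get(top, 0) + size
    return totals


def largest(totals, n=10):
    """Return the n largest (name, bytes) pairs, biggest first."""
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)[:n]
```

Running `largest(dir_sizes("/"))` as root should list the heaviest top-level directories; calling `dir_sizes` again on the biggest one (for example `dir_sizes("/var")`) would drill down a level.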