
Fix alerting for disk space on the NFS servers
Closed, Resolved · Public

Description

Right now, /srv/tools is at 91%, but Icinga reports a warning for 80%. This needs to be fixed somehow.
I don't think the current setup has ever paged me at the right level.

Event Timeline

Bstorm created this task.

So the reason alerting is not useful to us for NFS volumes is that disk alerting is assumed to be part of host monitoring. That's great, except it doesn't make sense for NFS volumes due to their scale (3% free means something very different on a multi-user 9 TB volume than it does on a 20 GB root volume).
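
To put rough numbers on the scale difference, here is a small sketch that converts the same percent-free figure into absolute free space for the two volume sizes mentioned above. The helper names are hypothetical, decimal units (1 TB = 10^12 bytes) are assumed, and none of this is taken from the actual monitoring class.

```python
# Illustrative arithmetic only. The volume sizes and the 3% figure come from
# the comment above; the helper names and decimal units are assumptions.
TB = 10**12
GB = 10**9


def free_bytes(total_bytes: int, percent_free: float) -> float:
    """Return the absolute free space implied by a percent-free figure."""
    return total_bytes * percent_free / 100


def describe(name: str, total_bytes: int, percent_free: float) -> str:
    """Format the free space of a volume in GB for a given percent free."""
    free_gb = free_bytes(total_bytes, percent_free) / GB
    return f"{name}: {percent_free}% free = {free_gb:.1f} GB"


nfs = describe("9 TB NFS volume", 9 * TB, 3)
root = describe("20 GB root volume", 20 * GB, 3)
print(nfs)   # 9 TB NFS volume: 3% free = 270.0 GB
print(root)  # 20 GB root volume: 3% free = 0.6 GB
```

With the same 3% free, the NFS volume still has hundreds of GB of headroom while the root volume has well under 1 GB, which is why a single percentage threshold shared with host monitoring does not fit both.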

I'm finishing up a monitoring class that should fix that.

Change 622655 had a related patch set uploaded (by Bstorm; owner: Bstorm):
[operations/puppet@production] shared-storage: add specific NFS volume monitoring for cleanups

https://gerrit.wikimedia.org/r/622655

Change 622655 merged by Bstorm:
[operations/puppet@production] shared-storage: add specific NFS volume monitoring for cleanups

https://gerrit.wikimedia.org/r/622655

Change 622877 had a related patch set uploaded (by Bstorm; owner: Bstorm):
[operations/puppet@production] shared-storage: add specific NFS volume monitoring for cleanups

https://gerrit.wikimedia.org/r/622877

Change 622877 merged by Bstorm:
[operations/puppet@production] shared-storage: add specific NFS volume monitoring for cleanups

https://gerrit.wikimedia.org/r/622877

This should be fixed now.