https://icinga.wikimedia.org/cgi-bin/icinga/extinfo.cgi?type=2&host=dataset1001&service=Disk+space
Ariel says this is not really critical and the host has enough disk space,
but we should fix the check so that it does not go CRIT for this host.
Checking for the percentage of disk space left is what gets you this.
Also, making a ticket just so we can ACK it in Icinga.
Reference: rt7922
Event Timeline
On Fri Jul 18 16:29:57 2014, dzahn wrote:
https://icinga.wikimedia.org/cgi-bin/icinga/extinfo.cgi?type=2&host=dataset1001&service=Disk+space
Ariel says this is not really critical and has enough disk space..
..but we should fix the check instead to not become CRIT for this host
checking for percentage of disk left gets you this..
also.. making a ticket just so we can ACK it in Icinga
what mount point ran out of disk space btw? I can't find any recent alerts for
that
dataset1001:~$ df -h
Filesystem                       Size  Used Avail Use% Mounted on
/dev/sda1                        111G  7.7G   97G   8% /
udev                             7.9G  4.0K  7.9G   1% /dev
tmpfs                            3.2G  2.2M  3.2G   1% /run
none                             5.0M     0  5.0M   0% /run/lock
none                             7.9G     0  7.9G   0% /run/shm
/dev/mapper/vg0-lv0               37T   34T  3.1T  92% /data
labstore1003.eqiad.wmnet:/dumps   44T  8.8T   35T  20% /mnt/dumps
On Thu Sep 18 10:41:10 2014, fgiunchedi wrote:
what mount point ran out of disk space btw? I can't find any recent alerts for that
Yea, true, it's a bit unfortunate that Icinga always forgets the history or we
could look it up now. I _think_ it was this:
/dev/mapper/vg0-lv0 37T 34T 3.1T 92% /data
It always triggered because we check for the percentage of space left and this
volume is so large. 5% of 37T, for example, is still quite a lot of space.
The ticket was for adjusting that somehow, but I think you can close it for now.
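To illustrate the difference, here is a minimal Python sketch (not the actual Icinga check and not the eventual fix) that compares a percentage-based threshold with an absolute one, using the /data numbers from the df output above. The function names and all threshold values (10%/5% free, 1T/0.5T free) are assumptions chosen for illustration only.

```python
def check_by_percent(size_tb, avail_tb, warn_pct, crit_pct):
    """Return a Nagios-style state based on the percentage of space left."""
    free_pct = 100.0 * avail_tb / size_tb
    if free_pct < crit_pct:
        return "CRITICAL"
    if free_pct < warn_pct:
        return "WARNING"
    return "OK"


def check_by_absolute(avail_tb, warn_tb, crit_tb):
    """Return a Nagios-style state based on the absolute space left (in T)."""
    if avail_tb < crit_tb:
        return "CRITICAL"
    if avail_tb < warn_tb:
        return "WARNING"
    return "OK"


if __name__ == "__main__":
    # /data on dataset1001 per the df output: 37T total, 3.1T available.
    size_tb, avail_tb = 37.0, 3.1

    # Hypothetical thresholds: warn below 10% free, crit below 5% free.
    print(check_by_percent(size_tb, avail_tb, warn_pct=10.0, crit_pct=5.0))

    # Hypothetical thresholds: warn below 1T free, crit below 0.5T free.
    print(check_by_absolute(avail_tb, warn_tb=1.0, crit_tb=0.5))
```

With these assumed thresholds the percentage check already alerts while several terabytes are still available, whereas the absolute check stays quiet until the remaining space is actually small.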
I keep bumping into dataset1001 alerts every time I look at icinga, so it looks like this isn't resolved; reopening.
Ariel has previously said that it's a known problem, probably referring to this ticket :) Ariel, can you clarify and/or have a look?
It's because our default disk check looks at the percentage of space left, and in the case of datasets even a few percent is quite a bit of space.
For example, with 1.5T still free the volume counts as 97% full, which causes the check to trigger.
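As a rough sketch of why a single percentage does not scale across hosts, the snippet below computes how much space is still free at the moment a percentage threshold is crossed, for the filesystem sizes shown in the earlier df output. The 5% threshold and the helper name are assumptions for illustration, not the real check configuration.

```python
def free_at_threshold(size, percent_free):
    """Space still free (same unit as size) when percent_free is reached."""
    return size * percent_free / 100.0


# (mount point, size, unit) taken from the df output earlier in this ticket.
FILESYSTEMS = [
    ("/", 111.0, "G"),
    ("/data", 37.0, "T"),
    ("/mnt/dumps", 44.0, "T"),
]

if __name__ == "__main__":
    threshold_pct = 5.0  # assumed critical threshold, for illustration only
    for mount, size, unit in FILESYSTEMS:
        free = free_at_threshold(size, threshold_pct)
        print(f"{mount}: alerts with {free:.2f}{unit} still free")
```

The same 5% means a few gigabytes on the root filesystem but well over a terabyte on the large data volumes.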
https://gerrit.wikimedia.org/r/#/c/193834/ for this bug (would allow other custom checks for e.g. mariadb as well)
https://gerrit.wikimedia.org/r/#/c/193834/ got merged 18 months ago and this task has not seen any updates for 30 months. Still valid? Or resolved?
@Aklapper thanks, resolved :) (and made public, NDA was just because this was an RT import, yea, that old)