icinga alert cleanup for power switches in eqiad
Closed, Resolved | Public

Description

There are some Icinga alerts because power switches in eqiad are down.

It is known that there is currently ongoing maintenance, but before just silently ACKing them I wanted to make a ticket to link to so it can be cleaned up.

Puppet changes will be needed to remove these hosts from PuppetDB, which in turn removes them from Icinga.

Currently alerting, ACKing with this ticket: ps1-d3-eqiad, ps1-d4-eqiad

https://icinga.wikimedia.org/cgi-bin/icinga/extinfo.cgi?type=1&host=ps1-d3-eqiad

https://icinga.wikimedia.org/cgi-bin/icinga/extinfo.cgi?type=1&host=ps1-d4-eqiad

01:50 <+icinga-wm> ACKNOWLEDGEMENT - Host ps1-d3-eqiad is DOWN: PING CRITICAL - Packet loss = 100% daniel_zahn 
                   https://phabricator.wikimedia.org/T262629
01:50 <+icinga-wm> ACKNOWLEDGEMENT - Host ps1-d4-eqiad is DOWN: PING CRITICAL - Packet loss = 100% daniel_zahn 
                   https://phabricator.wikimedia.org/T262629
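
For reference, a minimal sketch of how acknowledgements like the two above could be expressed as Icinga 1.x / Nagios external commands. The ticket does not say how the ACKs were actually submitted (they may well have been done through the web UI), so this is only an illustration. The helper name `ack_host_command` and its default flag values are assumptions; the host names, author, and ticket URL are taken from the log lines above.

```python
import time

TASK_URL = "https://phabricator.wikimedia.org/T262629"


def ack_host_command(host, author, comment, now=None,
                     sticky=2, notify=1, persistent=1):
    """Format an ACKNOWLEDGE_HOST_PROBLEM external command line.

    Hypothetical helper: it only builds the string and does not write
    it to any command file.
    sticky=2 keeps the acknowledgement until the host is UP again.
    """
    ts = int(time.time() if now is None else now)
    return (f"[{ts}] ACKNOWLEDGE_HOST_PROBLEM;"
            f"{host};{sticky};{notify};{persistent};{author};{comment}")


# Build one command line per alerting host, using the ticket as the comment.
for host in ("ps1-d3-eqiad", "ps1-d4-eqiad"):
    print(ack_host_command(host, "daniel_zahn", TASK_URL))
```

Using the ticket URL as the acknowledgement comment is what makes the ACK traceable later: anyone looking at the host in Icinga is pointed at the cleanup task.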

Event Timeline

Mentioned in SAL (#wikimedia-operations) [2020-09-11T01:53:56Z] <mutante> ACKed alerts for eqiad power switches after making T262629

RobH claimed this task.
RobH subscribed.

I'm working on clearing all the alerts via the racking task T261452.