Phase monitoring for new PDUs
Open, Normal, Public

Description

While working on T148541 I noticed that the latest PDUs (i.e. those using the sentry4 SNMP MIB), installed in eqiad as part of T226778, are failing their phase monitoring checks, whereas ulsfo PDUs installed in T209101 are currently missing icinga phase monitoring checks (i.e. only ping checks).

Event Timeline

Restricted Application added a subscriber: Aklapper. Jul 26 2019, 11:22 AM
faidon added a subscriber: faidon. Jul 26 2019, 11:35 AM

whereas ulsfo PDUs installed in T209101 are currently missing icinga phase monitoring checks (i.e. only ping checks)

Note that ulsfo does not have 3-phase power, so it makes sense for its monitoring to differ from eqiad/codfw. It would probably still make sense to have something that monitors that single phase, though.

herron triaged this task as Normal priority. Jul 26 2019, 4:26 PM
herron added a project: observability.
fgiunchedi moved this task from Backlog to Up next on the observability board. Aug 5 2019, 2:30 PM

Change 529790 had a related patch set uploaded (by Filippo Giunchedi; owner: Filippo Giunchedi):
[operations/puppet@production] facilities: introduce monitor_pdu_phase for ulsfo PDUs

https://gerrit.wikimedia.org/r/529790

Change 529791 had a related patch set uploaded (by Filippo Giunchedi; owner: Filippo Giunchedi):
[operations/puppet@production] prometheus: generate targets for single phase PDUs

https://gerrit.wikimedia.org/r/529791

Change 529790 merged by Filippo Giunchedi:
[operations/puppet@production] facilities: introduce monitor_pdu_phase for ulsfo PDUs

https://gerrit.wikimedia.org/r/529790

Change 529791 merged by Filippo Giunchedi:
[operations/puppet@production] prometheus: generate targets for single phase PDUs

https://gerrit.wikimedia.org/r/529791