
labtestcontrol2003 - UNKNOWN power supply status
Closed, Resolved (Public)

Description

Current Status: UNKNOWN
(for 27d 8h 53m 13s)
Status Information: Sensor Type(s) Temperature, Power_Supply Status:
FreeIPMI returned an empty header map (first line) FreeIPMI could not find any sensors for the given sensor type (option '-T').

https://icinga.wikimedia.org/cgi-bin/icinga/extinfo.cgi?type=2&host=labtestcontrol2003&service=IPMI+Sensor+Status


Note that it does not say the power supply is bad; it says it could not find any sensor for it.

That seems unusual: all other hosts appear to have this sensor. Alternatively, the check itself may need to be adjusted to handle this case.
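
To illustrate the distinction drawn above, here is a minimal Python sketch that separates "no sensor data was returned" from an actual sensor reading, based only on the two FreeIPMI messages quoted in the status information. The function name `classify_status` and the two return labels are hypothetical and are not part of the real Icinga check.

```python
# Hypothetical helper for illustration; not the actual Icinga check code.
# The markers are copied from the status information quoted in this task.
NO_SENSOR_MARKERS = (
    "FreeIPMI returned an empty header map",
    "FreeIPMI could not find any sensors",
)


def classify_status(status_information: str) -> str:
    """Label check output as missing sensor data or present sensor data."""
    if any(marker in status_information for marker in NO_SENSOR_MARKERS):
        # The check found nothing to evaluate, so it says nothing about
        # the health of the power supply itself.
        return "sensor-data-missing"
    return "sensor-data-present"
```

A check adjusted along these lines could report the missing-data case with its own message instead of a generic UNKNOWN.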

Event Timeline

Dzahn created this task. Apr 12 2019, 12:32 AM

Mentioned in SAL (#wikimedia-operations) [2019-04-12T09:30:34Z] <volans> reset mgmt card on labtestcontrol2003 - T220783

Volans closed this task as Resolved. Apr 12 2019, 9:33 AM
Volans claimed this task.
Volans triaged this task as Normal priority.
Volans added a subscriber: Volans.

I've reset the mgmt card (see https://wikitech.wikimedia.org/wiki/Management_Interfaces#Reset_the_management_card ), waited for it to reboot, and ran ipmi-sensors, which told me that the cache was outdated and needed a flush. I then ran ipmi-sensors -f and we were good to go.
Sensors are working again, and Icinga is happy, resolving.
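
For reference, the query, flush, and re-query sequence described above could be scripted roughly as follows. This is a sketch, not the procedure actually used: it assumes that `ipmi-sensors` exits with a non-zero status when its cache is outdated (the comment above only says the tool reported this), and the names `run_command` and `refresh_sensor_cache` are hypothetical. The management card reset is not covered; see the wikitech page linked above for that step.

```python
import subprocess
from typing import Callable, List, Sequence

# A runner takes an argument vector and returns the command's exit code.
Runner = Callable[[Sequence[str]], int]


def run_command(argv: Sequence[str]) -> int:
    """Run a command and return its exit code."""
    return subprocess.run(list(argv), check=False).returncode


def refresh_sensor_cache(runner: Runner = run_command) -> List[List[str]]:
    """Query sensors; if the query fails, flush the cache and query again.

    Returns the commands that were run, in order.
    Assumption: a non-zero exit code signals the outdated-cache condition.
    """
    query = ["ipmi-sensors"]
    flush = ["ipmi-sensors", "-f"]  # -f flushes the cache, as in the comment above

    executed: List[List[str]] = [list(query)]
    if runner(query) != 0:
        executed.append(list(flush))
        runner(flush)
        executed.append(list(query))
        runner(query)
    return executed
```

Calling `refresh_sensor_cache()` on the affected host runs the real commands, which typically needs root privileges; passing a fake runner makes it possible to dry-run the logic elsewhere.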