
kafka-jumbo1008 PSU redundancy fail
Closed, Resolved · Public

Description

We've recently received an alert via Icinga that kafka-jumbo1008.eqiad.wmnet has PSU trouble:

Sensor Type(s) Temperature, Power_Supply Status: Critical [PS Redundancy = Critical, Status = Critical]

Sensor Type : POWER
<Sensor Name>                   <Status>                 <Type>         
PS1 Status                      AC-Lost                  AC             
PS2 Status                      Present                  AC
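
For anyone triaging similar alerts, here is a minimal Python sketch of pulling the failed supply out of a sensor listing like the one above. It assumes the exact three-column layout shown here with single-word status values; the function name `failed_power_supplies` and the `REPORT` sample are hypothetical and not part of any Icinga or IPMI tooling.

```python
def failed_power_supplies(report: str) -> list[str]:
    """Return 'name: status' for each PSU row whose status is not 'Present'.

    Assumes rows shaped like 'PS1 Status   AC-Lost   AC' (hypothetical parser,
    written only against the layout pasted in this task).
    """
    failed = []
    for line in report.splitlines():
        fields = line.split()
        # Skip headers and anything that is not a 'PSn Status ...' row.
        if len(fields) >= 3 and fields[0].startswith("PS") and fields[1] == "Status":
            name, status = fields[0], fields[2]
            if status != "Present":
                failed.append(f"{name}: {status}")
    return failed


# Sample input copied from the sensor listing in this task.
REPORT = """\
Sensor Type : POWER
<Sensor Name>                   <Status>                 <Type>
PS1 Status                      AC-Lost                  AC
PS2 Status                      Present                  AC
"""

print(failed_power_supplies(REPORT))
```

With the listing from this task, only PS1 is reported, which matches the AC-Lost state that triggered the redundancy alert.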

Icinga Check: https://icinga.wikimedia.org/cgi-bin/icinga/extinfo.cgi?type=2&host=kafka-jumbo1008&service=IPMI+Sensor+Status

Once the affected PSU has been reseated and the Icinga check has cleared, this task can be resolved.

Event Timeline

klausman created this task. · Sep 18 2020, 2:24 PM
Restricted Application added a project: Operations. · Sep 18 2020, 2:24 PM
klausman renamed this task from Check jumbo1008.eqiad.wmnet PSU setup to Check jumbo1008.eqiad.wmnet PSU redundancy reported as critical. · Sep 18 2020, 2:24 PM
RobH added a subscriber: RobH.

So the hostname is actually kafka-jumbo1008, updating task with the relevant details.

RobH renamed this task from Check jumbo1008.eqiad.wmnet PSU redundancy reported as critical to kafka-jumbo1008 PSU redundancy fail. · Sep 22 2020, 4:29 PM
RobH updated the task description.
Cmjohnson closed this task as Resolved. · Sep 24 2020, 1:53 PM

The power cable was not properly seated; corrected it.

Record: 19
Date/Time: 09/24/2020 13:52:49
Source: system
Severity: Ok
Description: The power supplies are redundant.