analytics1062 lost one of its power supplies
Closed, Resolved · Public

Description

https://icinga.wikimedia.org/cgi-bin/icinga/extinfo.cgi?type=2&host=analytics1062&service=IPMI+Sensor+Status

Current Status: CRITICAL
(for 1d 7h 3m 43s)
Status Information: Sensor Type(s) Temperature, Power_Supply Status: Critical [PS Redundancy = Critical, Status = Critical]
Performance Data: 'Inlet Temp'=21.00;3.00:42.00;-7.00:47.00 'Exhaust Temp'=43.00;0.00:70.00;0.00:75.00 'Temp'=63.00 'Temp'=68.00
Current Attempt: 3/3 (HARD state)
Last Check Time: 2019-11-01 19:14:09
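
The Performance Data line above can be checked against its own thresholds to confirm that the temperature readings are in range and that the CRITICAL state comes from the power supply alone. The following is a minimal sketch, assuming the line follows the usual Nagios-plugin convention of 'label'=value;warn;crit with each threshold given as a low:high range. The names PERFDATA, in_range and parse are illustrative only and are not part of Icinga or the IPMI check.

```python
import re

# Performance data copied from the Icinga status above.
PERFDATA = (
    "'Inlet Temp'=21.00;3.00:42.00;-7.00:47.00 "
    "'Exhaust Temp'=43.00;0.00:70.00;0.00:75.00 'Temp'=63.00 'Temp'=68.00"
)

# Assumed item shape: 'label'=value[;warn[;crit]]
ITEM = re.compile(r"'([^']+)'=([^;\s]+)(?:;([^;\s]*))?(?:;([^;\s]*))?")


def in_range(value, spec):
    """Return True if value lies inside a 'low:high' range.

    An empty spec means no threshold was supplied. Only the plain
    'low:high' form seen in this ticket is handled (an assumption).
    """
    if not spec:
        return True
    low, _, high = spec.partition(":")
    return float(low) <= value <= float(high)


def parse(perfdata):
    """Return (label, value, state) for each item in the perfdata string."""
    rows = []
    for label, value, warn, crit in ITEM.findall(perfdata):
        v = float(value)
        if not in_range(v, crit):
            state = "CRITICAL"
        elif not in_range(v, warn):
            state = "WARNING"
        else:
            # Items without thresholds are also reported as OK here.
            state = "OK"
        rows.append((label, v, state))
    return rows


if __name__ == "__main__":
    for label, value, state in parse(PERFDATA):
        print(f"{label}: {value} {state}")
```

Under these assumptions none of the four temperature items falls outside its range, which is consistent with the alert being driven by the Power_Supply sensor.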

Event Timeline

Dzahn created this task. Nov 1 2019, 7:25 PM
Restricted Application added projects: Operations, Analytics. Nov 1 2019, 7:25 PM
wiki_willy added subscribers: Jclark-ctr, wiki_willy.

@Jclark-ctr - looks like this one is from last Thursday's PDU upgrade. Can you check whether it's a loose cord? If not, we'll have to RMA it (the server is under warranty through March 2020). Thanks, Willy

fdans moved this task from Incoming to Radar on the Analytics board. Nov 4 2019, 4:46 PM

Unfortunately, not a loose power cord. Submitted a Tech Direct ticket for a replacement PSU: Service Request 1001998096.

Received the new PSU and replaced it.

Tracking for RMA

Jclark-ctr closed this task as Resolved. Nov 7 2019, 10:15 PM

Icinga downtime for 2:00:00 set by otto@cumin1001 on 1 host(s) and their services with reason: analytics1062 lost one of its power supplies

analytics1062.eqiad.wmnet
Jclark-ctr closed this task as Resolved. Nov 13 2019, 3:54 PM

Alert cleared; no errors in Icinga.