mw1236 powered down and not able to powerup
Closed, Resolved (Public)

Description

Hi!

mw1236 went down this morning at around 07:30 CEST and I wasn't able to power it up again:

/admin1-> racadm serveraction powerup
ERROR: Timeout while waiting for server to perform requested power action.

No idea if there is another trick to use before checking in the DC, open for suggestions :)
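
One low-effort thing to try before a DC visit is simply repeating the power action a few times with a pause in between. The following is only a minimal sketch of that idea, not something that was run for this task. It assumes `racadm` is available on the PATH of the machine running the script (in the session above the command was typed at the management controller prompt instead), and the names `run_racadm`, `power_up`, `attempts` and `delay` are hypothetical. It also treats any output that lacks the timeout message as success, which is a simplification.

```python
import subprocess
import time
from typing import Callable, Sequence

# Fragment of the error message shown in the task description.
TIMEOUT_MARKER = "Timeout while waiting for server"


def run_racadm(args: Sequence[str]) -> str:
    """Run racadm with the given arguments and return stdout plus stderr.

    Assumption: racadm is installed locally and on PATH.
    """
    proc = subprocess.run(["racadm", *args], capture_output=True, text=True)
    return proc.stdout + proc.stderr


def power_up(
    run: Callable[[Sequence[str]], str] = run_racadm,
    attempts: int = 3,
    delay: float = 30.0,
) -> bool:
    """Retry 'serveraction powerup' until the timeout error stops appearing.

    Returns True if an attempt produced no timeout message, False if every
    attempt timed out.
    """
    for attempt in range(attempts):
        output = run(["serveraction", "powerup"])
        if TIMEOUT_MARKER not in output:
            return True
        # Pause before the next attempt, but not after the last one.
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

If every attempt still times out, as it would have here with both PSUs off, the function returns `False` and a physical check is the next step.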

elukey created this task. Jan 30 2017, 8:29 AM
Restricted Application added a subscriber: Aklapper. Jan 30 2017, 8:29 AM
elukey triaged this task as Normal priority. Jan 30 2017, 8:30 AM
Volans added a subscriber: Volans. Jan 31 2017, 10:58 AM

@elukey: FYI, the Icinga downtime expired; I've set a new downtime for a week, just in case.

elukey added a comment. Feb 6 2017, 9:09 AM

Extended the downtime to prevent spurious notifications in IRC.

@elukey Both PSUs must have taken a spike, because they were both off. I had to reseat the PSUs and drain any flea power. Once I did this and plugged the PSUs back into the server, the server powered on. Oddly enough, the initial event was not captured in the racadm system log.

Mentioned in SAL (#wikimedia-operations) [2017-02-11T09:35:12Z] <elukey> rebooting mw1236 to make sure that it comes up cleanly - T156610

Mentioned in SAL (#wikimedia-operations) [2017-02-11T09:53:33Z] <elukey> mw1236 back in production (scap pull executed before pooled=yes) - T156610

elukey closed this task as Resolved. Feb 11 2017, 9:54 AM

Thanks @Cmjohnson!!