Looks like part of codfw rack A1 lost power, which took msw-a1-codfw down.
Description
Details
Project | Branch | Lines +/- | Subject
---|---|---|---
operations/puppet | production | +4 -3 | Add new model for new PDU in rack A1
Event Timeline
Surprisingly, both msw1-codfw PSUs are ON:
msw1-codfw> show chassis environment
Class Item                           Status     Measurement
Power FPC 0 Power Supply 0           OK
      FPC 0 Power Supply 1           OK
But for example:
cr1-codfw> show system alarms
6 alarms currently active
Alarm time               Class  Description
2022-03-14 08:03:58 UTC  Major  Host 1 fxp0 : Ethernet Link Down
2022-03-14 08:03:48 UTC  Major  Host 0 fxp0 : Ethernet Link Down
2022-03-14 08:03:43 UTC  Major  PEM 2 Input Failure
2022-03-14 08:03:43 UTC  Major  PEM 2 Not OK
2022-03-14 08:03:43 UTC  Major  PEM 1 Input Failure
2022-03-14 08:03:43 UTC  Major  PEM 1 Not OK
and
asw-a-codfw> show system alarms
1 alarms currently active
Alarm time               Class  Description
2022-03-14 08:03:46 UTC  Major  FPC 1 PEM 0 is not powered
So maybe only half of a PDU lost power?
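For comparing devices quickly, here is a minimal Python sketch that pulls only the power-related (PEM) alarms out of pasted `show system alarms` output. It assumes the output has one alarm per line in the "timestamp, class, description" layout seen in this task; the `power_alarms` helper and the `CR1`/`ASW` sample strings are hypothetical names used for illustration, not part of any existing tooling.

```python
import re

# One alarm row: timestamp, alarm class, free-text description.
# Assumption: rows look like the ones pasted in this task.
ALARM_RE = re.compile(
    r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2} UTC)\s+(Major|Minor)\s+(.*)"
)


def power_alarms(output):
    """Return the descriptions of alarms that mention a PEM."""
    found = []
    for line in output.splitlines():
        match = ALARM_RE.match(line.strip())
        if match and re.search(r"\bPEM\b", match.group(3)):
            found.append(match.group(3).strip())
    return found


# Sample rows copied from the alarm output in this task.
CR1 = """\
2022-03-14 08:03:58 UTC Major Host 1 fxp0 : Ethernet Link Down
2022-03-14 08:03:43 UTC Major PEM 2 Input Failure
2022-03-14 08:03:43 UTC Major PEM 1 Input Failure
"""
ASW = "2022-03-14 08:03:46 UTC Major FPC 1 PEM 0 is not powered\n"

print("cr1-codfw:", power_alarms(CR1))
print("asw-a-codfw:", power_alarms(ASW))
```

Header lines, the "N alarms currently active" line, and non-power alarms such as the fxp0 link-down entries do not match the filter and are skipped.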
Change 808048 had a related patch set uploaded (by Papaul; author: Papaul):
[operations/puppet@production] Add new model for new PDU in rack A1
Change 808048 merged by Papaul:
[operations/puppet@production] Add new model for new PDU in rack A1