Got this email from LibreNMS and it reports:
Uptime 10 minutes 40 seconds
I think next steps are:
- check on the PDU's UI if it really rebooted
- check if there are any logs on why
- if not a false positive decide if we should replace it
Still from LibreNMS:
2020-02-13 15:46:52 notice ps1-a8-codfw SENTRY3_5179AF] EVENT: System boot complete notice
2020-02-13 15:46:52 notice ps1-a8-codfw NO MATCH [Sentry3_5179af] EVENT: TCP/IP stack has started notice
2020-02-13 15:47:30 Device status changed to Up from icmp check.
2020-02-13 15:47:30 Device rebooted after 357 days 19 hours 37 minutes 2 seconds -> 43s
2020-02-13 15:37:30 Device status changed to Down from icmp check.
So doesn't look like a false positive.
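For what it's worth, the reasoning behind that can be sketched in a few lines of Python. This is only an illustration and not how LibreNMS itself detects reboots; the function name `looks_like_reboot` is made up here, and the values are copied from the event log above.

```python
from datetime import datetime, timedelta


def looks_like_reboot(previous_uptime: timedelta, current_uptime: timedelta) -> bool:
    """Uptime only grows while a device stays up, so a drop implies a restart."""
    return current_uptime < previous_uptime


# Values taken from the "Device rebooted after ... -> 43s" event.
previous = timedelta(days=357, hours=19, minutes=37, seconds=2)
current = timedelta(seconds=43)

# Timestamps of the icmp Down and Up events.
fmt = "%Y-%m-%d %H:%M:%S"
down = datetime.strptime("2020-02-13 15:37:30", fmt)
up = datetime.strptime("2020-02-13 15:47:30", fmt)
outage = up - down

print(looks_like_reboot(previous, current))
print(outage)
```

A pure polling glitch would normally leave the uptime counter untouched, so the counter reset combined with the gap between the Down and Up events points to a real restart of the PDU's management interface.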
Ok, it was on firmware
Sentry Switched CDU Version 7.1b
and is now upgraded to firmware
Sentry Switched CDU Version 7.1d
This caused the PDU interface to reboot a second time, but does NOT affect power outlets.
This does NOT clear this PDU from being suspect. As this rack is one of our two network racks, it is likely due for upgrade/refresh ahead of other racks.
Mentioned in SAL (#wikimedia-operations) [2020-02-13T16:32:44Z] <robh> ps1-a8-codfw.mgmt.codfw.wmnet firmware upgraded via T245164
So this came back after my firmware update. I logged in, confirmed that the firmware had updated, and logged out. Then Arzhel pointed out it wasn't showing as online in LibreNMS, so I tried to log in a second time, and it doesn't work.
I'm not sure what is up with this PDU. We may want to look at replacing it. The next troubleshooting step could be to do the following:
Please note this work is taking place in a networking rack, and thus needs to have its maint window cleared with @ayounsi!
In parallel, we may want to replace this entirely.
Mentioned in SAL (#wikimedia-operations) [2020-02-18T17:00:37Z] <papaul> restting ps1-a8-codfw see T245164
@wiki_willy ps1-a8 is not stable enough to stay in production. I noticed a clicking noise coming from the PDU, and the readings are not stable; they keep flapping.
option1: buy a replacement
option2: I have one old one in storage that I can replace it with
Please advise.
thanks
@Papaul - if the spare one in storage is the same one, I think we can try replacing it with that first. Thanks, Willy
I opened a request ticket (TICKET NO.1578279) with CY1 to assist me with unplugging the old PDU and plugging in the new one tomorrow, the 19th, at 10:30 Dallas time.