
hw troubleshooting: SSH failure for wdqs2001.mgmt.codfw.wmnet
Closed, Resolved · Public · Request

Description

  • FQDN: wdqs2001.mgmt.codfw.wmnet
  • Machine de-pooled, begin work whenever
  • Put system into a failed state in Netbox.
  • Urgency: High-Medium (the underlying host is working fine, but we lack access to the mgmt port in case anything goes wrong)
  • Issue: SSH alert is flapping for the mgmt console specifically: wdqs2001.mgmt/SSH is CRITICAL
  • Assign correct project tag and appropriate owner (based on above). Also, please ensure the service owners of the host(s) are added as subscribers to provide any additional input.

ryankemper@wdqs2001:~$ sudo ipmi-sel
ID  | Date        | Time     | Name             | Type                     | Event
1   | Aug-15-2016 | 20:52:14 | SEL              | Event Logging Disabled   | Log Area Reset/Cleared
2   | Aug-22-2016 | 10:11:18 | PS Redundancy    | Power Supply             | Fully Redundant
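
The SEL excerpt above contains nothing newer than August 2016. As a rough sketch, assuming the pipe-delimited layout shown in that output, the snippet below parses the text into records and picks the most recent entry. `SAMPLE` and `parse_sel` are hypothetical names used only for illustration; they are not part of ipmi-sel or any existing tooling, and real input would come from running `sudo ipmi-sel` on the host.

```python
from datetime import datetime

# Sample copied from the ticket; in practice this text would be the
# captured output of `sudo ipmi-sel` (assumption: same column layout).
SAMPLE = """\
ID  | Date        | Time     | Name             | Type                     | Event
1   | Aug-15-2016 | 20:52:14 | SEL              | Event Logging Disabled   | Log Area Reset/Cleared
2   | Aug-22-2016 | 10:11:18 | PS Redundancy    | Power Supply             | Fully Redundant
"""


def parse_sel(text):
    """Parse pipe-delimited SEL output into a list of dicts (hypothetical helper)."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    header = [h.strip() for h in lines[0].split("|")]
    events = []
    for ln in lines[1:]:
        fields = [f.strip() for f in ln.split("|")]
        if len(fields) != len(header):
            continue  # skip lines that do not match the header layout
        row = dict(zip(header, fields))
        # Combine the Date and Time columns into one timestamp for sorting.
        row["When"] = datetime.strptime(
            f"{row['Date']} {row['Time']}", "%b-%d-%Y %H:%M:%S"
        )
        events.append(row)
    return events


events = parse_sel(SAMPLE)
latest = max(events, key=lambda e: e["When"])
print(latest["When"].isoformat(), latest["Name"], latest["Event"])
```

Comparing the timestamp of the latest entry against the time the alert started flapping is one quick way to tell whether the SEL recorded anything related to the mgmt SSH failure.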

Event Timeline

Mentioned in SAL (#wikimedia-operations) [2021-06-17T18:24:08Z] <ryankemper> T285106 [WDQS] ryankemper@wdqs2001:~$ sudo depool

Upgraded BIOS and iDRAC; SSH is working again.