dns5002 mgmt console unreachable
Closed, Resolved (Public)

Description

I've brought up bast5001 and dns5001 successfully so far, but I can't even get a remote console on dns5002. I've checked from mr1 and the management IPs of all the rest of the hosts are at least pingable (so I presume console will be reachable), but the one for dns5002 (10.132.129.9) isn't even pingable. Ideas on how to move forward with fixing this one? It probably doesn't block bringing up the others for now, since it's a redundant pair with dns5001, but we'd probably rather not go live with it MIA, either.
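For reference, a minimal sketch of the kind of reachability sweep described above. It assumes a Linux host with the iputils `ping` (the `-c` and `-W` flags), and the function names `is_pingable` and `unreachable` are made up for illustration. Only the dns5002 management IP comes from this task; any other entries in the host map would have to be filled in from real records.

```python
import subprocess


def is_pingable(ip, runner=subprocess.run):
    """Return True if a single ICMP echo request to `ip` gets a reply.

    `runner` is injectable so the check can be exercised without a network.
    Assumes Linux iputils ping: -c 1 sends one probe, -W 2 waits up to 2 s.
    """
    result = runner(
        ["ping", "-c", "1", "-W", "2", ip],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def unreachable(hosts, check=is_pingable):
    """Return the names of hosts whose management IP did not answer.

    `hosts` maps a host name to its management IP.
    """
    return [name for name, ip in hosts.items() if not check(ip)]
```

Calling `unreachable({"dns5002": "10.132.129.9"})` from a machine with a route to the management network would return `["dns5002"]` while the interface is down and an empty list once it answers again. A failed ping only says the IP is unreachable; it does not distinguish an unplugged cable from a locked-up idrac.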

Event Timeline

BBlack triaged this task as High priority. Feb 9 2018, 3:56 PM
BBlack created this task.

Draft email with directions for Equinix Singapore smarthands:

Support,

We're unable to access one of our systems remotely, named dns5002, in rack 06:040020:0604, U 29. This means that either the cable has come unplugged, or the idrac interface has locked up.

Please check the following:

  1. Ensure that the green patch cable labeled 1028 is plugged into the netgear switch in the top of the rack. This switch should be labeled msw2. The other end of this cable should plug into the dedicated idrac port on dns5002. If it is plugged in, but there is no link light, then the idrac interface is likely locked up. Please attempt step 2.
  2. We require smarthands to fully power reset (via removal of power cords) one of our servers, labeled dns5002, in rack 06:040020:0604, U 29. The idrac interface has locked up. Please remove both power cables to fully de-energize the host for 30 seconds, and then plug them back in. This should reset the idrac interface and restore our remote access. Please wait about 3 minutes for the system to power the idrac interface up, and you should then see a link light on the idrac enterprise port on the back of dns5002.
  3. If no link light appears, please try moving the cable to a different port on the netgear switch. If that still doesn't work, please replace the cable labeled 1028 with another green patch cable (there should be some on top of our servers in our racks; if no green is available, use blue and please let us know which is used so we can update our records). If the new cable doesn't have a # on it, please use one of the spare # labels left in our racks. If there are no spare labels, please apply a new one with a label printer, and use # 1200.

Please send updates on this smarthands request to rhalsell@wikimedia.org, bblack@wikimedia.org, and ayounsi@wikimedia.org.

That seems to cover what we need, right? If it fails after ALL of that, then it means the idrac is really messed up. It should show a link light even without an IP being set up, so I'm asking that they check that specifically as well. If we get a link light but cannot ping, then we'll have to follow up with directions on how to program the idrac with a temp password.

Sounds about right to me. But let's do the other two in T187158 and T187157 as well and maybe get more value out of the time. cp5006 and cp5010 both have "working" management consoles, but one needs a hard power reset to fix host power-control issues, and the other needs its primary ethernet (to asw1) checked out (possible SFP re-seat or faulty SFP).

BBlack assigned this task to Papaul.
BBlack added a subscriber: Papaul.

@Papaul re-seated mgmt console cable, seems to be working now