
mw2259 down and mgmt does not exist?
Closed, Resolved, Public

Description

mw2259 was reimaged as part of T239054.

After the OS reinstall, the reimaging script rebooted the host, and it never came back from that reboot.

19:47:27 | mw2259.codfw.wmnet | Rebooted host
20:38:13 | mw2259.codfw.wmnet | Still waiting for reboot after 50.0 minutes

I tried to SSH to the mgmt interface to see what's up and noticed I can't reach that either. It just times out.

ssh root@mw2259.mgmt.codfw.wmnet
...
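
For anyone who wants to repeat the check without waiting on an interactive SSH timeout, here is a minimal sketch that just tests whether a TCP connection to the SSH port succeeds. The helper name `is_tcp_reachable`, the port (22), and the 5-second timeout are assumptions for illustration; this is not part of the reimaging script or any existing tooling.

```python
import socket


def is_tcp_reachable(host: str, port: int = 22, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper: it only checks that the port accepts connections,
    not that SSH login would actually work.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections, and DNS resolution failures.
        return False
```

Calling it for both `mw2259.codfw.wmnet` and `mw2259.mgmt.codfw.wmnet` would show whether the host, the mgmt interface, or both are unreachable. A `False` result does not say why; a timeout and a failed DNS lookup look the same here.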

It's down in Icinga:

https://phabricator.wikimedia.org/T239758

but there is no mw2259.mgmt in Icinga, as opposed to other appservers like mw2247. Why is that?

Event Timeline

Dzahn added a subscriber: Papaul.

@Papaul Could you take a look at this onsite?

Dzahn triaged this task as Medium priority. Dec 3 2019, 9:36 PM

Also: I noticed that there is no "mw2259.mgmt" in Icinga (while, for example, mw2247.mgmt exists). It's simply not there:

https://icinga.wikimedia.org/cgi-bin/icinga/status.cgi?search_string=mw2259

vs.

https://icinga.wikimedia.org/cgi-bin/icinga/status.cgi?search_string=mw2247

Why is this?

Dzahn renamed this task from "mw2259 and mgmt down" to "mw2259 down and mgmt does not exist?". Dec 3 2019, 10:17 PM

Reset the iDRAC; the server is back up.

Thank you very much @Papaul.

I can confirm the server is reachable again and everything looks fine in Icinga.

Also the mgmt interface is in Icinga again.

I just repooled the server; it's a jobrunner and is back in production now.