
Various instances unresponsive in "ACTIVE" (previously: "ERROR") state
Closed, Resolved, Public

Description

The instances:

tools-exec-03
tools-exec-09
tools-submit
tools-webgrid-04
tools-webgrid-tomcat
tools-webproxy
tools-webproxy-01
tools-webproxy-02

were in state "ERROR (suspending)" at https://wikitech.wikimedia.org/wiki/Special:NovaInstance; tools-exec-cyberbot and tools-exec-07 were in "ERROR (ok)".

Now they show up as "ACTIVE", but ssh still does not work:

[tim@passepartout ~]$ ssh tools-submit.eqiad.wmflabs
ssh_exchange_identification: Connection closed by remote host
[tim@passepartout ~]$
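
The ssh error above means the TCP connection was accepted but closed before the server sent its identification string. A minimal sketch for running the same check across all affected instances follows. It assumes the other instances use the same .eqiad.wmflabs suffix as tools-submit, that port 22 is reached directly rather than through a bastion, and the names probe_ssh and report are hypothetical helpers, not part of any existing tooling.

```python
import socket

# Instance names copied from the task description.
INSTANCES = [
    "tools-exec-03", "tools-exec-09", "tools-submit", "tools-webgrid-04",
    "tools-webgrid-tomcat", "tools-webproxy", "tools-webproxy-01",
    "tools-webproxy-02", "tools-exec-cyberbot", "tools-exec-07",
]


def probe_ssh(host, port=22, timeout=5.0):
    """Classify one host as 'ok', 'no-banner', or 'unreachable'."""
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        # DNS failure, refused connection, or connect timeout.
        return "unreachable"
    with sock:
        try:
            banner = sock.recv(256)
        except OSError:
            # Reset or timeout while waiting for the identification string.
            return "no-banner"
    # A healthy SSH server sends a line starting with "SSH-" first.
    return "ok" if banner.startswith(b"SSH-") else "no-banner"


def report(names=INSTANCES, suffix=".eqiad.wmflabs"):
    """Print one status line per instance (suffix is an assumption)."""
    for name in names:
        fqdn = name + suffix
        print(f"{fqdn}: {probe_ssh(fqdn)}")
```

Calling report() prints one line per instance. A "no-banner" result corresponds to the "Connection closed by remote host" symptom shown above, while "unreachable" would point to a name resolution or network-level failure instead.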

The instances seem to reside on different virtual nodes.

Event Timeline

scfc raised the priority of this task to Unbreak Now!.
scfc updated the task description.
scfc added subscribers: scfc, yuvipanda, coren, Andrew.
coren claimed this task.

Known, immediately signaled on IRC, and being fixed. One of the hosts got ill.