
No file system on toollabs, unable to login, web service broken
Closed, Resolved · Public

Description

multichill@tools-bastion-01:~/queries/wikidata$ ls
(nothing happens)

ssh tools-login.wmflabs.org

(just times out)

The web service on http://tools.wmflabs.org/ is also broken. It returns a 500 Internal Server Error (currently replaced by an "Our servers are currently experiencing a technical problem." placeholder).

Event Timeline

Multichill raised the priority of this task from to Unbreak Now!.
Multichill updated the task description.
Multichill added a project: Toolforge.
Multichill subscribed.
Restricted Application added a subscriber: Aklapper.

The NFS server is down due to some kernel issues; we're working on it.

The last successful backup of tools is from 2015-08-30T01:59:35.787Z, so at least we have a very recent backup.

Multichill renamed this task from "No file system on toollabs, unable to login" to "No file system on toollabs, unable to login, web service broken". Aug 30 2015, 11:00 AM
Multichill updated the task description.

I wonder why you don't have a redundant NFS server system yet.

Romaine rescinded a token.
Romaine awarded a token.

The NFS server and tool labs are back online.

Tool operators are getting a lot of failed-job emails, all because of the outage. These emails should all have a timestamp that falls within the outage period.

valhallasw claimed this task.

The initial issue was resolved Sunday afternoon (CEST), but I forgot to close the task at that point.