
Beta Cluster is down due to restart of WMF Labs servers
Closed, Resolved, Public

Description

All of the Beta Cluster is unavailable (along with much of Toollabs).

This was caused by the restart of all WMF Labs servers, which was prompted by the libc security vulnerability that was disclosed today.

Event Timeline

Ryasmeen set the priority of this task to Needs Triage.
Ryasmeen updated the task description.
Ryasmeen subscribed.
greg renamed this task from "Betalabs is down" to "Beta Cluster is down due to restart of WMF Labs servers". Jan 27 2015, 9:05 PM
greg set Security to None.
greg added subscribers: greg, Andrew, yuvipanda.

@Andrew / @yuvipanda: I assume this was caused by the restart of Labs infra following the security release this morning, yes?

greg triaged this task as Unbreak Now! priority. Jan 27 2015, 9:17 PM
greg moved this task from To Triage to In-progress on the Beta-Cluster-Infrastructure board.
greg added subscribers: Reedy, mmodell.

@Reedy is looking into this now (discussion in #wikimedia-releng)

@mmodell: if you come to the big room and sit next to reedy (on the left of the stage) we might be able to get this back up before we leave the venue.

It's likely because nginx isn't running and won't start:

root@deployment-cache-text02:/home/reedy# service nginx status
 * nginx is not running
Starting nginx: nginx: [emerg] SSL_CTX_use_PrivateKey_file("/etc/ssl/private/star.wmflabs.org.key") failed (SSL: error:0B080074:x509 certificate routines:X509_check_private_key:key values mismatch)
nginx: configuration file /etc/nginx/nginx.conf test failed
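
The `key values mismatch` error above means the private key nginx tried to load does not belong to the certificate it was paired with. A minimal sketch for checking a pair before restarting nginx, using Python's standard `ssl` module. The helper name `key_matches_cert` is hypothetical, and the certificate path is not given in this thread, so it has to be supplied by whoever runs the check.

```python
import ssl


def key_matches_cert(cert_path, key_path):
    """Return (ok, reason) for a PEM certificate/private-key pair.

    Hypothetical helper for illustration; not part of nginx or of the
    Beta Cluster tooling.
    """
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    try:
        # load_cert_chain raises ssl.SSLError when the key does not match
        # the certificate, and a plain OSError when a file cannot be read.
        ctx.load_cert_chain(certfile=cert_path, keyfile=key_path)
    except ssl.SSLError as exc:
        # ssl.SSLError is a subclass of OSError, so it is handled first.
        return False, f"TLS error: {exc}"
    except OSError as exc:
        return False, f"cannot read files: {exc}"
    return True, "certificate and key match"
```

Calling it with the certificate path and `/etc/ssl/private/star.wmflabs.org.key` should return `(False, ...)` while the mismatch persists. Note that an encrypted key would make `load_cert_chain` prompt for a passphrase; this sketch assumes an unencrypted key.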

Getting this error now:
(Cannot access the database: Can't connect to MySQL server on '10.68.16.193' (111) (10.68.16.193))
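
The `(111)` in this message is the Linux errno for "connection refused": the host answered, but nothing was accepting connections on the MySQL port. A small sketch for probing that from another instance, assuming MySQL listens on its default port 3306 (the thread does not state the port). The function name `port_is_open` is hypothetical.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        # create_connection raises OSError on refusal, timeout,
        # or an unreachable host.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, `port_is_open("10.68.16.193", 3306)` would return `False` while the database server is still down or refusing connections.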

Things seem to be working again, no?

Thanks @Dzahn, @chad, @Reedy, etc. Fun day with yet another exploit announced during a group meeting.

It's not working yet, though. When I try to load VE, I still get a different kind of error:
Error loading data from server:0: parsoidserver-http:HTTP:0, would you like to retry
Should I raise it as a separate issue?

@Ryasmeen the parsoid cache machine was on virt1009, which was down. It is back up now; can you check whether it works?

@yuvipanda: Yup, it is loading now. Thanks so much!

Restricted Application added subscribers: Jay8g, TerraCodes.