
Beta-Cluster shows randomly white pages
Closed, Resolved · Public

Description

When you load a page on beta, there is currently a chance that it comes back as a white page. Reloading helps, but this is very annoying, and if the cause is the code rather than beta's architecture, it is a big problem.



An example white page. This happens on AbuseFilter pages, for example, and on other pages too.

Event Timeline

Restricted Application added subscribers: TerraCodes, Aklapper. Apr 20 2016, 1:43 PM
Luke081515 triaged this task as High priority. Apr 20 2016, 1:43 PM
Luke081515 raised the priority of this task from High to Unbreak Now!. Apr 20 2016, 1:47 PM

This happens very often: roughly 2 out of 3 pages you load are white. You can't do anything there without reloading pages repeatedly.

Restricted Application added a subscriber: Urbanecm. Apr 20 2016, 1:47 PM
Luke081515 updated the task description. Apr 20 2016, 1:54 PM
Se4598 added a subscriber: Se4598. Apr 20 2016, 2:27 PM

A quick test, repeating the request several times, suggests that deployment-mediawiki01.deployment-prep.eqiad.wmflabs returns a blank page for Special:Random, while deployment-mediawiki02.deployment-prep.eqiad.wmflabs redirects correctly.

Request:
GET /wiki/Spezial:Zuf%C3%A4llige_Seite HTTP/1.1
Host: de.wikipedia.beta.wmflabs.org
[...]

Response: Expected 302 Found, but got
HTTP/1.1 200 OK
Server: deployment-mediawiki01.deployment-prep.eqiad.wmflabs
X-Powered-By: HHVM/3.12.1
x-content-type-options: nosniff
P3P: CP="This is not a P3P policy! See http://de.wikipedia.beta.wmflabs.org/wiki/Spezial:CentralAutoLogin/P3P for more info."
Content-Encoding: gzip
Vary: Accept-Encoding
backend-timing: D=34536 t=1461161837417512
Content-Type: text/html
X-Varnish: 578986392, 2095227971
Via: 1.1 varnish, 1.1 varnish
Transfer-Encoding: chunked
Date: Wed, 20 Apr 2016 14:17:17 GMT
Age: 0
Connection: keep-alive
X-Cache: deployment-cache-text04 miss+chfp(0), deployment-cache-text04 frontend miss+chfp(0)
Set-Cookie: WMF-Last-Access=20-Apr-2016;Path=/;HttpOnly;secure;Expires=Sun, 22 May 2016 12:00:00 GMT
x-client-ip: XXX
Cache-Control: private, s-maxage=0, max-age=0, must-revalidate
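
The repeated-request test described above could be scripted roughly as follows. This is only a sketch, not the procedure actually used here: `probe_once` and `tally` are hypothetical helper names, and it assumes the site answers over plain HTTP as in the request shown and that the `Server` response header names the backend. It uses `http.client` because that module does not follow redirects, so a healthy 302 stays visible.

```python
import http.client
from collections import Counter


def probe_once(host, path, timeout=10):
    """Send one GET without following redirects.

    Returns (server_header, status_code, body_length).
    """
    conn = http.client.HTTPConnection(host, timeout=timeout)
    try:
        conn.request("GET", path)
        resp = conn.getresponse()
        body = resp.read()
        return resp.getheader("Server", "unknown"), resp.status, len(body)
    finally:
        conn.close()


def tally(results):
    """Count how often each (server, status) pair occurs in the probe results."""
    counts = Counter()
    for server, status, _length in results:
        counts[(server, status)] += 1
    return counts


# Example use (performs real network requests, so it is left commented out):
# host = "de.wikipedia.beta.wmflabs.org"
# path = "/wiki/Spezial:Zuf%C3%A4llige_Seite"
# results = [probe_once(host, path) for _ in range(20)]
# for (server, status), n in sorted(tally(results).items()):
#     print(f"{server}: HTTP {status} x{n}")
```

A backend that consistently shows 200 where the others show 302 would be the likely culprit.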

Krenair claimed this task. Apr 20 2016, 2:55 PM
Krenair added a subscriber: Krenair.

-mediawiki01 and -mediawiki03 broke around 9AM BST

Krenair closed this task as Resolved. Apr 20 2016, 2:57 PM
<shinken-wm> RECOVERY - App Server Main HTTP Response on deployment-mediawiki01 is OK: HTTP OK: HTTP/1.1 200 OK - 40318 bytes in 2.056 second response time
<shinken-wm> RECOVERY - App Server Main HTTP Response on deployment-mediawiki03 is OK: HTTP OK: HTTP/1.1 200 OK - 40326 bytes in 2.500 second response time

(I restarted HHVM)