
graphite.wikimedia.org 503s on some css/js resources
Closed, Resolved · Public

Description

Reported on IRC: graphite.wikimedia.org returns 503 for some resources, JavaScript in particular, so the "graphite composer" at https://graphite.wikimedia.org doesn't load:

e.g.
https://graphite.wikimedia.org/content/js/ext/ext-all.js

* Hostname was NOT found in DNS cache
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0*   Trying 91.198.174.217...
*   Trying 2620:0:862:ed1a::3:d...
* Immediate connect fail for 2620:0:862:ed1a::3:d: Network is unreachable
* Connected to graphite.wikimedia.org (91.198.174.217) port 443 (#0)
* successfully set certificate verify locations:
*   CAfile: none
  CApath: /etc/ssl/certs
* SSLv3, TLS handshake, Client hello (1):
} [data not shown]
* SSLv3, TLS handshake, Server hello (2):
{ [data not shown]
* SSLv3, TLS handshake, CERT (11):
{ [data not shown]
* SSLv3, TLS handshake, Server key exchange (12):
{ [data not shown]
* SSLv3, TLS handshake, Server finished (14):
{ [data not shown]
* SSLv3, TLS handshake, Client key exchange (16):
} [data not shown]
* SSLv3, TLS change cipher, Client hello (1):
} [data not shown]
* SSLv3, TLS handshake, Finished (20):
} [data not shown]
* SSLv3, TLS change cipher, Client hello (1):
{ [data not shown]
* SSLv3, TLS handshake, Finished (20):
{ [data not shown]
* SSL connection using TLSv1.2 / ECDHE-ECDSA-AES128-GCM-SHA256
* Server certificate:
* 	 subject: C=US; ST=California; L=San Francisco; O=Wikimedia Foundation, Inc.; CN=*.wikipedia.org
* 	 start date: 2015-12-10 23:22:05 GMT
* 	 expire date: 2016-12-10 22:46:04 GMT
* 	 subjectAltName: graphite.wikimedia.org matched
* 	 issuer: C=BE; O=GlobalSign nv-sa; CN=GlobalSign Organization Validation CA - SHA256 - G2
* 	 SSL certificate verify ok.
> GET /content/js/ext/ext-all.js HTTP/1.1
> User-Agent: curl/7.38.0
> Accept: */*
> Host: graphite.wikimedia.org
> 
< HTTP/1.1 503 Backend fetch failed
< Date: Tue, 17 May 2016 15:28:49 GMT
< Content-Type: text/html; charset=utf-8
< Content-Length: 1622
< Connection: keep-alive
* Server Varnish is not blacklisted
< Server: Varnish
< X-Varnish: 1244569, 52460309, 37487048
< Age: 0
< Via: 1.1 varnish-v4, 1.1 varnish-v4, 1.1 varnish-v4
< Vary: Accept-Encoding
< Age: 0
< Age: 0
< X-Cache: cp1058 miss, cp3008 miss, cp3008 miss
< Strict-Transport-Security: max-age=31536000
< Set-Cookie: WMF-Last-Access=17-May-2016;Path=/;HttpOnly;secure;Expires=Sat, 18 Jun 2016 12:00:00 GMT
< X-Analytics: https=1;nocookies=1
< 
{ [data not shown]
100  1622  100  1622    0     0   2875      0 --:--:-- --:--:-- --:--:--  2870
* Connection #0 to host graphite.wikimedia.org left intact

From inside the network and from graphite1001 I'm getting 200s for the same resource (an Authorization: header is required, though).
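For anyone triaging similar reports, the X-Cache header in the response above can be read mechanically. Here is a minimal sketch, assuming the header is a comma-separated list of "host result" entries as in the single sample in this task. The helper names `parse_x_cache` and `all_missed` are made up for illustration and are not part of any existing tool.

```python
def parse_x_cache(value):
    """Split an X-Cache header value into (cache_host, result) pairs.

    Assumes the "host result, host result, ..." layout seen in this task;
    entries that do not have at least two tokens are skipped.
    """
    hops = []
    for entry in value.split(","):
        parts = entry.split()
        if len(parts) >= 2:
            hops.append((parts[0], parts[1]))
    return hops


def all_missed(value):
    """Return True when the header lists at least one hop and every hop is a miss."""
    hops = parse_x_cache(value)
    return bool(hops) and all(result == "miss" for _, result in hops)


if __name__ == "__main__":
    # Header value copied from the curl output in the description.
    header = "cp1058 miss, cp3008 miss, cp3008 miss"
    print(parse_x_cache(header))
    print(all_missed(header))
```

If every layer reports a miss alongside a "503 Backend fetch failed", the error was probably produced while fetching from the backend rather than served from a cached object, which fits the 200s seen when hitting graphite1001 directly.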

Event Timeline

Restricted Application added subscribers: Zppix, Aklapper. May 17 2016, 3:48 PM

Mentioned in SAL [2016-05-17T16:23:10Z] <godog> disable mod_deflate and restart apache2 on graphite1001 T135515

ema added a subscriber: ema. May 17 2016, 5:55 PM

Change 289254 had a related patch set uploaded (by Ema):
4.1.2-1wm6: update 0005-handle-eof-http1.1.patch

https://gerrit.wikimedia.org/r/289254

Change 289254 merged by Ema:
4.1.2-1wm6: update 0005-handle-eof-http1.1.patch

https://gerrit.wikimedia.org/r/289254

BBlack triaged this task as Normal priority. May 17 2016, 8:08 PM
BBlack added a subscriber: BBlack.

Misc-cluster updated to 4.1.2-1wm6; the patch above seems to fix this (along with the mod_deflate disable). We may need further commits to make the mod_deflate disable permanent on graphite, and maybe look around to see if we have similar issues elsewhere. (Can we kill mod_deflate universally? I don't think we ever actually want deflate compression, do we?)

fgiunchedi closed this task as Resolved. May 18 2016, 8:36 AM
fgiunchedi claimed this task.
fgiunchedi added a subscriber: elukey.

Confirmed this is fixed on the graphite side, thanks @BBlack @ema!

Re: mod_deflate, @elukey mentioned that, despite the name, mod_deflate on Apache 2.4 handles gzip, while deflate isn't supported (cf. http://httpd.apache.org/docs/current/mod/mod_deflate.html). In any case, the issue we were seeing (responses from Apache were binary data, without even headers) might be related to mod_deflate + mod_wsgi and static assets; otherwise I suspect we'd be seeing it more often, given how many Apache instances we have deployed. I've reported it as T135595.
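To make the "binary data, not even with headers" symptom easier to spot in future captures, here is a rough sketch of a check on the first bytes read from a backend socket. It assumes the stray binary payload is a gzip stream (starting with the gzip magic bytes 0x1f 0x8b), which this task does not confirm. The function name `classify_raw_response` and its return labels are hypothetical.

```python
import gzip

# First two bytes of every gzip stream (RFC 1952).
GZIP_MAGIC = b"\x1f\x8b"


def classify_raw_response(raw: bytes) -> str:
    """Label the start of a raw backend response.

    "http"      -> begins with an HTTP status line, as expected
    "bare-gzip" -> begins with gzip magic bytes, i.e. a body with no headers
                   (assumption: the binary data seen here was gzip)
    "unknown"   -> anything else, including an empty read
    """
    if raw.startswith(b"HTTP/"):
        return "http"
    if raw.startswith(GZIP_MAGIC):
        return "bare-gzip"
    return "unknown"


if __name__ == "__main__":
    well_formed = b"HTTP/1.1 200 OK\r\nContent-Encoding: gzip\r\n\r\n"
    headerless = gzip.compress(b"var Ext = {};")
    print(classify_raw_response(well_formed))
    print(classify_raw_response(headerless))
```

A check like this only looks at the leading bytes, so it would not catch a response whose headers are present but whose body is corrupted.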