Graphoid returns a 400 on MW API time-out
Open, High, Public

Description

When Graphoid receives a time-out from the MW API, it returns a 400 (Bad Request) error, whereas it should return a 504 (Gateway Timeout). The request and the response headers:

mobrovac@scb2001:~$ curl -v localhost:19000/en.wikipedia.org/v1/png/User:Pchelolo%2fGraph/0/c1c5d432407bd5fd7435f5edfb6fdb73ffd4bc9e
> GET /en.wikipedia.org/v1/png/User:Pchelolo%2fGraph/0/c1c5d432407bd5fd7435f5edfb6fdb73ffd4bc9e HTTP/1.1
> User-Agent: curl/7.38.0
> Host: localhost:19000
> Accept: */*
> 
< HTTP/1.1 400 Bad Request
< access-control-allow-origin: *
< access-control-allow-headers: accept, x-requested-with, content-type
< access-control-expose-headers: etag
< x-xss-protection: 1; mode=block
< x-content-type-options: nosniff
< x-frame-options: SAMEORIGIN
< content-security-policy: default-src 'self'; object-src 'none'; media-src *; img-src *; style-src *; frame-ancestors 'self'
< x-content-security-policy: default-src 'self'; object-src 'none'; media-src *; img-src *; style-src *; frame-ancestors 'self'
< x-webkit-csp: default-src 'self'; object-src 'none'; media-src *; img-src *; style-src *; frame-ancestors 'self'
< Cache-Control: public, s-maxage=30, max-age=30
< Content-Type: application/json; charset=utf-8
< Content-Length: 15
< Vary: Accept-Encoding
< Date: Tue, 03 May 2016 09:01:19 GMT
< Connection: keep-alive
< 
"error/unknown"

And the related log line:

{"name":"graphoid","hostname":"scb2001","pid":136,"level":50,"domain":"en.wikipedia.org","format":"png","title":"User:Pchelolo/Graph","revid":"0","id":"c1c5d432407bd5fd7435f5edfb6fdb73ffd4bc9e.png","apicall":{"format":"json","formatversion":"2","action":"graph","title":"User:Pchelolo/Graph","hash":"c1c5d432407bd5fd7435f5edfb6fdb73ffd4bc9e"},"msg":"timeout","levelPath":"error/unknown","request_id":"b172531e-110d-11e6-8772-d7a8e78ed581","time":"2016-05-03T09:02:04.078Z","v":0}
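The expected behaviour can be sketched as a small mapping from the upstream failure to an HTTP status. This is only an illustration in Python, not Graphoid's actual code (Graphoid is a Node.js service). The function name `status_for_upstream_error` and the extra entries in `TIMEOUT_MESSAGES` are assumptions made for the example; the only inputs taken from this report are the `msg` value `timeout` in the log line and the 400 versus 504 status codes.

```python
from http import HTTPStatus

# Hypothetical set of upstream failure messages that mean "the MW API did not
# answer in time". Only "timeout" appears in the log line above; the other
# entries are illustrative assumptions.
TIMEOUT_MESSAGES = {"timeout", "etimedout", "esockettimedout"}


def status_for_upstream_error(msg: str) -> HTTPStatus:
    """Pick the HTTP status to return when the call to the MW API fails."""
    if msg.strip().lower() in TIMEOUT_MESSAGES:
        # The upstream did not respond in time: report a gateway time-out.
        return HTTPStatus.GATEWAY_TIMEOUT
    # Everything else keeps the status the service returns today.
    return HTTPStatus.BAD_REQUEST


if __name__ == "__main__":
    print(int(status_for_upstream_error("timeout")))
```

With a mapping like this, the request shown above would be answered with 504 rather than 400, so caches and clients could tell an upstream time-out apart from a malformed request.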

Event Timeline

mobrovac created this task. May 3 2016, 9:02 AM
Restricted Application added subscribers: Zppix, Aklapper. May 3 2016, 9:02 AM
mobrovac triaged this task as High priority. May 3 2016, 9:11 AM
mobrovac assigned this task to Yurik.
mobrovac added a subscriber: Joe.

@Joe tested the API manually while this problem was occurring and no time-outs happened. Moreover, there was only one instance of Graphoid acting up (on scb2001) and after a short period (10 mins or so), it recovered by itself.

At this point it is not clear what exactly happened, but it warrants more investigation. @Yurik, assigning to you as the respective service owner.

Joe added a comment. May 3 2016, 9:26 AM

This was due to T134241. This is still a valid bug and should be handled ASAP.

Pchelolo moved this task from Backlog to watching on the Services board. Oct 12 2016, 5:27 PM
Pchelolo edited projects, added Services (watching); removed Services.

This is still occurring; e.g. https://en.wikipedia.org/wiki/Template:Graph:PageViews randomly fails:

Request:

:authority: en.wikipedia.org
:method: GET
:path: /api/rest_v1/page/graph/png/Talk%3A%28225088%29_2007_OR10/0/315881f44d0a7453462fd5f5cf357fbf1baf49d8.png
:scheme: https
accept: image/webp,image/apng,image/*,*/*;q=0.8
accept-encoding: gzip, deflate, br
accept-language: en
cache-control: no-cache
cookie: [REDACTED]
dnt: 1
pragma: no-cache
referer: https://en.wikipedia.org/wiki/Talk:(225088)_2007_OR10
user-agent: [REDACTED]

Response:

HTTP/2 400
access-control-allow-headers: accept, content-type, content-length, cache-control, accept-language, api-user-agent, if-match, if-modified-since, if-none-match, dnt, accept-encoding
access-control-allow-methods: GET,HEAD
access-control-allow-origin: *
access-control-expose-headers: etag
age: 0
cache-control: public, s-maxage=300, max-age=300
content-length: 202
content-location: https://en.wikipedia.org/api/rest_v1/page/graph/png/Talk%3A(225088)_2007_OR10/0/315881f44d0a7453462fd5f5cf357fbf1baf49d8.png
content-security-policy: default-src 'none'; frame-ancestors 'none'
content-type: application/problem+json
date: Mon, 09 Jul 2018 08:00:30 GMT
referrer-policy: origin-when-cross-origin
server: restbase1007
status: 400
strict-transport-security: max-age=106384710; includeSubDomains; preload
vary: Accept-Encoding
via: 1.1 varnish (Varnish/5.1), 1.1 varnish (Varnish/5.1), 1.1 varnish (Varnish/5.1), 1.1 varnish (Varnish/5.1)
x-analytics: WMF-Last-Access=09-Jul-2018;WMF-Last-Access-Global=09-Jul-2018;https=1
x-cache: cp1067 pass, cp2001 pass, cp5008 pass, cp5007 pass
x-cache-status: pass
x-client-ip: [REDACTED]
x-content-security-policy: default-src 'none'; frame-ancestors 'none'
x-content-type-options: nosniff
x-frame-options: SAMEORIGIN
x-request-id: 260cd182-834e-11e8-b598-c15f400affe3
x-varnish: 721263148, 804224464, 984740018, 927102236
x-webkit-csp: default-src 'none'; frame-ancestors 'none'
x-xss-protection: 1; mode=block

{"type":"https://mediawiki.org/wiki/HyperSwitch/errors/unknown_error","method":"get","uri":"/en.wikipedia.org/v1/page/graph/png/Talk%3A(225088)_2007_OR10/0/315881f44d0a7453462fd5f5cf357fbf1baf49d8.png"}
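To spot responses like the one above in bulk, the problem document can be parsed and checked for a client-error status paired with an `unknown_error` type. The following is a rough sketch in Python; the helper names `error_kind` and `looks_misclassified` are hypothetical, and the heuristic itself (a 4xx carrying `unknown_error` probably hides a server-side failure) is an assumption drawn from this report, not documented service behaviour.

```python
import json
from urllib.parse import urlparse


def error_kind(problem_body: str) -> str:
    """Return the last path segment of the problem document's "type" URL."""
    doc = json.loads(problem_body)
    return urlparse(doc.get("type", "")).path.rsplit("/", 1)[-1]


def looks_misclassified(status: int, problem_body: str) -> bool:
    """Heuristic (an assumption for this sketch): a 4xx status combined with
    an unknown_error type suggests a server-side failure reported as a
    client error."""
    return 400 <= status < 500 and error_kind(problem_body) == "unknown_error"


if __name__ == "__main__":
    body = (
        '{"type":"https://mediawiki.org/wiki/HyperSwitch/errors/unknown_error",'
        '"method":"get","uri":"/example"}'
    )
    print(error_kind(body), looks_misclassified(400, body))
```

The check only inspects the status code and the response body, so it can be run against saved responses without contacting the service.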