tool-labs error pages HTTP/400 for POSTs
Closed, Resolved, Public

Description

Example:
https://tools.wmflabs.org/gerrit-reviewer-bot/trigger400.php

<form method="POST"><textarea name="boo">fdsfdsafdsafda</textarea><input type=submit></form>
<?php
if ($_SERVER['REQUEST_METHOD'] == "POST") {
    header('HTTP/1.1 500 Internal Server Error');
}

The HTTP/400 is due to

2016-01-08 20:19:30: (request.c.1108) GET/HEAD with content-length -> 400

Event Timeline

valhallasw raised the priority of this task to Needs Triage.
valhallasw updated the task description.
valhallasw added a project: Toolforge.
valhallasw added a subscriber: valhallasw.
Restricted Application added subscribers: StudiesWorld, Aklapper.

The error is caused by the proxy:

scfc@tools-bastion-01:~$ curl -iH 'Host: tools.wmflabs.org' -d boo=fdsfdsafdsafda https://tools.wmflabs.org/gerrit-reviewer-bot/trigger400.php
HTTP/1.1 500 Internal Server Error
Server: nginx/1.9.4
Date: Fri, 08 Jan 2016 21:30:13 GMT
Content-Type: text/html
Content-Length: 349
Connection: close

<?xml version="1.0" encoding="iso-8859-1"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
         "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
 <head>
  <title>400 - Bad Request</title>
 </head>
 <body>
  <h1>400 - Bad Request</h1>
 </body>
</html>
scfc@tools-bastion-01:~$ curl -iH 'Host: tools.wmflabs.org' -d boo=fdsfdsafdsafda http://tools-webgrid-lighttpd-1402.tools.eqiad.wmflabs:48879/gerrit-reviewer-bot/trigger400.php 
HTTP/1.1 500 Internal Server Error
X-Powered-By: PHP/5.5.9-1ubuntu4.14
Content-type: text/html
Transfer-Encoding: chunked
Date: Fri, 08 Jan 2016 21:30:38 GMT
Server: lighttpd/1.4.33

<form method="POST"><textarea name="boo">fdsfdsafdsafda</textarea><input type=submit></form>
scfc@tools-bastion-01:~$

(NB: Both yield "HTTP/1.1 500 Internal Server Error"; the content is replaced.)

IIRC, the (unsolved?) problem in the past was that the proxy rewrites error pages by redirecting (?) them to the tools.admin web, which can't cope with those requests:

tools.admin@tools-bastion-01:~$ tail -10000 access.log | fgrep https://tools.wmflabs.org/gerrit-reviewer-bot/trigger400.php; tail -10000 error.log | fgrep 21:21:49
10.68.21.49 - - [08/Jan/2016:20:38:13 +0000] "GET /admin/?500 HTTP/1.1" 400 349 "https://tools.wmflabs.org/gerrit-reviewer-bot/trigger400.php" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:43.0) Gecko/20100101 Firefox/43.0"
10.68.21.49 - - [08/Jan/2016:21:21:49 +0000] "GET /admin/?500 HTTP/1.1" 400 349 "https://tools.wmflabs.org/gerrit-reviewer-bot/trigger400.php" "Mozilla/5.0 (X11; Fedora; Linux x86_64; rv:43.0) Gecko/20100101 Firefox/43.0"
2016-01-08 21:21:49: (request.c.1108) GET/HEAD with content-length -> 400 
2016-01-08 21:21:49: (request.c.1108) GET/HEAD with content-length -> 400 
2016-01-08 21:21:49: (request.c.1108) GET/HEAD with content-length -> 400 
tools.admin@tools-bastion-01:~$

Doh. I had figured that out, but then wrote the task description in a way that completely failed to mention it. Sorry for that :(

Yes, the issue is indeed that the request is re-sent as a GET to /admin/?500 with its headers kept intact (not sure why...). Lighttpd then returns an HTTP/400 because a GET that still carries a Content-Length header is a request it rejects.

With a bit of help from tcpdump:

GET /admin/?500 HTTP/1.1
Connection: close
Host: tools.wmflabs.org
X-Forwarded-Proto: https
X-Original-URI: /gerrit-reviewer-bot/trigger400.php
Content-Length: 18
cache-control: max-age=0
accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
origin: https://tools.wmflabs.org
upgrade-insecure-requests: 1
user-agent: Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/47.0.2526.106 Safari/537.36
content-type: application/x-www-form-urlencoded
dnt: 1
referer: https://tools.wmflabs.org/gerrit-reviewer-bot/trigger400.php
accept-encoding: gzip, deflate
accept-language: en-GB,en;q=0.8,en-US;q=0.6,nl;q=0.4,fr;q=0.2
cookie: ...

boo=fdsfdsafdsafda

So it indeed seems the full headers and body are passed to lighttpd, and the only thing that changes is the request line: POST <something> becomes GET /admin/?500.
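To make the failure mode concrete, here is a minimal Python sketch of the kind of check that the lighttpd log line ("GET/HEAD with content-length -> 400") points to. It is a simplified illustration, not lighttpd's actual code, and the real server may treat edge cases (such as a zero length) differently. The function name `status_for_request` and the dict-based header representation are assumptions made for this example.

```python
def status_for_request(method, headers):
    """Return 400 if a GET/HEAD request carries Content-Length, else None.

    Hypothetical helper mirroring the behaviour suggested by the log line
    "GET/HEAD with content-length -> 400"; not lighttpd's real implementation.
    """
    # Header names are case-insensitive, so normalise before the lookup.
    names = {name.lower() for name in headers}
    if method.upper() in ("GET", "HEAD") and "content-length" in names:
        return 400
    return None


# A subset of the rewritten request seen in the tcpdump: the method and URI
# changed, but the headers describing the POST body were kept.
rewritten = {
    "Host": "tools.wmflabs.org",
    "X-Original-URI": "/gerrit-reviewer-bot/trigger400.php",
    "Content-Length": "18",
    "content-type": "application/x-www-form-urlencoded",
}

print(status_for_request("GET", rewritten))   # rejected by this check
print(status_for_request("POST", rewritten))  # the original POST passes it
```

The same headers are acceptable on the original POST; they only become a problem once the method is switched to GET without the body-related headers being dropped.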

Upstreamed as https://trac.nginx.org/nginx/ticket/876#ticket

Fwiw, uwsgi is less squeamish than lighttpd about incorrect requests. Nginx itself also complains about the request when it receives it.

It might also be possible to work around this on the nginx level by setting proxy_pass_request_headers and proxy_pass_request_body to off for just /admin.

Upstream has recommended just that (and declined the bug :-)).
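As a rough model of what that workaround is meant to achieve, the Python sketch below builds the error-page subrequest without the original body or the headers that describe it. It does not reproduce nginx's behaviour or configuration syntax; `build_error_subrequest` and the `ENTITY_HEADERS` set are hypothetical names chosen for illustration. Note that turning proxy_pass_request_headers off stops all of the original request's header fields from being passed, not just a chosen few; the selective version here only shows which headers matter for this bug.

```python
# Headers that describe a request body; a hypothetical list for this sketch.
ENTITY_HEADERS = {"content-length", "content-type", "transfer-encoding"}


def build_error_subrequest(status, headers):
    """Return (method, uri, headers, body) for the /admin/ error page.

    Hypothetical helper: the body is discarded and body-describing headers
    are removed, so the result is a well-formed GET.
    """
    kept = {k: v for k, v in headers.items() if k.lower() not in ENTITY_HEADERS}
    return "GET", f"/admin/?{status}", kept, b""


# A subset of the headers from the captured request.
original_post_headers = {
    "Host": "tools.wmflabs.org",
    "Content-Length": "18",
    "content-type": "application/x-www-form-urlencoded",
    "referer": "https://tools.wmflabs.org/gerrit-reviewer-bot/trigger400.php",
}

method, uri, headers, body = build_error_subrequest(500, original_post_headers)
print(method, uri)
print(sorted(headers))
```

With the body-related headers gone, the backend sees a plain GET and has no reason to reject it.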

Another idea I have thought about in the past: move (all) the admin web stuff to nginx. There is a Lua MySQL driver for querying the tools database. For the grid status, if we stick to Jessie for the proxies, we could set up a tiny daemon on the grid master that exposes the output of qstat/qhost via HTTP, to be consumed by a Lua script on the proxy (or expose the contents of the tools database via HTTP from there as well, so that we don't have to handle that in Lua).

As we didn't get many patches from outsiders for the admin web despite it being PHP, and knowledge of Lua is sufficiently widespread thanks to the MediaWiki scripts, we wouldn't lose anything. On the other hand, the code flow in the existing web pages is so hard to follow (IMHO) that a rewrite would benefit maintenance.

But, if the problem at hand can be fixed with a one-line patch that sets proxy_pass_request_body … :-)

Yeah, that's definitely a reasonable option, although the 'eat your own dog food' approach of the front page also has its charm. As for this specific issue, we could kill the nginx error page altogether (your suggestion in T103662#1396481), or take the Lua prefix/postfix approach (my suggestion in T103662#1585277).

taavi added a subscriber: taavi.

I believe we currently only show custom error pages on connection errors.

The Cloud-Services project tag is not intended to have any tasks. Please check the list on https://phabricator.wikimedia.org/project/profile/832/ and replace it with a more specific project tag on this task. Thanks!