
X-Wikimedia-Debug header does nothing on Toolforge web services
Closed, Resolved (Public)

Description

According to https://wikitech.wikimedia.org/wiki/Help:Toolforge/Web/Lighttpd#Error_pages:

The proxy provides its own error pages when your application returns HTTP/500, HTTP/502 or HTTP/503. This behavior is currently under review, and might change in the near future.
You can bypass the proxy error pages by passing an X-Wikimedia-Debug header.

This does not work as described. With the webservice for tools.dplbot down, I tried the following:

tools.dplbot@tools-sgebastion-07:~$ curl -L https://tools.wmflabs.org/dplbot/ > curlout.1
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  2384  100  2384    0     0  34341      0 --:--:-- --:--:-- --:--:-- 34550
tools.dplbot@tools-sgebastion-07:~$ curl -H "X-Wikimedia-Debug:1" -L https://tools.wmflabs.org/dplbot/ > curlout.2
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  2384  100  2384    0     0  46331      0 --:--:-- --:--:-- --:--:-- 66222
tools.dplbot@tools-sgebastion-07:~$ diff curlout.1 curlout.2
tools.dplbot@tools-sgebastion-07:~$

Either the documentation is incorrect, or the feature is not working correctly.
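
The same check can be scripted. Below is a minimal Python sketch of the comparison above, assuming only the standard library; the helper names `fetch` and `debug_header_changes_response` are made up for illustration and are not part of any Toolforge tooling.

```python
import urllib.error
import urllib.request


def fetch(url, headers=None):
    """Return (status, body) for a GET request, including HTTP error responses."""
    request = urllib.request.Request(url, headers=headers or {})
    try:
        # urlopen follows redirects by default, similar to curl -L.
        with urllib.request.urlopen(request) as response:
            return response.status, response.read()
    except urllib.error.HTTPError as error:
        # urllib raises on 4xx/5xx; the error object still carries the body.
        return error.code, error.read()


def debug_header_changes_response(url, fetcher=fetch):
    """Return True if sending X-Wikimedia-Debug changes the status or the body."""
    plain = fetcher(url)
    debug = fetcher(url, {"X-Wikimedia-Debug": "1"})
    return plain != debug
```

Given the empty diff above, `debug_header_changes_response("https://tools.wmflabs.org/dplbot/")` would be expected to return False while the webservice is down. The `fetcher` parameter exists only so the comparison can be exercised without network access.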

Event Timeline

Looks like the proxy never sets it anymore, afaict?

tools.yifeibot@tools-sgebastion-08:~/public_html$ cat status.php 
<?php
http_response_code(intval($_GET["status"]));
tools.yifeibot@tools-sgebastion-08:~/public_html$ curl -v https://tools.wmflabs.org/yifeibot/status.php?status=404
*   Trying 172.16.0.43...
* TCP_NODELAY set
* Connected to tools.wmflabs.org (172.16.0.43) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* Cipher selection: ALL:!EXPORT:!EXPORT40:!EXPORT56:!aNULL:!LOW:!RC4:@STRENGTH
* successfully set certificate verify locations:
*   CAfile: /etc/ssl/certs/ca-certificates.crt
  CApath: /etc/ssl/certs
* TLSv1.2 (OUT), TLS header, Certificate Status (22):
* TLSv1.2 (OUT), TLS handshake, Client hello (1):
* TLSv1.2 (IN), TLS handshake, Server hello (2):
* TLSv1.2 (IN), TLS handshake, Certificate (11):
* TLSv1.2 (IN), TLS handshake, Server key exchange (12):
* TLSv1.2 (IN), TLS handshake, Server finished (14):
* TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
* TLSv1.2 (OUT), TLS change cipher, Client hello (1):
* TLSv1.2 (OUT), TLS handshake, Finished (20):
* TLSv1.2 (IN), TLS change cipher, Client hello (1):
* TLSv1.2 (IN), TLS handshake, Finished (20):
* SSL connection using TLSv1.2 / ECDHE-RSA-AES256-GCM-SHA384
* ALPN, server accepted to use h2
* Server certificate:
*  subject: CN=toolforge.org
*  start date: Dec 20 23:00:30 2019 GMT
*  expire date: Mar 19 23:00:30 2020 GMT
*  subjectAltName: host "tools.wmflabs.org" matched cert's "tools.wmflabs.org"
*  issuer: C=US; O=Let's Encrypt; CN=Let's Encrypt Authority X3
*  SSL certificate verify ok.
* Using HTTP2, server supports multi-use
* Connection state changed (HTTP/2 confirmed)
* Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0
* Using Stream ID: 1 (easy handle 0x55ec03c75e90)
> GET /yifeibot/status.php?status=404 HTTP/1.1
> Host: tools.wmflabs.org
> User-Agent: curl/7.52.1
> Accept: */*
> 
* Connection state changed (MAX_CONCURRENT_STREAMS updated)!
< HTTP/2 404 
< server: nginx/1.14.2
< date: Mon, 27 Jan 2020 05:12:47 GMT
< content-type: text/html; charset=UTF-8
< content-length: 0
< 
* Curl_http_done: called premature == 0
* Connection #0 to host tools.wmflabs.org left intact
tools.yifeibot@tools-sgebastion-08:~/public_html$ curl -v https://tools.wmflabs.org/yifeibot/status.php?status=500
*   Trying 172.16.0.43...
[...]
* Using Stream ID: 1 (easy handle 0x559804a97e90)
> GET /yifeibot/status.php?status=500 HTTP/1.1
> Host: tools.wmflabs.org
> User-Agent: curl/7.52.1
> Accept: */*
> 
* Connection state changed (MAX_CONCURRENT_STREAMS updated)!
< HTTP/2 500 
< server: nginx/1.14.2
< date: Mon, 27 Jan 2020 05:13:01 GMT
< content-type: text/html; charset=UTF-8
< content-length: 0
< 
* Curl_http_done: called premature == 0
* Connection #0 to host tools.wmflabs.org left intact
tools.yifeibot@tools-sgebastion-08:~/public_html$ curl -v https://tools.wmflabs.org/yifeibot/status.php?status=502
*   Trying 172.16.0.43...
[...]
* Using Stream ID: 1 (easy handle 0x556ecb35ce90)
> GET /yifeibot/status.php?status=502 HTTP/1.1
> Host: tools.wmflabs.org
> User-Agent: curl/7.52.1
> Accept: */*
> 
* Connection state changed (MAX_CONCURRENT_STREAMS updated)!
< HTTP/2 502 
< server: nginx/1.14.2
< date: Mon, 27 Jan 2020 05:13:03 GMT
< content-type: text/html; charset=UTF-8
< content-length: 0
< 
* Curl_http_done: called premature == 0
* Connection #0 to host tools.wmflabs.org left intact
tools.yifeibot@tools-sgebastion-08:~/public_html$ curl -v https://tools.wmflabs.org/yifeibot/status.php?status=503
*   Trying 172.16.0.43...
[...]
* Using Stream ID: 1 (easy handle 0x563108181e90)
> GET /yifeibot/status.php?status=503 HTTP/1.1
> Host: tools.wmflabs.org
> User-Agent: curl/7.52.1
> Accept: */*
> 
* Connection state changed (MAX_CONCURRENT_STREAMS updated)!
< HTTP/2 503 
< server: nginx/1.14.2
< date: Mon, 27 Jan 2020 05:13:05 GMT
< content-type: text/html; charset=UTF-8
< content-length: 0
< 
* Curl_http_done: called premature == 0
* Connection #0 to host tools.wmflabs.org left intact

Both error page handling (T103662) and this debug setup are under review. This is affected by our current effort to introduce the new Kubernetes cluster into Toolforge, which has a new ingress mechanism and a new error page handling mechanism.
You can read more here: https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Networking_and_ingress
Moreover, when we finally introduce the new domain toolforge.org (something we plan to do this quarter), error handling and debugging options may need to be reworked completely on our side.

Why is that?

The previous setup was "simpler". The frontproxy (dynamicproxy) had a couple of special cases to handle debug headers and error page handling (using the admin tool). With the new domain and new ingress, this is no longer as easy as it used to be. We are no longer relying on the admin tool.

We have this in the frontproxy nginx setup https://github.com/wikimedia/puppet/blob/production/modules/dynamicproxy/templates/urlproxy.conf#L120:

# Redirect bare requests for the homepage to the admin tool
rewrite ^/$ https://$host/admin/ redirect;

location /.well-known/healthz {
    return 200 'proxy ok!';
}

location /.error/ {
    alias /var/www/error/;
    default_type text/html;
}

location /.error/banned/ {
    error_page 403 /.error/banned.html;
    return 403;
}

location /.error/technicalissues/ {
    error_page 503 /.error/errorpage.html;
    return 503;
}

With the new routing scheme (previously tools.wmflabs.org/$toolname, now $toolname.toolforge.org) this config no longer makes sense.
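
To make the routing change concrete, here is a minimal Python sketch of how an old path-based URL would correspond to a new host-based one. The function name `to_new_scheme` and the constants are hypothetical, and the actual redirect rules are not part of this ticket; the sketch only illustrates why path-based locations such as /admin/ or /.error/ on a shared host do not carry over.

```python
from urllib.parse import urlsplit, urlunsplit

OLD_HOST = "tools.wmflabs.org"  # shared host, tool name is the first path segment
NEW_DOMAIN = "toolforge.org"    # each tool gets its own subdomain


def to_new_scheme(url):
    """Map tools.wmflabs.org/<tool>/<path> to <tool>.toolforge.org/<path>."""
    parts = urlsplit(url)
    if parts.hostname != OLD_HOST:
        return url  # not an old-style Toolforge URL; leave untouched
    tool, _, rest = parts.path.lstrip("/").partition("/")
    if not tool:
        return url  # bare homepage has no tool name to map
    new_host = f"{tool}.{NEW_DOMAIN}"
    return urlunsplit((parts.scheme, new_host, "/" + rest, parts.query, parts.fragment))
```

For example, `to_new_scheme("https://tools.wmflabs.org/dplbot/")` gives `https://dplbot.toolforge.org/`.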

Same for debug headers https://github.com/wikimedia/puppet/blob/production/modules/dynamicproxy/templates/urlproxy.conf#L156:

# Let errors go through to the client if using a special debug header
# (same as prod). Set X-Wikimedia-Debug in your request to trigger it.
# See https://wikitech.wikimedia.org/wiki/Debugging_in_production for more info
if ($http_x_wikimedia_debug) {
    #error_page 403 /admin/?403; # bug 64393
    #error_page 404 /admin/?404; # bug 64393
    error_page 500 /admin/?500;
    error_page 502 /admin/?502;
    error_page 503 /admin/?503;
}
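
For reference, the behavior that the wiki page described (not necessarily what the deployed config did) can be modeled as below. This is a rough Python sketch under stated assumptions: the function name `proxy_replaces_response` is hypothetical, the status list comes from the quoted documentation, and the header check mirrors how an nginx `if ($variable)` test treats an empty string or "0" as false.

```python
# Statuses for which the documentation said the proxy serves its own error page.
PROXY_ERROR_STATUSES = {500, 502, 503}


def proxy_replaces_response(status, request_headers):
    """Model of the documented intent: replace 500/502/503 unless debugging."""
    # Header names are case-insensitive, so normalize before the lookup.
    lowered = {name.lower(): value for name, value in request_headers.items()}
    debug_value = lowered.get("x-wikimedia-debug", "")
    # Mirror nginx `if` truthiness: empty string and "0" count as unset.
    debug_requested = debug_value not in ("", "0")
    if debug_requested:
        return False  # documented intent: let the tool's own error through
    return status in PROXY_ERROR_STATUSES
```

Under this model a 503 with `X-Wikimedia-Debug: 1` would reach the client unchanged, which is the behavior the original report expected and did not observe.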

You can follow ongoing work in this other ticket: T234617: Toolforge. introduce new domain toolforge.org

@russblau could you please describe what behavior you would like to see / were expecting?

Honestly I don’t know, except that based on the wiki I expected something different from the proxy’s error page. That is why I suggested the possibility that the problem might be with the documentation.

bd808 claimed this task.
bd808 subscribed.

The documentation was incorrect. The X-Wikimedia-Debug header method never really worked as intended. I have removed mention of it from the wikitech page.