Various pages on Wikitech document that one can locally emulate external requests to an Apache server using the Host-header idiom. This makes sense.
For example, see https://wikitech.wikimedia.org/wiki/Application_servers and https://wikitech.wikimedia.org/wiki/Debugging_in_production#Locally. The examples typically look something like:
    mwdebug1001$ curl -H 'Host: en.wikipedia.org' "http://localhost/"
    ...
    HTTP/1.1 301 Moved Permanently
    Server: mwdebug1001.eqiad.wmnet
    Location: https://en.wikipedia.org/wiki/Main_Page
or
    mwdebug1001$ curl -H 'Host: en.wikipedia.org' "http://localhost/w/load.php"
    ...
    HTTP/1.1 200 OK
    Server: mwdebug1001.eqiad.wmnet
    ..
    ..
    This file is the entry point for ResourceLoader
    ..
However, at the time of writing, this does not work. Instead, virtually any URL I try yields a 404 Not Found:
    mwdebug1002:~$ curl -v -H 'Host: test.wikipedia.org' "http://localhost/w/load.php"
    * Hostname was NOT found in DNS cache
    * Trying ::1...
    * Connected to localhost (::1) port 80 (#0)
    > GET /w/load.php HTTP/1.1
    > User-Agent: curl/7.38.0
    > Accept: */*
    > Host: test.wikipedia.org
    >
    < HTTP/1.1 404 Not Found
    < Date: Mon, 19 Mar 2018 23:43:16 GMT
    * Server mwdebug1002.eqiad.wmnet is not blacklisted
    < Server: mwdebug1002.eqiad.wmnet
    < Content-Length: 327
    < Content-Type: text/html; charset=iso-8859-1
    <
    <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
    <html><head>
    <title>404 Not Found</title>
    </head><body>
    <h1>Not Found</h1>
    <p>The requested URL /w/load.php was not found on this server.</p>
    <p>Additionally, a 404 Not Found error was encountered while trying to use an ErrorDocument to handle the request.</p>
    </body></html>
    * Connection #0 to host localhost left intact
This has broken at different points over the past few years, and each time we found a workaround. At one point, I recall, it was important (for forgotten reasons, something about HTTPS) to leave the URL unchanged and instead swap the TCP destination by overriding DNS resolution, as follows:
    $ curl -v --resolve 'test.wikipedia.org:80:127.0.0.1' "http://test.wikipedia.org/w/load.php"
But this too doesn't work.
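For reference, the two idioms differ in what stays unchanged. With the Host-header idiom, the URL names the address to connect to and only the header carries the public name. With `--resolve`, the URL (and therefore the Host header curl derives from it) keeps the public name, and only the connection target is pinned. A minimal sketch of both, where the function names are hypothetical helpers and not something from Wikitech:

```shell
#!/bin/sh
# Sketch only: fetch_via_header, fetch_via_resolve and resolve_arg are
# hypothetical helper names introduced for illustration.

# Build the host:port:address triple that curl's --resolve option expects.
resolve_arg() {
    printf '%s:%s:%s' "$1" "$2" "$3"
}

# Idiom 1: connect to the address in the URL, override the Host header.
# $1 = public hostname, $2 = address to connect to, $3 = path
fetch_via_header() {
    curl -s -o /dev/null -w '%{http_code}\n' -H "Host: $1" "http://$2$3"
}

# Idiom 2: keep the public URL intact, pin the name to an address.
# $1 = public hostname, $2 = address to connect to, $3 = path
fetch_via_resolve() {
    curl -s -o /dev/null -w '%{http_code}\n' \
        --resolve "$(resolve_arg "$1" 80 "$2")" "http://$1$3"
}

# Usage, on the affected host:
#   fetch_via_header  test.wikipedia.org 127.0.0.1 /w/load.php
#   fetch_via_resolve test.wikipedia.org 127.0.0.1 /w/load.php
```

Both helpers print only the HTTP status code, which is enough to compare the two idioms side by side.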
At another point, it was important to include -H 'X-Forwarded-Proto: https'. I think that's still the case for some things, but at the Apache level, most things support both now, with Vary.
I've tried many different variations; none of them work:
- curl -v -H 'Host: test.wikipedia.org' "http://localhost/w/load.php" (plain, with -6, with -g, with -g6)
- curl -v -H 'Host: test.wikipedia.org' "http://127.0.0.1/w/load.php"
- curl -v -H 'Host: test.wikipedia.org' "http://[::1]/w/load.php" (plain, with -6, with -g, with -g6)
- curl -v --resolve 'test.wikipedia.org:80:127.0.0.1' "http://test.wikipedia.org/w/load.php"
- curl -v --resolve 'test.wikipedia.org:80:::1' "http://test.wikipedia.org/w/load.php" (plain, with -6, with -g, with -g6)
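The Host-header attempts above can be repeated in one go with a small loop that prints only the status code for each loopback URL. This is a sketch under assumptions: the function names are hypothetical, the 5-second timeout is arbitrary, and it covers only the three loopback URLs, not the `--resolve` or `-6` variants.

```shell
#!/bin/sh
# Sketch only: loopback_urls and probe_all are hypothetical helper names.

# Loopback URLs taken from the attempts listed above.
loopback_urls() {
    printf '%s\n' \
        'http://localhost/w/load.php' \
        'http://127.0.0.1/w/load.php' \
        'http://[::1]/w/load.php'
}

# Print "<status> <url>" for each URL, sending $1 as the Host header.
probe_all() {
    host=$1
    loopback_urls | while read -r url; do
        # -g turns off curl's URL globbing so the IPv6 brackets are taken literally.
        status=$(curl -g -s -o /dev/null --max-time 5 \
            -w '%{http_code}' -H "Host: $host" "$url")
        printf '%s %s\n' "$status" "$url"
    done
}

# Usage, on the affected host:
#   probe_all test.wikipedia.org
```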
Eventually, I tried it from a different host to see if that would work. To my surprise, it did:
    mwdebug1002$ curl -v -H 'Host: test.wikipedia.org' "http://mwdebug1001.eqiad.wmnet/w/load.php"
    ..
    HTTP/1.1 200 OK
    Server: mwdebug1001.eqiad.wmnet
    ..
It also works from mwdebug1001 itself, and it works when using mwdebug's local 10.x IP address. These all do work:
- mwdebug1002$ curl -v -H 'Host: test.wikipedia.org' "http://mwdebug1001.eqiad.wmnet/w/load.php"
- mwdebug1001$ curl -v -H 'Host: test.wikipedia.org' "http://mwdebug1001.eqiad.wmnet/w/load.php"
- mwdebug1001$ curl -v -H 'Host: test.wikipedia.org' "http://10.64.32.123/w/load.php"
- mwdebug1001$ curl -v --resolve 'test.wikipedia.org:80:10.64.32.123' "http://test.wikipedia.org/w/load.php"
The first thing that came to mind at this point is that maybe something is doing the opposite of Require local and denying locally initiated connections for all production sites. However, even if such a thing were to exist, two observations contradict it:
- Locally initiated connections still work when using the host's own 10.x IP address.
- It responds with our custom default VirtualHost, not with an error page.
This last point is important. Removing the path component of the URL and requesting the document root shows that the request does actually match one of our VirtualHost configurations, just not the one it is supposed to.
    mwdebug1002:~$ curl -v -H 'Host: test.wikipedia.org' "http://localhost/"
    * Connected to localhost (::1) port 80 (#0)
    > Host: test.wikipedia.org
    > [..]
    < HTTP/1.1 200 OK
    [..]
    < Server: mwdebug1002.eqiad.wmnet
    [..]
    <
    <!DOCTYPE html>
    <html lang=en>
    <meta charset="utf-8">
    <title>Unconfigured domain</title>
    <link rel="shortcut icon" href="//wikimediafoundation.org/favicon.ico">
    <style>
    [..]
This is /srv/mediawiki/docroot/default/index.html as configured by puppet:/modules/mediawiki/files/apache/sites/nonexistent.conf.
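To tell quickly whether a given request fell through to this fallback VirtualHost, the page title from the response above can serve as a marker. A minimal sketch, assuming the fallback page keeps the exact `<title>Unconfigured domain</title>` string; `which_vhost` is a hypothetical helper name:

```shell
#!/bin/sh
# Sketch only: which_vhost is a hypothetical helper name.

# Read a response body on stdin and report whether it is the fallback page.
# Assumes the fallback page contains the title seen in the response above.
which_vhost() {
    if grep -q '<title>Unconfigured domain</title>'; then
        echo 'default'
    else
        echo 'other'
    fi
}

# Usage, on the affected host:
#   curl -s -H 'Host: test.wikipedia.org' 'http://localhost/' | which_vhost
#   curl -s -H 'Host: test.wikipedia.org' 'http://mwdebug1001.eqiad.wmnet/' | which_vhost
```

Comparing the result for the loopback address with the result for the host's own name or 10.x address would show whether the two paths select different VirtualHosts for the same Host header.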
So how come it is matching that one but not the main ones?