Page MenuHomePhabricator

Varnish/ATS should not decode URIs for /w/rest.php
Closed, ResolvedPublic

Description

Up until now, MediaWiki ingested all needed parameters through the query string (be it /w/index.php or /w/api.php), meaning that decoding the parameters prior to handling them over made sense (and.moreover, was desirable). However, with the advent of the REST API implementation in MediaWiki, we will have to change the rules of the game. Just like it is currently being done for /api/rest_v1/, Varnish and ATS should leave the original URI path untouched (read: encoded) for /w/rest.php. First reported in T235375, it was discovered in T235439 that the decoding of the URI happens on the Varnish layer. This is undesirable because it makes it impossible for the REST route matching algorithm to find the appropriate handler. E.g., when issuing a request for the path /w/rest.php/en.wikipedia.beta.wmflabs.org/v3/page/pagebundle/User%3APchelolo%2FOnDemand_Test/275844, Varnish decodes %3A and %2F into : and /, respectively, and passes /w/rest.php/en.wikipedia.beta.wmflabs.org/v3/page/pagebundle/User:Pchelolo/OnDemand_Test/275844 to the appserver. This, in turn, prompts MW to respond with a 404 because of the extra / in the path.

With the move from Parsoid/JS to Parsoid/PHP underway, this is somewhat urgent matter since it means we can't use the Parsoid/PHP REST API in Beta from the outside.

The manifestation of the problem can be witnessed by trying to contact Parsoid/PHP from inside and outside of Beta:

external call
$ curl -H'Host: en.wikipedia.beta.wmflabs.org' https://en.wikipedia.beta.wmflabs.org/w/rest.php/en.wikipedia.beta.wmflabs.org/v3/page/pagebundle/User%3APchelolo%2FOnDemand_Test/275844 -v
< HTTP/2 404 
< server: deployment-mediawiki-parsoid10.deployment-prep.eqiad.wmflabs
< x-powered-by: PHP/7.2.22-1+0~20190902.26+debian9~1.gbpd64eb7+wmf1
< x-varnish: 108733149, 42466902
< x-cache: deployment-cache-text05 pass, deployment-cache-text05 pass
< x-cache-status: pass
inside beta
$ curl -H 'Host: en.wikipedia.beta.wmflabs.org' deployment-mediawiki-parsoid10/w/rest.php/en.wikipedia.beta.wmflabs.org/v3/page/pagebundle/User%3APchelolo%2FOnDemand_Test/275844 -v
< HTTP/1.1 200 OK
< Date: Tue, 15 Oct 2019 09:05:41 GMT
< Server: deployment-mediawiki-parsoid10.deployment-prep.eqiad.wmflabs
< X-Powered-By: PHP/7.2.22-1+0~20190902.26+debian9~1.gbpd64eb7+wmf1

Event Timeline

mobrovac created this task.
Restricted Application added a subscriber: Aklapper. · View Herald Transcript

In ATS-land, URI path normalization can be configured by passing the right parameters to the script normalize-path.lua. See the current restbase config as an example.

This quarter we will carry on with the conversion of the cache_text cluster from Varnish to ATS. @mobrovac and I agreed on IRC that converting deployment-cache-text05 to ATS and configuring the lua normalization plugin as appropriate is the way to go to tackle this.

This quarter we will carry on with the conversion of the cache_text cluster from Varnish to ATS. @mobrovac and I agreed on IRC that converting deployment-cache-text05 to ATS and configuring the lua normalization plugin as appropriate is the way to go to tackle this.

To clarify, switching to ATS in beta will unblock the current show-stopper of not being able to contact Parosid/PHP in beta properly (needed to further the progress of porting Parsoid). That said, the Parsoid/PHP API will not be available in production externally (for now). However, further developments will be needed once the MW REST API hits production.

Change 545369 had a related patch set uploaded (by Ppchelko; owner: Ppchelko):
[operations/puppet@production] Varnish: don't decode/encode slashes for core REST API paths

https://gerrit.wikimedia.org/r/545369

Change 545369 merged by Ema:
[operations/puppet@production] Varnish: don't decode/encode slashes for core REST API paths

https://gerrit.wikimedia.org/r/545369

Ottomata subscribed.

Ema this looks done, feel free to resolve.