The Proton service assumes that the requests it receives are for Wikipedia pages only. However, its aim is to replace the Electron rendering service, which is enabled for all projects. Therefore, Proton cannot assume that the requests for PDFs will be limited to Wikipedia.
Event Timeline
Maybe instead of doing a RESTBase check (calling /page/title/{TITLE}), we could first make a HEAD call to the requested URL and check the HTTP response: if it's 200, proceed with the queue; otherwise, reject the job immediately. With that approach, we would be able to handle all possible projects.
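To make the proposal concrete, here is a minimal sketch of such a pre-check. It is not Proton's code: `HeadFn`, `isRenderable` and `precheck` are hypothetical names, and the HEAD request itself is injected as a function so that the sketch does not depend on any particular HTTP client.

```typescript
// Hypothetical type: any function that issues a HEAD request and resolves
// with the HTTP status code of the response.
type HeadFn = (url: string) => Promise<number>;

// Only a plain 200 counts as "page exists" in this sketch.
function isRenderable(status: number): boolean {
  return status === 200;
}

// Resolves to true if the job should be queued, false if it should be
// rejected immediately. A failed HEAD call is treated as a rejection.
async function precheck(url: string, head: HeadFn): Promise<boolean> {
  try {
    return isRenderable(await head(url));
  } catch {
    return false;
  }
}
```

Note that a check like this would hit whatever URL the client supplied, so it does nothing by itself to limit which hosts the service talks to.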
The RESTBase call is not Wikipedia-specific; it supports all the wikis where RESTBase is enabled. The problem is in this fragment, which assumes a specific format for the domain.
That part is there to handle the mobile domains (adding the .m part). There is no nice way to retrieve the mobile domain for a given wiki. We already did some research on this case, please check:
- https://phabricator.wikimedia.org/T181680#3919018
- https://phabricator.wikimedia.org/T181680#3950130 and @Pchelolo's response.
There is also a short conversation in the Gerrit patches.
As for how the mobile domain is built in the Proton service: yes, the template is stored in the config, see https://github.com/wikimedia/mediawiki-services-chromium-render/blob/master/config.dev.yaml#L66
We actually also need to do this because of security implications: we want to restrict the service to only be allowed to talk to our production MW servers directly.
Actually, there is. As far as our production servers are concerned, .m. domains do not exist, only their desktop counterparts. In other words, both en.wp.org/wiki/Title and en.m.wp.org/wiki/Title get interpreted in the same way (as being for the en.wp.org domain). I will try to add the logic to Proton and change the MW request template to reflect that.
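The domain handling described here can be sketched roughly as follows. This is an illustration under assumptions rather than the logic that ended up in the patch: `toDesktopHost` is a hypothetical helper, and it assumes the mobile marker is a literal `m` label in the first or second position of the host name.

```typescript
// Hypothetical helper: map a mobile host name to its desktop counterpart by
// dropping the "m" label, e.g. "en.m.wikipedia.org" -> "en.wikipedia.org".
function toDesktopHost(host: string): string {
  const labels = host.toLowerCase().split(".");
  const idx = labels.indexOf("m");
  // Assumption: the mobile label is the first or second label, and removing
  // it must still leave at least a two-label domain.
  if ((idx === 0 || idx === 1) && labels.length - 1 >= 2) {
    labels.splice(idx, 1);
  }
  return labels.join(".");
}
```

Desktop hosts pass through unchanged, so the same function can be applied to every incoming domain before the MW request is built.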
Change 443444 had a related patch set uploaded (by Mobrovac; owner: Mobrovac):
[operations/puppet@production] service::node: Expose the MW appservers' host to modules
Change 443465 had a related patch set uploaded (by Mobrovac; owner: Mobrovac):
[mediawiki/services/chromium-render@master] Tighten the MW request
Change 443468 had a related patch set uploaded (by Mobrovac; owner: Mobrovac):
[mediawiki/services/chromium-render/deploy@master] Config: Improve the MW request template
The three patches above should get us what we want, as they allow us to restrict the hosts we issue requests to in production (these being only the MW app servers), all the while allowing us to fetch both desktop and mobile views of pages.
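For readers unfamiliar with the pattern, the idea of restricting requests to a fixed set of hosts can be sketched like this. Everything in it is an assumption made for illustration: `buildMwRequest`, the `MwRequest` shape, the example appserver address and the `/wiki/` path are hypothetical, and the real template lives in the service config. The sketch also leaves out how the mobile view is selected.

```typescript
// Hypothetical shape of an outgoing request.
interface MwRequest {
  uri: string;
  headers: { host: string };
}

// The URI always points at the configured appserver address; the wiki the
// page belongs to is only carried in the Host header, so a client-supplied
// domain never decides which machine the service connects to.
function buildMwRequest(appserverUri: string, domain: string, title: string): MwRequest {
  const base = appserverUri.replace(/\/+$/, ""); // drop trailing slashes
  return {
    uri: `${base}/wiki/${encodeURIComponent(title)}`,
    headers: { host: domain },
  };
}

// Example with a made-up internal address:
const req = buildMwRequest("http://appservers.example.internal/", "en.wikipedia.org", "Main Page");
```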
Change 443444 merged by Giuseppe Lavagetto:
[operations/puppet@production] service::node: Expose the MW appservers' host to modules
Change 443465 merged by Mobrovac:
[mediawiki/services/chromium-render@master] Tighten the MW request
Change 443468 merged by Mobrovac:
[mediawiki/services/chromium-render/deploy@master] Config: Improve the MW request template
Mentioned in SAL (#wikimedia-operations) [2018-07-26T19:58:28Z] <mobrovac@deploy1001> Started deploy [proton/deploy@883cacd]: Use a more secure MW API template - T198461
Mentioned in SAL (#wikimedia-operations) [2018-07-26T19:59:02Z] <mobrovac@deploy1001> Finished deploy [proton/deploy@883cacd]: Use a more secure MW API template - T198461 (duration: 00m 33s)