Proton cannot assume the requests are for {lang}.wikipedia.org
Closed, Resolved, Public

Description

The Proton service assumes that the requests it receives are for Wikipedia pages only. However, its aim is to replace the Electron rendering service, which is enabled for all projects. Therefore, Proton cannot assume that requests for PDFs will be limited to Wikipedia.

Event Timeline

mobrovac triaged this task as High priority. Jun 29 2018, 9:35 AM
mobrovac created this task.
Restricted Application added a subscriber: Aklapper. Jun 29 2018, 9:35 AM

Maybe, instead of doing a RESTBase check (calling /page/title/{TITLE}), we can first do a HEAD call to the requested URL and check the HTTP response: if it's 200, proceed with the queue; otherwise, reject the job immediately? With that approach, we would be able to handle all possible projects.
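A minimal sketch of the HEAD-probe idea proposed above, written in Python for illustration only (Proton itself is not written in Python). The names `head_status` and `should_enqueue` are hypothetical and do not come from the Proton code base; the probe is injectable so the decision logic can be exercised without network access.

```python
from typing import Callable
import urllib.error
import urllib.request


def head_status(url: str, timeout: float = 5.0) -> int:
    """Issue a HEAD request and return the HTTP status code."""
    req = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses are raised as exceptions; report their code.
        return err.code


def should_enqueue(url: str, probe: Callable[[str], int] = head_status) -> bool:
    """Accept the render job only if the probe reports HTTP 200."""
    try:
        return probe(url) == 200
    except OSError:
        # Network-level failure (DNS, refused connection, timeout): reject.
        return False
```

With this shape, a job for any project is queued as long as its URL answers with 200, and everything else is rejected before it reaches the queue.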

The RESTBase call is not Wikipedia-specific; it supports all the wikis where RESTBase is enabled. The problem is in this fragment, which assumes a specific format for the domain.

That part is done to handle the mobile domains (adding .m part). There is no nice way to retrieve the mobile domain for a given wiki. We already did some research on this case, please check:

There is also some brief discussion in the Gerrit patches.

That part is done to handle the mobile domains (adding .m part). There is no nice way to retrieve the mobile domain for a given wiki.

Have it defined in the config?

How to build the mobile domain in the Proton service -> yes, the template is stored in the config: https://github.com/wikimedia/mediawiki-services-chromium-render/blob/master/config.dev.yaml#L66

mobrovac claimed this task. Jul 2 2018, 2:51 PM
mobrovac edited projects, added Services (doing); removed Services (blocked).

We actually also need to do this because of security implications: we want to restrict the service to only be allowed to talk to our production MW servers directly.

That part is done to handle the mobile domains (adding .m part). There is no nice way to retrieve the mobile domain for a given wiki. We already did some research on this case, please check:

Actually, there is. As far as our production servers are concerned, .m. domains do not exist, only their desktop counterparts. In other words, both en.wp.org/wiki/Title and en.m.wp.org/wiki/Title get interpreted in the same way (as being for the en.wp.org domain). I will try to add the logic to Proton and change the MW request template to reflect that.
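To illustrate the point that a mobile domain is just its desktop counterpart as far as the servers are concerned, here is a rough Python sketch. The function name `split_mobile` is hypothetical, and it only covers the `{prefix}.m.{project}.{tld}` shape discussed in this thread; other mobile host shapes would need their own rule.

```python
from typing import Tuple


def split_mobile(host: str) -> Tuple[str, bool]:
    """Return (desktop_host, is_mobile) for a requested host name.

    Assumption: the mobile marker is a literal "m" label in second
    position, as in en.m.wikipedia.org. Anything else is returned as-is.
    """
    labels = host.lower().split(".")
    if len(labels) >= 4 and labels[1] == "m":
        # Drop the "m" label and remember that the mobile view was requested.
        return ".".join(labels[:1] + labels[2:]), True
    return ".".join(labels), False
```

Keeping the `is_mobile` flag separate from the host means the request can always be addressed to the desktop domain while still remembering which view the caller wanted.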

Change 443444 had a related patch set uploaded (by Mobrovac; owner: Mobrovac):
[operations/puppet@production] service::node: Expose the MW appservers' host to modules

https://gerrit.wikimedia.org/r/443444

Change 443465 had a related patch set uploaded (by Mobrovac; owner: Mobrovac):
[mediawiki/services/chromium-render@master] Tighten the MW request

https://gerrit.wikimedia.org/r/443465

Change 443468 had a related patch set uploaded (by Mobrovac; owner: Mobrovac):
[mediawiki/services/chromium-render/deploy@master] Config: Improve the MW request template

https://gerrit.wikimedia.org/r/443468

The three patches above should get us what we want, as they allow us to restrict the hosts we issue requests to in production (these being only the MW app servers), all the while allowing us to fetch both desktop and mobile views of pages.
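As a hedged illustration of what restricting requests to the MW app servers can look like, the Python sketch below always targets one fixed app-server address and carries the wiki's desktop domain in the `Host` header. Everything named here is an assumption made for the example: `APPSERVER_URI`, `build_mw_request`, and in particular the `X-Subdomain: M` header used to ask for the mobile view are not taken from the patches above, whose actual template may differ.

```python
from typing import Dict, Tuple
from urllib.parse import quote

# Hypothetical internal address; the real value would come from deployment config.
APPSERVER_URI = "http://appservers.example.internal"


def build_mw_request(
    domain: str,
    title: str,
    mobile: bool = False,
    appserver: str = APPSERVER_URI,
    mobile_header: Tuple[str, str] = ("X-Subdomain", "M"),  # assumed, not from the source
) -> Dict[str, object]:
    """Describe a page request that can only ever reach `appserver`."""
    # The wiki is selected via the Host header, never via the request URI.
    headers = {"Host": domain}
    if mobile:
        name, value = mobile_header
        headers[name] = value
    return {
        "method": "get",
        "uri": f"{appserver}/wiki/{quote(title, safe='')}",
        "headers": headers,
    }
```

Because the URI never contains a caller-supplied host, a request for an arbitrary external domain cannot cause the service to contact anything other than the configured app servers.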

Change 443444 merged by Giuseppe Lavagetto:
[operations/puppet@production] service::node: Expose the MW appservers' host to modules

https://gerrit.wikimedia.org/r/443444

Change 443465 merged by Mobrovac:
[mediawiki/services/chromium-render@master] Tighten the MW request

https://gerrit.wikimedia.org/r/443465

Change 443468 merged by Mobrovac:
[mediawiki/services/chromium-render/deploy@master] Config: Improve the MW request template

https://gerrit.wikimedia.org/r/443468

Mentioned in SAL (#wikimedia-operations) [2018-07-26T19:58:28Z] <mobrovac@deploy1001> Started deploy [proton/deploy@883cacd]: Use a more secure MW API template - T198461

Mentioned in SAL (#wikimedia-operations) [2018-07-26T19:59:02Z] <mobrovac@deploy1001> Finished deploy [proton/deploy@883cacd]: Use a more secure MW API template - T198461 (duration: 00m 33s)

mobrovac closed this task as Resolved. Jul 26 2018, 7:59 PM
mobrovac edited projects, added Services (done); removed Patch-For-Review, Services (doing).

Deployed, resolving.