
Investigate access checks for Proton: ensure it follows redirects to be compatible with RESTbase
Open, Stalled, High, Public

Description

When Proton is configured with RESTBase, RESTBase follows redirects, as part of its access check, when reading a page or fetching article content.

The idea of this task is to ensure that Proton does the same as RESTBase, so that if we hit Proton directly on a page that is a redirect, the content of the target page is printed instead. The redirect types to cover are:

  • Normalization redirect on Title
  • Wiki redirect on Title
  • Language-variant redirect ($2/$1 or variant= or accept-language).

Event Timeline

RESTBase follows redirects so what Proton gets is the article of the target page to print. From testing locally, this is still the case.

Testing locally with curl -X 'GET' http://mediawiki.development.instance:3030/en.wikipedia.beta.wmflabs.org/v1/pdf/User:XSavitarTest%2Fsandbox%2FRestbaseProton/a4/desktop -H 'accept: application/pdf' --output ~/Desktop/RedirectTest.pdf seems to show that Proton itself follows redirects when printing the article, since this is the normal MediaWiki behavior unless redirect=no is set in the URL.

@Jgiannelos & @pmiazga, can you confirm that right now Proton running in isolation follows redirects, because that is the default MW behavior unless it is explicitly told not to follow them?

RESTBase is currently following redirects, meaning that there is nothing to do in Proton at this point. It's working as it should.

Might be worth double-checking language-variant redirects as well.

DAlangi_WMF changed the task status from Open to In Progress.Mar 2 2023, 8:01 PM

@cscott, I've been thinking about this and so far, when it comes to Proton, everything that happens in the MW world is handled by MW. All Proton does is call MediaWiki directly when it hits a page to render its PDF.

On redirects: normalization and wiki redirects are handled correctly, with a slight change in behavior when printing wiki redirects. The PDF includes the "(Redirect from ...)" line, which RESTBase would hide, but that is nothing to worry about for now.

With respect to what you mentioned about language-variant redirects, I thought MW handles these as well, and if that is the case, we don't need to do anything at the level of Proton. Can you confirm whether MW resolves language-variant redirects by default when using the /index.php entry point?

Proton just calls MediaWiki and delegates redirect mechanisms to it so in this case, redirects such as:

  • Normalization redirects,
  • Wiki redirects,
  • Language-variant redirects etc

are all handled from the MediaWiki side (via the browser).

I've spotted a patch about variants here: https://gerrit.wikimedia.org/r/c/mediawiki/services/chromium-render/+/897192, so reopening this issue for some more investigation.

So it seems the URI construction in Proton silently ignores the language variant if one is passed in. The patch linked above tries to resolve this issue. Will follow up there.

Change 897192 had a related patch set uploaded (by Winston Sung; author: Winston Sung):

[mediawiki/services/chromium-render@master] Services: Pass variant parameter to chromium-render

https://gerrit.wikimedia.org/r/897192

After a conversation with Daniel yesterday, and a few findings, there are some points to consider:

There are several ways to request a language variant in MediaWiki, listed below. Cc @cscott

  1. The language variant can be configured per wiki/farm using $wgVariantArticlePath (see manual). When done this way, an HTTP request like localhost:8080/sr-el/Page_name accesses Page_name in the specified language variant.
  2. Language-variant conversion can also be triggered using the variant= URL param, for example: localhost:8080/wiki/Page_name?variant=sr-el. See this manual (in the table, check variant).
  3. Variant redirect/conversion can be triggered by setting the accept-language header, which is another approach used within RESTBase and in MediaWiki core.

@cscott, if there is another way that I'm missing, please let me know.
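As a rough illustration of the three mechanisms above, here is a minimal Python sketch that builds the request for each one. The function names are hypothetical, the host is the localhost:8080 used in the examples, option 1 assumes the wiki sets $wgVariantArticlePath to '/$2/$1', and the /wiki/ article path is taken from the example in option 2. It is not code from Proton or RESTBase.

```python
from urllib.parse import quote, urlencode

BASE = "http://localhost:8080"  # host from the examples above


def variant_path_request(title, variant):
    """Option 1: variant in the path (assumes $wgVariantArticlePath = '/$2/$1')."""
    return f"{BASE}/{quote(variant)}/{quote(title)}", {}


def variant_param_request(title, variant):
    """Option 2: variant= URL parameter on the normal article path."""
    return f"{BASE}/wiki/{quote(title)}?{urlencode({'variant': variant})}", {}


def accept_language_request(title, language_tag):
    """Option 3: plain article URL plus an Accept-Language header."""
    return f"{BASE}/wiki/{quote(title)}", {"Accept-Language": language_tag}
```

Each helper returns a (url, headers) pair, so the only difference between the three approaches is where the variant travels: in the path, in the query string, or in a header.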

Taking all of this into consideration, Daniel and I looked at how RESTBase handles language variants for PDF rendering, and it turns out RESTBase doesn't handle them for Proton. Cc @Jgiannelos and @pmiazga for awareness/confirmation.

So the question of whether we want Proton, as a stand-alone service, to support language-variant redirect/conversion looks like a new feature request at this point, and should probably go on a new ticket. There is a patch to introduce it here: https://gerrit.wikimedia.org/r/c/mediawiki/services/chromium-render/+/897192. Pmiazga/Yiannis, what do you think?

The current chromium-render service uses "https://{{extdomain}}/w/index.php" according to config.dev.yaml#L68 .

I would suggest NOT using the variant article path ($wgVariantArticlePath) as the solution, because:

  • It needs additional configuration, just as the article path ($wgArticlePath) does
  • It adds complexity to the URL handling
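To illustrate the alternative, here is a minimal Python sketch of passing the variant as a query parameter on the index.php entry point. The function name and its parameters are hypothetical; the actual chromium-render service is not written in Python and its URI construction may differ.

```python
from urllib.parse import urlencode


def build_render_url(domain, title, variant=None):
    """Build an index.php URL for a page, adding variant= only when given.

    Hypothetical helper mirroring the "https://{{extdomain}}/w/index.php"
    template from the service config.
    """
    params = {"title": title}
    if variant:
        params["variant"] = variant
    return f"https://{domain}/w/index.php?{urlencode(params)}"
```

With this shape, a request without a variant produces the same URL as before, so no extra path configuration is needed on the wiki side.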

Also, for reference:

  • Title::getPageViewLanguage() returns the language object for the page view language
    • Language::getCode() returns the MediaWiki internal language code
    • Language::getHtmlCode() returns the BCP 47 language code
  • We currently use the MediaWiki internal language code first (with compatibility for the BCP 47 language code) for the variant URL parameter
  • Accept-Language should use the BCP 47 language code instead of the MediaWiki internal language code
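To make the difference between the two code styles concrete, here is a simplified Python sketch of the BCP 47 casing convention only (lowercase language, title-case four-letter script, uppercase two-letter region). The function name is hypothetical and this is not MediaWiki's implementation: internal codes that need a real mapping rather than recasing are not handled here.

```python
def to_bcp47_casing(code):
    """Apply BCP 47 casing conventions to a hyphen-separated language code.

    Simplified: recases subtags only, and does not translate
    MediaWiki-specific internal codes into different tags.
    """
    parts = code.split("-")
    out = []
    for i, part in enumerate(parts):
        after_singleton = i > 0 and len(parts[i - 1]) == 1
        if i == 0 or after_singleton:
            out.append(part.lower())   # language subtag, or subtag after a singleton
        elif len(part) == 4 and part.isalpha():
            out.append(part.title())   # script subtag, e.g. Hant
        elif len(part) == 2 and part.isalpha():
            out.append(part.upper())   # region subtag, e.g. TW
        else:
            out.append(part.lower())
    return "-".join(out)
```

For example, the internal-style code zh-hant becomes zh-Hant and zh-tw becomes zh-TW, which is the form an Accept-Language header would be expected to carry.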
DAlangi_WMF changed the task status from Open to Stalled.Apr 13 2023, 3:46 PM

Once the migration is complete, we can start adding new features to Proton.

daniel triaged this task as High priority.Jun 5 2023, 6:16 PM
daniel moved this task from Unsorted to Doing on the RESTBase Sunsetting board.

@DAlangi_WMF Is this done? Can we close the ticket?

It is done apart from the language-variant redirect, for which there is a pending patch, but we have frozen new feature requests until the service is completely migrated to rest-gateway. That's why I marked the task as stalled for now.