Core Page REST API should follow redirects
Closed, Resolved, Public

Description

For feature parity with RESTBase as well as with index.php, the v1/page/{title} endpoint should follow redirects. We should respond with a 30x status code and a Location header pointing to the redirect target.

Note that there are different categories of redirects that we may have to distinguish:

Wiki Redirects

Wiki redirects are defined explicitly using the #REDIRECT syntax. Following these should be enabled by default, but can be disabled using redirect=false. index.php does this, and the RESTBase endpoints do this.

We can also include a representation of the redirect in the body of the response - non-browser API clients will be able to read it. Note that the representation may be in JSON, HTML, or JSON and HTML, depending on whether the request was made to v1/page/{title}, v1/page/{title}/html or v1/page/{title}/with_html.

Normalization Redirects

Normalization redirects happen when the Title requested isn't the canonical one. We should only serve content from the canonical title. Title normalization includes:

  • Converting the first letter to upper-case and converting spaces to underscores
  • Normalizing Unicode and resolving spurious HTML or URL encoding
  • Applying variant conversion

The redirect itself should be permanent and cacheable.
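
A minimal sketch of the normalization steps listed above, for illustration only. It is not MediaWiki's actual title logic: normalize_title and normalization_redirect are hypothetical names, variant conversion is left out, and the 301 status and max-age value are assumptions (the task only says "30x", "permanent" and "cacheable").

```python
import html
import unicodedata
from urllib.parse import quote, unquote


def normalize_title(raw: str) -> str:
    """Hypothetical normalizer covering only the steps listed above."""
    title = unquote(raw)                         # resolve spurious URL encoding
    title = html.unescape(title)                 # resolve spurious HTML entities
    title = unicodedata.normalize("NFC", title)  # normalize Unicode
    title = title.strip().replace(" ", "_")      # spaces to underscores
    if title:
        title = title[0].upper() + title[1:]     # first letter to upper-case
    return title


def normalization_redirect(raw: str, base_path: str = "/v1/page/"):
    """Return a redirect description, or None if raw is already canonical."""
    canonical = normalize_title(raw)
    if canonical == raw:
        return None
    return {
        "status": 301,  # assumption: any permanent 30x code would fit
        "headers": {
            "Location": base_path + quote(canonical, safe=":/"),
            "Cache-Control": "public, max-age=86400",  # assumed lifetime
        },
    }
```

With this sketch, a request for "main page" would be redirected to /v1/page/Main_page, while a request for "Main_page" would be served directly.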

Revision Redirects

We could always redirect from the v1/page endpoint to the v1/revision endpoint for the current revision (or the current flagged revision?). But that really only makes sense for v1/page/{title}/html. Would it be confusing if we don't do it for the metadata endpoint v1/page/{title}?

Also, should the redirect be cacheable? It will need purging...

Acceptance Criteria

Normalization redirects:

  • Core HTML REST API can perform normalization redirects
  • Normalization redirects should be permanent and cacheable.
  • v1/page/{title} and v1/page/{title}/bare
  • v1/page/{title}/html and v1/page/{title}/with_html
  • v1/page/{title}/history and v1/page/{title}/history/counts/{type}
  • v1/page/{title}/links/language

Wiki redirects:

  • Wiki redirects can be disabled using redirect=no
  • Wiki redirects should be temporary and not cacheable.
  • Wiki redirects include an HTML representation of the redirect in the body of the response
  • v1/page/{title}/html and v1/page/{title}/with_html should follow wiki redirects
  • v1/page/{title} and v1/page/{title}/bare and v1/page/{title}/with_html should indicate redirect target as part of the JSON output but not follow redirects
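
One possible reading of the wiki redirect criteria above, sketched as a small decision function. Since with_html appears in both of the last two bullets, the sketch assumes it follows redirects by default and exposes the target in JSON when it does not follow. plan_wiki_redirect is a hypothetical name, and the 307 status and the no-store header are assumptions standing in for "temporary and not cacheable".

```python
from typing import Optional
from urllib.parse import quote

FOLLOWING_FORMATS = {"html", "with_html"}  # formats that follow wiki redirects
JSON_FORMATS = {"", "bare", "with_html"}   # formats with a JSON body


def plan_wiki_redirect(fmt: str,
                       redirect_target: Optional[str],
                       redirect_param: Optional[str] = None) -> dict:
    """fmt is "" for v1/page/{title}, else "bare", "html" or "with_html"."""
    if redirect_target is None:
        # Not a redirect page: serve it normally.
        return {"status": 200, "headers": {}, "redirect_target": None}

    if fmt in FOLLOWING_FORMATS and redirect_param != "no":
        location = "/v1/page/" + quote(redirect_target, safe=":/") + "/" + fmt
        return {
            "status": 307,  # assumption: a temporary 30x code
            "headers": {"Location": location, "Cache-Control": "no-store"},
            "redirect_target": redirect_target,
        }

    # Not following: serve the redirect page itself and, for JSON formats,
    # report where it points.
    target = redirect_target if fmt in JSON_FORMATS else None
    return {"status": 200, "headers": {}, "redirect_target": target}
```

For example, plan_wiki_redirect("html", "Target") yields a 307 to /v1/page/Target/html, while the same call with redirect_param="no" serves the redirect page itself with a 200.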

Event Timeline

Reedy renamed this task from Core HTML REST API should follow redirects to Core HTML REST API should follow redirects. Feb 9 2022, 4:22 PM

It's not quite clear to me how this should behave for POST requests.

If X is a redirect to Y, a GET request for X would redirect to Y. What should a POST or PUT request to X do? Fail unless redirect=false is given? Or just edit X?

Note that the redirects functionality in RESTBase has some known bugs as well. There are the following separate aspects to how redirects "should" work:

  1. Ordinary mediawiki title munging (space to underscore and all the rest) which is done by the JS mediawiki-title library and is itself quite complex; but is thankfully standalone with some independent implementations. This usually comes "for free" when done in core, but requires an explicit redirect if you don't want edge caching to store multiple copies of the same content.
  2. [[#REDIRECT]] processing: Parsoid actually returns an HTML document for the redirect page, so this is "easy" to do in a decoupled manner for Parsoid endpoints, but for non-parsoid endpoints this requires a separate interaction w/ core to resolve.
  3. Language converter title munging: if the requested title is missing, it is language converted and if the converted title is present it is treated as a redirect. (I think there /might/ be multiple levels of this with language variant fallback chains.) I have a partial reimplementation of language converter in a language-independent FST but we've never really resolved whether we're going to move to that. If we don't have an independent implementation, this has to be resolved via a request to core for services which aren't implemented in core. See T257965: [Bug] page/summary and page/mobile-html do not handle specific redirect titles correctly and T277059: [Bug] Blue links are broken on the Serbian (Latin) Burek article on iOS and Android for some examples of the issues here.
  4. Flagged revisions -- once we have the title we need a revision on that title, and flagged revisions can affect that. Not only that, but your user identity determines which revision you get, since a user can see their own revisions even if they haven't been flagged. (But when you're requesting content for editing you probably want to bypass the flagged revision and always use the latest revision, to avoid conflicts?) T209936: Content of unaccepted pending revisions show up in RESTBase APIs and T218090: Extension:FlaggedRevs can't control which revision is used on API action=parse calls. are related.
  5. Not really investigated, but the Translate extension also hooks the title lookup mechanism, so that a request for [[Template:Foo]] will get [[Template:Foo/de]] in some cases: T47096: Add a way to transclude template or other page in the correct language. Presumably requests for [[Foo]] should also go to [[Foo/de]] on multi-language wikis where the user's language is set to de.
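
To make the language converter step in the list above concrete, here is a rough sketch of "if the requested title is missing, convert it and treat an existing converted title as a redirect". Everything in it is hypothetical: resolve_with_variants, the page_exists and convert callbacks, and the order of the variant fallback chain are stand-ins, not the actual LanguageConverter behaviour.

```python
from typing import Callable, Iterable, Optional, Tuple


def resolve_with_variants(
    title: str,
    page_exists: Callable[[str], bool],
    convert: Callable[[str, str], str],
    variants: Iterable[str],
) -> Tuple[Optional[str], bool]:
    """Return (resolved_title, is_redirect); (None, False) if nothing matches."""
    if page_exists(title):
        return title, False  # exact title exists, no conversion needed
    for variant in variants:  # assumed fallback chain, tried in order
        candidate = convert(title, variant)
        if candidate != title and page_exists(candidate):
            return candidate, True  # converted title acts like a redirect
    return None, False


# Toy data for illustration only: a one-entry "converter" and page store.
PAGES = {"Бурек"}
TOY_CONVERSIONS = {("Burek", "sr-ec"): "Бурек"}


def toy_convert(title: str, variant: str) -> str:
    return TOY_CONVERSIONS.get((title, variant), title)


result = resolve_with_variants(
    "Burek", PAGES.__contains__, toy_convert, ["sr-el", "sr-ec"]
)
```

A service outside core would have to supply page_exists and convert through requests to core, which is the cost the comment above points out.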

I guess my bottom line here is that this lookup/redirect mechanism is surprisingly complex and probably needs to be yoinked out of "restbase" and put back into core. All of the APIs which "only take a title" need to be refactored to take a revision ID instead, and then the title->revision ID lookup handled consistently (with appropriate options for "my flagged revision userid is X" and "my user language/variant is Y" which can affect the lookup). If this mapping is cached somewhere, then the cache needs to be invalidated when new pages are added (which can be language converted or translated versions of pages you looked up) and when flagged revisions change.

This would also fix @daniel's issues with POST above: the POST should always be to a specific revision ID, and the title->revision mapping should be completely separated out from any POST API.
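
A rough sketch of what a separated title->revision lookup could look like, with the user and variant options in the cache key. RevisionLookup and its resolve callback are hypothetical names, and the invalidation is deliberately coarse (everything is dropped), since a new page or a flagged revision change can affect lookups for other requested titles.

```python
from typing import Callable, Dict, Optional, Tuple

# Cache key: (requested title, flagged-revision user id, language variant)
Key = Tuple[str, Optional[int], Optional[str]]


class RevisionLookup:
    """Hypothetical title -> revision ID lookup, kept apart from any POST API."""

    def __init__(
        self,
        resolve: Callable[[str, Optional[int], Optional[str]], Optional[int]],
    ) -> None:
        self._resolve = resolve  # stand-in for the real lookup in core
        self._cache: Dict[Key, Optional[int]] = {}

    def revision_for(self, title: str,
                     user_id: Optional[int] = None,
                     variant: Optional[str] = None) -> Optional[int]:
        key = (title, user_id, variant)
        if key not in self._cache:
            self._cache[key] = self._resolve(title, user_id, variant)
        return self._cache[key]

    def invalidate_all(self) -> None:
        # Call when pages are added or flagged revisions change.
        self._cache.clear()
```

A client would first ask revision_for("Foo", user_id=..., variant=...) and then read or POST against the returned revision ID, so the write path never has to decide how to treat a redirect.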

Global user pages are another instance which *perhaps* should be handled via the generic redirect service, despite the title of this task: T153801: File and global user pages should not be redirected. (I think the title is mostly referring to the fact that the language settings should be taken into account when the redirect is done, which was mentioned in the comment above.)

Pro:

  • an HTTP 302 redirect can easily handle cross-domain redirects, i.e. en.wikipedia.org/..../revision/for/User:cscott redirecting to mediawiki.org/page/html/User:cscott

Con:

  • perhaps this is really more like a /transclusion/ than a /redirect/? I don't know if Parsoid's current DOM spec has a way to denote a cross-wiki transclusion though...

See also T51097#7401266 which discusses transclusion of file description pages from commons.

daniel triaged this task as Medium priority. Jun 29 2022, 10:29 AM
daniel added a project: Epic.

Change 854018 had a related patch set uploaded (by MSantos; author: MSantos):

[mediawiki/core@master] WIP: follow redirects

https://gerrit.wikimedia.org/r/854018

Change 854018 merged by jenkins-bot:

[mediawiki/core@master] Follow redirects for page/{title} formats html/with_html

https://gerrit.wikimedia.org/r/854018

MSantos renamed this task from Core HTML REST API should follow redirects to Core Page REST API should follow redirects. Nov 21 2022, 4:34 PM
MSantos updated the task description.

Change 865668 had a related patch set uploaded (by MSantos; author: MSantos):

[mediawiki/core@master] WIP: add redirects to page/history and link endpoints

https://gerrit.wikimedia.org/r/865668

Change 865668 merged by jenkins-bot:

[mediawiki/core@master] add redirects to page/history and link endpoints

https://gerrit.wikimedia.org/r/865668