Reroute RESTBase /page/lint/ endpoints to MediaWiki REST endpoints
Closed, Resolved | Public | 3 Estimated Story Points

Description

Summary

Certain endpoints for handling Parsoid metadata and linter errors currently rely on RESTBase. These endpoints need to be rerouted to equivalent MediaWiki REST endpoints as part of the ongoing RESTBase sunset process. This will help standardize calls to the MediaWiki REST API.


Mapping of production URLs to be routed to MediaWiki-REST-API


Linter Errors

  • Get linter errors for a title
    • Current Endpoint: <domain>/api/rest_v1/page/lint/{title}
    • MW REST Endpoint: <domain>/w/rest.php/v1/page/{title}/lint
    • Details:
      • Parameter: {title}
      • Headers: Include x-restbase-compat with a value of true
      • Response: Linter errors in JSON
  • Get linter errors for a specific title/revision
    • Current Endpoint: <domain>/api/rest_v1/page/lint/{title}/{revision}
    • MW REST Endpoint: <domain>/w/rest.php/v1/revision/{id}/lint
    • Details:
      • Parameters: {title}, {revision}
      • Headers: Include x-restbase-compat with a value of true
      • Response: Linter errors in JSON
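
The mapping above can be sketched as a small path rewrite. This is only an illustration of the two rules in the list, not the REST Gateway's actual configuration. The function name `reroute_lint_path` is hypothetical, and the sketch assumes the title arrives as a single percent-encoded path segment (so a raw "/" only separates title and revision).

```python
# Header that every forwarded call must carry, per the task description.
COMPAT_HEADERS = {"x-restbase-compat": "true"}

RESTBASE_PREFIX = "/api/rest_v1/page/lint/"


def reroute_lint_path(restbase_path: str) -> str:
    """Map a RESTBase /page/lint path to the MW REST path listed above.

    Hypothetical helper for illustration only.
    """
    if not restbase_path.startswith(RESTBASE_PREFIX):
        raise ValueError(f"not a /page/lint path: {restbase_path}")

    parts = restbase_path[len(RESTBASE_PREFIX):].split("/")

    # /page/lint/{title} -> /v1/page/{title}/lint
    if len(parts) == 1 and parts[0]:
        return f"/w/rest.php/v1/page/{parts[0]}/lint"

    # /page/lint/{title}/{revision} -> /v1/revision/{id}/lint
    # The title segment is dropped; only the revision id is forwarded.
    if len(parts) == 2 and parts[0] and parts[1].isdigit():
        return f"/w/rest.php/v1/revision/{parts[1]}/lint"

    raise ValueError(f"unsupported /page/lint path: {restbase_path}")
```

Note that the revision form discards `{title}` entirely, so the target endpoint has to derive the title from the revision itself.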

Additional Configuration

  • All forwarded calls must include the header x-restbase-compat: true to ensure RESTBase-compatible responses.
  • Ensure proper handling of redirects (301 and 302) and document behavior for query parameters like redirect=false.
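
One way to check both points by hand is to send the compat header and refuse to follow redirects, so that the 3xx status and its Location header stay visible. The following is a minimal sketch using only the Python standard library. The helper names and the User-Agent string are hypothetical, and the sketch does not assume what any endpoint will actually return.

```python
import urllib.error
import urllib.request
from urllib.parse import quote


class NoRedirect(urllib.request.HTTPRedirectHandler):
    """Stop urllib from following 3xx responses automatically."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        # Returning None makes urllib raise HTTPError for the 3xx response.
        return None


def build_lint_request(domain: str, title: str) -> urllib.request.Request:
    """Build a GET request for the MW REST lint endpoint with the compat header."""
    url = f"https://{domain}/w/rest.php/v1/page/{quote(title, safe=':')}/lint"
    headers = {
        "x-restbase-compat": "true",
        # Hypothetical User-Agent; replace with one that identifies your tool.
        "User-Agent": "lint-reroute-check/0.1 (example)",
    }
    return urllib.request.Request(url, headers=headers)


def fetch_status(req: urllib.request.Request):
    """Return (status, Location header or None) without following redirects."""
    opener = urllib.request.build_opener(NoRedirect)
    try:
        with opener.open(req, timeout=10) as resp:
            return resp.status, None
    except urllib.error.HTTPError as err:
        return err.code, err.headers.get("Location")
```

Calling `fetch_status(build_lint_request("en.wikipedia.org", "Obama"))` would then report whichever status the endpoint returns for a redirect page, which is the behavior this item asks to document.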

Update on /page/data-parsoid

  • The /page/data-parsoid endpoint will be sunset and tracked at T393557

Acceptance Criteria

  • Rerouted endpoints are functional and return expected responses from the MW REST API.
  • The rerouting process is validated and approved by:
  • All calls are routed through REST Gateway for production use.

Event Timeline

This endpoint is used by DiscussionTools to validate posts.

I'd expect that to use a POST request to /transform/wikitext/to/lint, not a GET request to /page/lint/{title}. Am I wrong about that?

I don't actually know. I've added the discussion tools team for insight. I can't find any direct invocations of either of those endpoints via codesearch, so maybe I'm wrong and DT is using an internal call to lint and not a restbase endpoint. What does the user agent look like?

EDIT: above you say it's probably a user script or gadget. That seems likely.

For the POST endpoint https://global-search.toolforge.org/?q=%2Fto%2Flint&regex=1&namespaces=&title= shows a number of hits (20) mostly on a script called ShowPageLintError.

For the GET endpoint we have only about 4 hits: https://global-search.toolforge.org/?q=%2Fpage%2Flint&regex=1&namespaces=&title=

As far as I can tell these are mostly folks talking about manual debugging strategies, but it's also possible that we're getting robots following these links.

The endpoints don't seem to always be equivalent. For example, a POST request to https://en.wikipedia.org/api/rest_v1/transform/wikitext/to/lint/User%20talk:Loganwasgood returns:

[{"type":"night-mode-unaware-background-color","dsr":[59,7185,47,2],"templateInfo":null,"params":[]},{"type":"night-mode-unaware-background-color","dsr":[241,2032,95,2],"templateInfo":null,"params":[]},{"type":"night-mode-unaware-background-color","dsr":[339,2029,213,6],"templateInfo":null,"params":[]},{"type":"night-mode-unaware-background-color","dsr":[2033,7182,47,2],"templateInfo":null,"params":[]},{"type":"night-mode-unaware-background-color","dsr":[2081,4771,111,0],"templateInfo":null,"params":[]},{"type":"night-mode-unaware-background-color","dsr":[2193,4771,83,2],"templateInfo":null,"params":[]},{"type":"night-mode-unaware-background-color","dsr":[4772,7179,114,0],"templateInfo":null,"params":[]},{"type":"night-mode-unaware-background-color","dsr":[4887,7179,83,2],"templateInfo":null,"params":[]}]

With https://en.wikipedia.org/w/rest.php/v1/transform/wikitext/to/lint/User%20talk:Loganwasgood, I get:

{"errorKey":"rest-no-match","messageTranslations":{"en":"The requested relative path (/v1/transform/wikitext/to/lint/User%20talk:Loganwasgood) did not match any known handler"},"httpCode":404,"httpReason":"Not Found"}
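
The comparison above could be scripted by building the same POST request against both base URLs. This is a sketch under assumptions: the helper name is hypothetical, and the JSON body with a `wikitext` field is assumed here rather than confirmed anywhere in this task.

```python
import json
import urllib.request
from urllib.parse import quote

# The two API roots being compared in the comment above.
RESTBASE_BASE = "https://en.wikipedia.org/api/rest_v1"
MW_REST_BASE = "https://en.wikipedia.org/w/rest.php/v1"


def build_transform_lint_request(base_url: str, title: str, wikitext: str):
    """Build a POST request for transform/wikitext/to/lint/{title}.

    Assumption: the endpoint accepts a JSON body of the form {"wikitext": ...}.
    """
    url = f"{base_url}/transform/wikitext/to/lint/{quote(title, safe=':')}"
    body = json.dumps({"wikitext": wikitext}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )
```

Passing each request to `urllib.request.urlopen` would show whether the two backends agree. A 404 with errorKey "rest-no-match" means the route is not registered on that backend.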

Looks like the endpoint isn't used a lot. We see about one request every five seconds, most of them coming from browsers at two IP addresses - so probably a niche gadget or user script.

I had previously switched my bot over to rest.php, but will probably switch back to restbase now given the above errors.

As far as I can see, wikitext/to/lint is simply not enabled in production.

@cscott should it be? Adding it to coreRoutes.json should Just Work...

Noting that we should run a test period for both of these endpoints before fully rerouting. See https://phabricator.wikimedia.org/T374683 for example and learnings about the need for test periods.

Based on this comment:

The target endpoint for data-parsoid metadata is yet to be determined.

It sounds like there is no alternative endpoint in MediaWiki yet. Is that accurate? @MSantos -- are you planning to create one? Or would you expect us to? Depending on the timeline there, we may want to manage the rerouting tasks separately instead of trying to bundle lint and data-parsoid together in a single request.

Related work to enable the Lint endpoint in production is here: https://phabricator.wikimedia.org/T388401

@HCoplin-WMF I believe it's better to split the tasks for now.

We still need to prioritise and investigate the work for the data-parsoid endpoint.

IIRC the Content Transform team said that we should not support data-parsoid as a stable interface for public APIs. So the plan was to deprecate it without replacement. Has that changed?

For reference, we are seeing about 1 request per minute for data-parsoid endpoints, and all I see in the past 30 days are hacking attempts: https://w.wiki/DP69

HCoplin-WMF renamed this task from Draft: Reroute RESTbase Endpoints for Parsoid Data and Linter Errors to MediaWiki REST Endpoints to Draft: Reroute RESTbase Endpoints for Linter Errors to MediaWiki REST Endpoints. Apr 7 2025, 6:10 PM
HCoplin-WMF triaged this task as Medium priority.
HCoplin-WMF updated the task description.

Updated this ticket to only cover lint endpoints.

Related ticket for parsoid data here, pending decision from Mateus about if we will deprecate or reroute: https://phabricator.wikimedia.org/T391277

HCoplin-WMF set the point value for this task to 3. Apr 17 2025, 3:45 PM
MSantos renamed this task from Draft: Reroute RESTbase Endpoints for Linter Errors to MediaWiki REST Endpoints to Reroute RESTbase Endpoints for Linter Errors to MediaWiki REST Endpoints. May 12 2025, 11:48 AM
MSantos updated the task description.

I've updated the state of /page/data-parsoid and we will track this endpoint work at T393557: Block external traffic to RESTBase /page/data-parsoid endpoint and investigate internal usage.

For the remainder of the task, I just want to record that I approve the next steps.

Has there been any further discussion on whether the restbase endpoints for linting are required for migration, and if so whether we'll be changing the existing usage pattern?

@hnowlan yes they are, and I believe this is safe to switch over. Any thoughts or concerns, @daniel or @HCoplin-WMF?

From a quick look at the RESTBase metrics, it looks like we don't have traffic for:

  • POST transform/wikitext/to/html/<title>

But this is still getting traffic:

  • POST transform/wikitext/to/html

https://grafana.wikimedia.org/goto/R6m6K2fNg?orgId=1

There are also external hits for v1/page/lint/$title and v1/page/lint/$title/$revision against restbase. Do these need to be migrated and are they hit by alternate request paths?

Change #1151716 had a related patch set uploaded (by Daniel Kinzler; author: Daniel Kinzler):

[mediawiki/core@master] REST: Enable wikitext to lint transformations

https://gerrit.wikimedia.org/r/1151716

There are also external hits for v1/page/lint/$title and v1/page/lint/$title/$revision against restbase. Do these need to be migrated and are they hit by alternate request paths?

Isn't that just what /api/rest_v1/page/lint/{title} gets mapped to? Do you have an example of a full URL accessible from the outside?

Yes, unless something unusual is happening. Just want to make sure we're consistent between the transform APIs and the page/lint pattern we're currently using.

HCoplin-WMF raised the priority of this task from Medium to High. Jun 17 2025, 3:23 PM

With https://en.wikipedia.org/w/rest.php/v1/transform/wikitext/to/lint/User%20talk:Loganwasgood, I get:

{"errorKey":"rest-no-match","messageTranslations":{"en":"The requested relative path (/v1/transform/wikitext/to/lint/User%20talk:Loganwasgood) did not match any known handler"},"httpCode":404,"httpReason":"Not Found"}

This works now after the latest patch went out.

I had previously switched my bot over to rest.php, but will probably switch back to restbase now given the above errors.

Can you try switching back to this endpoint and seeing if your bot works now?

Redirects for page/lint/{title} will not be handled if we route to transform/wikitext/to/lint/{title}, and I'm very wary of adding query parameters to remapped URLs and making TransformHandler support them. It seems better to make a page/{title}/lint route in MW and reroute to that. The handler could be called PageLintHandler and work somewhat like PageHTMLHandler, sharing some helper classes. It would have to extend TransformHandler for now, with delegation in execute(). I've already tested this locally and it works fine.

Redirects for page/lint/{title}/{revision} will not be handled if we route to transform/wikitext/to/lint/{title}/{revision}, though I don't think the existing redirect behavior makes much sense. We already dropped that behavior when page/html/{title}/{revision} was migrated in T374683.* We don't really follow old revision redirects in MW, for example when using ?oldid with index.php. An old revision might redirect to a missing or unintended page due to pages being reorganized and renamed after that revision was made. It's immutable, so it can't be updated (and no one would bother even if it could be).

  • If we similarly drop that behavior, then rerouting to transform/wikitext/to/lint/{title}/{revision} is fine. However, one difference would be that giving a title and a revision for a page with a different title would no longer 404. Lint errors for the revision content would be checked in the context of the given title.
  • If we want to keep it, then we can reroute to a revision/{id}/lint route with a RevisionLintHandler that does the redirect logic when x-restbase-compat is set. Similar to the /html/{title}/{revision} rerouting, the title in /lint/{title}/{revision} would be ignored, with the handler using the title of the revision. This would mean that using a title for a revision that belongs to a page with a different title would no longer 404. The client-given title should always match the revision. If a client wanted to lint a revision in the context of a counterfactual title, they should be using /transform/.
  • One could also have a RevisionLintHandler that gets the title from the revision, doesn't redirect, and remap to that. This best matches RevisionHTMLHandler and is a more sensible API.

*https://docs.google.com/spreadsheets/d/10FaxUcD6y4Xjss21HfXUwVsH98RCWO7Bs9hhZuDTfFg/edit?pli=1&gid=0#gid=0

aaron renamed this task from Reroute RESTbase Endpoints for Linter Errors to MediaWiki REST Endpoints to Reroute RESTbase /page/lint/ endpoints to MediaWiki REST endpoints. Sep 22 2025, 2:55 PM

I'm a little late to the party here, and trying to catch up. Bear with me as I repeat some things that have already been said, consolidating them from multiple sources.

To confirm, the google docs spreadsheet compares the "core" and "parsoid" versions of the endpoints, where "parsoid" is one of:

  • <domain>/api/rest_v1/page/lint/{title}
  • <domain>/api/rest_v1/page/lint/{title}/{revision}

And "core" is one of:

  • <domain>/w/rest.php/v1/transform/wikitext/to/lint/{title}
  • <domain>/w/rest.php/v1/transform/wikitext/to/lint/{title}/{revision}

The "core" versions in the spreadsheet assume that x-restbase-compat is present and true.

The biggest differences are in redirect behavior, where parsoid returns a 302 and, after the client follows that redirect, ends up linting the redirect target. Core, however, returns a 200 and lints the redirect page itself. That's a substantial change. To give examples:

Hitting https://en.wikipedia.org/api/rest_v1/page/lint/Obama results in:

Hitting https://en.wikipedia.org/w/rest.php/v1/transform/wikitext/to/lint/Obama results in:

  • 200 with an empty array as the response body
  • Presumably this is a linting of the redirect page and not a linting of the current revision of the Obama page

For the versions of the endpoints with revision ids, the behavior is similar:

Hitting https://en.wikipedia.org/api/rest_v1/page/lint/Obama/785244690 results in:

Hitting https://en.wikipedia.org/w/rest.php/v1/transform/wikitext/to/lint/Obama/785244690 results in:

  • 200 with an empty array as the response body
  • Presumably this is a linting of the redirect page and not a linting of the current revision of the Obama page

I'd very much like to not touch TransformHandler if we can avoid it. I agree with Aaron's suggestion to add PageLintHandler and RevisionLintHandler classes, and to generally make the API sensible in cases where the current behavior is or would be silly.

The spreadsheet is for the /html/ endpoints that were already migrated. My plan is to follow the same pattern as them in terms of:

  • Using 307 redirect codes instead of 300.
  • The removal of redirection for the {title}/{revision} endpoint.
  • Ignoring {title} and using the revision title, thus the removal of "404 if {revision} does not match {title}" behavior.

Change #1190368 had a related patch set uploaded (by Aaron Schulz; author: Aaron Schulz):

[mediawiki/core@master] [WIP] Add 'v1/page/{title}/lint' and 'v1/page/{title}/lint" routes

https://gerrit.wikimedia.org/r/1190368

Change #1190368 merged by jenkins-bot:

[mediawiki/core@master] Add 'v1/page/{title}/lint' and 'v1/revision/{id}/lint" routes

https://gerrit.wikimedia.org/r/1190368

Change #1197731 had a related patch set uploaded (by Aaron Schulz; author: Aaron Schulz):

[operations/deployment-charts@master] Update /page/ lint routes to use the new rest.php endpoints

https://gerrit.wikimedia.org/r/1197731

Krinkle renamed this task from Reroute RESTbase /page/lint/ endpoints to MediaWiki REST endpoints to Reroute RESTBase /page/lint/ endpoints to MediaWiki REST endpoints. Oct 21 2025, 11:27 PM
Krinkle updated the task description.

Change #1197731 merged by jenkins-bot:

[operations/deployment-charts@master] Update /page/ lint routes to use the new rest.php endpoints

https://gerrit.wikimedia.org/r/1197731

Change #1199032 had a related patch set uploaded (by Aaron Schulz; author: Aaron Schulz):

[operations/puppet@production] Route /page/lint(.*) to the gateway on test2wiki

https://gerrit.wikimedia.org/r/1199032

Change #1199033 had a related patch set uploaded (by Aaron Schulz; author: Aaron Schulz):

[operations/puppet@production] Route /page/lint(.*) to the gateway on group0

https://gerrit.wikimedia.org/r/1199033

Change #1199034 had a related patch set uploaded (by Aaron Schulz; author: Aaron Schulz):

[operations/puppet@production] Route /page/lint(.*) to the gateway on group1

https://gerrit.wikimedia.org/r/1199034

Change #1199035 had a related patch set uploaded (by Aaron Schulz; author: Aaron Schulz):

[operations/puppet@production] Route /page/lint(.*) to the gateway on all wikis

https://gerrit.wikimedia.org/r/1199035
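
For readers unfamiliar with the `/page/lint(.*)` pattern named in these patches, the sketch below shows which request paths such a pattern would select. It is only an illustration in Python: the real rule lives in the puppet-managed traffic configuration, and the `^/api/rest_v1` anchor and the function name used here are assumptions, not copied from those patches.

```python
import re

# Hypothetical anchored form of the pattern named in the patch titles.
LINT_ROUTE = re.compile(r"^/api/rest_v1/page/lint(.*)")


def goes_to_gateway(path: str) -> bool:
    """Return True if the path would be selected by the lint route pattern."""
    return LINT_ROUTE.match(path) is not None


# Both the title and the title/revision forms are covered by one pattern.
for path in (
    "/api/rest_v1/page/lint/Obama",
    "/api/rest_v1/page/lint/Obama/785244690",
    "/api/rest_v1/page/html/Obama",
):
    print(path, goes_to_gateway(path))
```

Because `(.*)` also matches an empty or slash-less suffix, a pattern of this shape would select a path such as `/api/rest_v1/page/lintfoo` as well.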

Change #1199032 merged by Clément Goubert:

[operations/puppet@production] Route /page/lint(.*) to the gateway on test2wiki

https://gerrit.wikimedia.org/r/1199032

Change #1199033 merged by Clément Goubert:

[operations/puppet@production] Route /page/lint(.*) to the gateway on group0

https://gerrit.wikimedia.org/r/1199033

Change #1199034 merged by Hnowlan:

[operations/puppet@production] trafficserver: Route group1 /page/lint(.*) to the rest-gateway

https://gerrit.wikimedia.org/r/1199034

Change #1199035 merged by Clément Goubert:

[operations/puppet@production] Route /page/lint(.*) to the gateway on all wikis

https://gerrit.wikimedia.org/r/1199035