Page MenuHomePhabricator

REST: define the content.v1 module that will replace Parsoid endpoints.
Closed, ResolvedPublic

Description

Define a REST module with the name "content.v1" (see T368461), so clients that are currently using the parsoid endpoints exposed by RESTbase can be migrated (see T334238). The endpoints that will be exposed under content.v1 already exist under the global v1 prefix. They will be accessible from both locations from now. We expect to deprecate access using the global v1 prefix in the future, and make the old paths redirect to the new paths.

Requirements

The new endpoints will be exposed under:

  • /api/content.v1/page/{title}
  • /api/content.v1/page/{title}/with_html
  • /api/content.v1/page/{title}/html
  • /api/content.v1/page/{title}/bare
  • /api/content.v1/revision/{id}
  • /api/content.v1/revision/{id}/with_html
  • /api/content.v1/revision/{id}/html
  • /api/content.v1/revision/{id}/bare

We also support POST and PUT for creating/updating pages:

  • POST /api/content.v1/page
  • PUT /api/content.v1/page/{title}

The new paths for these endpoints can be defined by creating a module definition file (T362480).
The relevant Mocha tests should be updated to run for both prefixes.

For the work needed to make the /api/ prefix usable for MediaWiki REST endpoints see (T364400). This is out of scope for this task.

Context

We are planning to deprecate the Parsoid endpoints that are currently exposed through RESTbase (T334238). Clients should use the equivalent core endpoints instead. In order to support this, we want to make the URLs of the core endpoints more future proof.

The old parsoid endpoints on RESTbase are:

  • /api/rest_v1/page/title/{title}
  • /api/rest_v1/page/title/{title}/{revision}
  • /api/rest_v1/page/html/{title}
  • /api/rest_v1/page/html/{title}/{revision}

These endpoints are functionally equivalent but structurally different. The endpoints for retrieving HTML without a JSON wrapper are nearly exactly the same, they only differ in the handling of wiki redirects. These two are similar enough to their new equivalents that we will start redirecting from them to the new ones once most of the client traffic has migrated:

  • /api/rest_v1/page/html/{title} -> /api/content.v1/page/{title}/html
  • /api/rest_v1/page/html/{title}/{revision} -> /api/content.v1/revision/{revision}/html

Event Timeline

BPirkle updated the task description. (Show Details)

PageSourceHandler and RevisionSourceHandler both include constructHtmlUrl() functions that build a url using a (currently) hard-coded old-style path.

These should be reconsidered as part of this change.

Change #1058287 had a related patch set uploaded (by BPirkle; author: BPirkle):

[mediawiki/core@master] Add content.v1 module, to include relevant content endpoints

https://gerrit.wikimedia.org/r/1058287

BPirkle changed the task status from Open to In Progress.Jul 31 2024, 3:20 AM

Hrm, we have a bit of a chicken-and-egg problem here.

The module itself is defined by a module file in core. It is activated by a corresponding config change, to add the module file to RestAPIAdditionalRouteFiles.

The mocha tests, which need to accompany the endpoint definitions, will execute as soon as they're pushed. So if we try to push the code before the config, the mocha tests will fail, because the new endpoints with the module path aren't working yet.

But if we push the config first, then the REST infrastructure will be looking for a module file that doesn't exist and (at least per local testing) all sorts of sadness will occur.

Three ideas occur:

  1. make the REST system happy if a module file is defined but doesn't actually exist. Just silently do nothing. I don't like this, because normally that'd be an error worthy of at least logging.
  1. make the mocha tests context-aware (somehow) so that they don't test inactive modules. I don't like this either. Adding that sort of complexity to tests is fraught with peril.
  1. do a multi-stage release. First, release an empty module file that defines no endpoints (and therefore needs no tests). Then release the config to use that module file, which will have no immediate effect, but won't hurt anything. Then finally, release the real module file along with the mocha tests. Because the config is already in place, the mocha tests will pass.

#3 is my current favorite, but I'm still thinking about whether there's a better solution.

Change #1058715 had a related patch set uploaded (by BPirkle; author: BPirkle):

[mediawiki/core@master] Empty module file in preparation for a content.v1 REST module

https://gerrit.wikimedia.org/r/1058715

Another thing that comes to mind is to automatically load core REST modules from a specified directory (/includes/Rest/modules or wherever) rather than defining them in a config var. Then no separate config change would be needed, and the module would auto-activate with the core code push.

Change #1059124 had a related patch set uploaded (by BPirkle; author: BPirkle):

[operations/mediawiki-config@master] Add content.v1 REST module

https://gerrit.wikimedia.org/r/1059124

Change #1059159 had a related patch set uploaded (by BPirkle; author: BPirkle):

[mediawiki/core@master] Load REST module definitions from a ModuleFiles directory

https://gerrit.wikimedia.org/r/1059159

I've pushed several changes, for discussion of alternatives. We won't merge all of them. At least one will be abandoned.

We could either merge these three patches (in this order):

  1. https://gerrit.wikimedia.org/r/1058715 (core change to add module file with no routes)
  2. https://gerrit.wikimedia.org/r/1059124 (config change to activate module)
  3. https://gerrit.wikimedia.org/r/1058287 (core change to push real module file and associated changes and tests)

OR merge these two patches, in this order:

  1. https://gerrit.wikimedia.org/r/1059159 (load modules from a dedicated directory)
  2. https://gerrit.wikimedia.org/r/1058287 (core change to push real module file and associated changes and tests)

If we do the first option, I'd like feedback on, well everything, but in particular the naming convention for the module definition file. Because this is the first one in core, it will establish the pattern.

If we do the second option, I'd like feedback in particular on the directory name. That patch includes a module file with no routes, just because git doesn't like empty directories. I'm not sure that makes for a clean patch, but I needed to put something there. Alternatively, I could put a README that we'd either keep (this would mean making the directory iteration code smarter, to skip it) or delete when we pushed the first real module.

I have not yet explored the option of making mocha tests smart enough to only test active modules. There was one proponent of that approach in a synchronous discussion on this topic. I'll try to look into that a bit more to see if I can find a reasonable way to implement that as another possible option.

I have not yet explored the option of making mocha tests smart enough to only test active modules. There was one proponent of that approach in a synchronous discussion on this topic. I'll try to look into that a bit more to see if I can find a reasonable way to implement that as another possible option.

One way of doing this would be to complete T365753: REST: expose a machine readable directory of available API modules, then modify the mocha tests to run against only the intersection of the paths defined in the test, and the available modules. Then we could cleanly push new modules without failing tests, because tests for the new module wouldn't run until the module was activated in config.

Hrm, we have a bit of a chicken-and-egg problem here.

The module itself is defined by a module file in core. It is activated by a corresponding config change, to add the module file to RestAPIAdditionalRouteFiles.

I think modules defined by core shoujld be registered in the same way that we have been using for regisering the coreRoutes.json file. Currently, that's simply by hard-coding it in EntryPoint.php. The idea is that this is not an additional route file, it's a core route file. It's not optional or configurable, it's required to be there.

Putting all such route files into a well known directory and just loading everything in that directory would also work, but it requires us to iterate over a directory for every request. We'd want to cache the outcome and check against the directory's mtime. I'd consider that nice to have for now.

I think modules defined by core shoujld be registered in the same way that we have been using for regisering the coreRoutes.json file. Currently, that's simply by hard-coding it in EntryPoint.php.

I'm good with that. It leaves me wondering if the way we have the config variable set up is really what we want. But we can worry about that later. I'll revise the current patch accordingly.

Change #1058715 abandoned by BPirkle:

[mediawiki/core@master] Empty module file in preparation for a content.v1 REST module

Reason:

Abandoning in favor of hard-coded module file name. See https://gerrit.wikimedia.org/r/c/mediawiki/core/+/1058287

https://gerrit.wikimedia.org/r/1058715

Change #1059124 abandoned by BPirkle:

[operations/mediawiki-config@master] Add content.v1 REST module

Reason:

Abandoning in favor of hard-coded module file name. See https://gerrit.wikimedia.org/r/c/mediawiki/core/+/1058287

https://gerrit.wikimedia.org/r/1059124

Change #1059159 abandoned by BPirkle:

[mediawiki/core@master] Load REST module definitions from a ModuleFiles directory

Reason:

Abandoning in favor of hard-coded module file name. See https://gerrit.wikimedia.org/r/c/mediawiki/core/+/1058287

https://gerrit.wikimedia.org/r/1059159

Change #1058287 merged by jenkins-bot:

[mediawiki/core@master] Add content.v1 REST module with relevant content endpoints

https://gerrit.wikimedia.org/r/1058287

BPirkle moved this task from In Progress to Done on the MW-Interfaces-Team board.