Parsoid API should be closer to RESTBase API.
Closed, ResolvedPublic

Description

Parsoid's v2 API is gratuitously different from the RESTBase API in some minor ways. These differences should be fixed so that mediawiki-core no longer has to do silly transformations between URL schemes and results. To ease migration, this should probably be done as a new Parsoid "v3" API, after which we can deprecate and remove support for Parsoid's "v1" and "v2" APIs (T100681).

Changes to make:

  • The "body" parameter to the wt2html endpoint should be renamed "bodyOnly".
  • The "wt" format should be renamed "wikitext".
  • The result of the html2wt endpoint should be wikitext content (like in Parsoid's v1 API and RESTBase), not a JSON wrapper around the same.
  • When bodyOnly is true, we should serialize the *children* of the <body> element, not the <body> element itself.
  • We should consider supporting the /transform/X/to/Y route as an alias of the /Y route.
  • RESTBase uses /page/$format/ instead of just $format/ for GETs.
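
As a rough illustration of the renames in the list above, here is a minimal Python sketch of the kind of translation a client currently has to do. The helper names (`rename_format`, `rename_params`, `transform_route`) and the `"other"` parameter are hypothetical and are not part of Parsoid or RESTBase; only the "wt" to "wikitext" and "body" to "bodyOnly" renames and the /transform/X/to/Y route shape come from this task.

```python
# Hypothetical rename tables: only the two renames named in the task are listed.
FORMAT_RENAMES = {"wt": "wikitext"}
PARAM_RENAMES = {"body": "bodyOnly"}


def rename_format(fmt):
    """Map a v2-style format name to the RESTBase-style name, if it differs."""
    return FORMAT_RENAMES.get(fmt, fmt)


def rename_params(params):
    """Return a copy of params with v2-style keys renamed; values are untouched."""
    return {PARAM_RENAMES.get(key, key): value for key, value in params.items()}


def transform_route(src, dst):
    """Build the /transform/X/to/Y route for a conversion from src to dst."""
    return "/transform/{}/to/{}".format(rename_format(src), rename_format(dst))


if __name__ == "__main__":
    # "other" is a made-up parameter, included to show unknown keys pass through.
    print(rename_params({"body": True, "other": 1}))
    print(transform_route("wt", "html"))
```

If the two APIs agreed on these names, none of this mapping would be needed on the client side.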

We should also consider reordering the version and the hostname, since we have:

https://$wgServerName/api/rest_v1/page/html/{TITLE} [VirtualRESTService]
http://parsoid-lb.eqiad.wikimedia.org/v2/$wgServerName/html/{TITLE} [Parsoid v2]
http://rest.wikimedia.org/$wgServerName/v1/page/html/{TITLE} [RESTBase v1]
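
To make the ordering difference concrete, the three layouts above can be written as small URL builders. This is only a sketch: the function names are hypothetical, the example server name `en.wikipedia.org` is an assumption, and percent-encoding the title with `urllib.parse.quote` is my assumption about how a client would escape it.

```python
from urllib.parse import quote


def _title(title):
    # Assumption: the title is percent-encoded as a single path segment.
    return quote(title, safe="")


def virtual_rest_url(server, title):
    """VirtualRESTService layout: hostname first, then a versioned API prefix."""
    return "https://{}/api/rest_v1/page/html/{}".format(server, _title(title))


def parsoid_v2_url(server, title):
    """Parsoid v2 layout: version first, then the wiki's server name."""
    return "http://parsoid-lb.eqiad.wikimedia.org/v2/{}/html/{}".format(
        server, _title(title))


def restbase_v1_url(server, title):
    """RESTBase v1 layout: server name first, then version, then /page/."""
    return "http://rest.wikimedia.org/{}/v1/page/html/{}".format(
        server, _title(title))


if __name__ == "__main__":
    for build in (virtual_rest_url, parsoid_v2_url, restbase_v1_url):
        print(build("en.wikipedia.org", "Main_Page"))
```

Printed side by side, the Parsoid v2 URL is the only one that puts the version before the server name and omits the /page/ segment.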

Event Timeline

cscott raised the priority of this task from to Medium.
cscott updated the task description.
cscott added a project: Parsoid.
cscott subscribed.
cscott set Security to None.
cscott updated the task description.

I've got some patches here that disentangle the v1 and v2 APIs, after which adding a third should be straightforward.

https://gerrit.wikimedia.org/r/#/c/219407/
https://gerrit.wikimedia.org/r/#/c/219508/

There is also a difference in the HTML the v2 API returns in the main pagebundle API. The pagebundle returns data-parsoid (and later data-mw) separately, so that it can be stored and later returned to Parsoid. Currently clients still support inline data-parsoid and data-mw, but I think it's pretty clear that we don't want to require them to do so in the longer term, especially once we move data-mw out.

It seems that we can simplify the life of our API users by focusing on packaging RESTBase for easy third-party installs and supporting private wikis. This would also remove the need for a v3 API with v1-like (inline data-*) functionality.

I think it's still worth removing gratuitous incompatibilities between the RESTBase and Parsoid APIs, if only because the RESTBase APIs are currently much better documented than the Parsoid APIs. And Parsoid certainly doesn't need multiple different APIs for the same task.

The changes I outlined above seem worth doing, since Parsoid is never going to be able to support pagebundle functionality.

Change 233107 had a related patch set uploaded (by Cscott):
WIP: Implement Parsoid v3 API

https://gerrit.wikimedia.org/r/233107

Change 233107 merged by jenkins-bot:
Implement Parsoid v3 API

https://gerrit.wikimedia.org/r/233107

cscott claimed this task.

Done! And deployed!