
Disable support for custom offsetType and custom outputContentVersion major versions in Parsoid
Closed, Resolved, Public

Description

As Parsoid is integrated into core, we want to eliminate options that aren't supported through ParserOptions (or cannot be supported as post-processing transforms) since ParserCache keys are computed based on ParserOptions used during a wt2html transform.

Two of those options are offsetType and outputContentVersion. These were supported by Parsoid/JS and have been supported by Parsoid/PHP and most recently by the REST API endpoints and functionality that are backed by ParsoidOutputAccess. ParsoidOutputAccess provides support for all these custom parser options since it uses its own ParserCache instance and handles all these options.

But, as part of T332931, we are now in the process of making ParsoidOutputAccess a thin wrapper over ParserOutputAccess (and eventually ParsoidOutputAccess may go away). ParserOutputAccess uses the ContentModel hierarchy and it is harder to shoe-horn custom options that aren't part of ParserOptions. So, we either need to make Parsoid's options a part of it OR eliminate their usage.
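To illustrate why options outside ParserOptions are a problem for a shared cache, here is a minimal toy sketch. It is not MediaWiki code: `cache_key`, `ToyParserCache`, `render`, and the `lang` option are hypothetical names invented for this example. The only assumption taken from the description above is that the cache key is derived solely from the options the cache knows about.

```python
import hashlib


def cache_key(page_id: int, keyed_options: dict) -> str:
    """Build a key from the page and ONLY the options the cache knows about."""
    canonical = "!".join(f"{k}={keyed_options[k]}" for k in sorted(keyed_options))
    digest = hashlib.sha1(canonical.encode("utf-8")).hexdigest()
    return f"{page_id}:{digest}"


class ToyParserCache:
    """Hypothetical stand-in for a cache keyed on ParserOptions-like data."""

    def __init__(self):
        self._store = {}

    def get_or_render(self, page_id, keyed_options, extra_options, render):
        # extra_options affect the output but are NOT part of the key.
        key = cache_key(page_id, keyed_options)
        if key not in self._store:
            self._store[key] = render(keyed_options, extra_options)
        return self._store[key]


def render(keyed_options, extra_options):
    # Stand-in for a wt2html transform whose output depends on offsetType.
    return f"<html offsets={extra_options.get('offsetType', 'byte')}>"


cache = ToyParserCache()
first = cache.get_or_render(1, {"lang": "en"}, {"offsetType": "byte"}, render)
second = cache.get_or_render(1, {"lang": "en"}, {"offsetType": "ucs2"}, render)
# The second request asked for ucs2 offsets but gets the cached byte-offset output.
```

In this toy model the second request silently receives output rendered for a different offsetType, which is the kind of mismatch that forces the choice between folding such options into the key and dropping them.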

Looking at offsetType, this option is only used to convert Parsoid's byte-based source offsets to UCS2 or char-based offsets. While this functionality exists, in practice it is not used, especially since source offsets are primarily exposed as DSR values, which are considered Parsoid-private information. So, it might be simpler to disable this support for now and revisit it if there is a real use case that needs non-byte offsets. It may also be possible to support this as an html2html transform (which would require some code refactoring and cleanup), but nothing needs to be done now.
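For readers unfamiliar with the three offset units, the following sketch shows what such a conversion amounts to. It is an illustration only, not Parsoid's implementation: the function name `convert_byte_offset` and the string values `"byte"`, `"ucs2"`, and `"char"` are assumptions chosen for this example, and the source text is assumed to be UTF-8 encoded.

```python
def convert_byte_offset(text: str, byte_offset: int, offset_type: str = "byte") -> int:
    """Convert a UTF-8 byte offset into `text` to the requested unit.

    "byte" -> unchanged
    "char" -> number of Unicode code points before the offset
    "ucs2" -> number of UTF-16 code units before the offset
    """
    if offset_type == "byte":
        return byte_offset
    data = text.encode("utf-8")
    if not 0 <= byte_offset <= len(data):
        raise ValueError("byte offset out of range")
    # Decoding raises if the offset falls inside a multi-byte sequence.
    prefix = data[:byte_offset].decode("utf-8")
    if offset_type == "char":
        return len(prefix)
    if offset_type == "ucs2":
        # Each UTF-16 code unit is 2 bytes; astral characters take two units.
        return len(prefix.encode("utf-16-le")) // 2
    raise ValueError(f"unknown offset type: {offset_type}")


sample = "a€😀b"  # 1-byte, 3-byte, and 4-byte UTF-8 characters, then "b"
byte_pos_of_b = len("a€😀".encode("utf-8"))
char_pos_of_b = convert_byte_offset(sample, byte_pos_of_b, "char")
ucs2_pos_of_b = convert_byte_offset(sample, byte_pos_of_b, "ucs2")
```

The three units agree only for ASCII text. A JavaScript consumer indexes strings in UTF-16 code units, so it cannot use byte offsets directly once non-ASCII characters appear.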

As for outputContentVersion, Parsoid does support rendering HTML in multiple major content versions. In addition, it has mechanisms to downgrade one major version to a lower major version. However, this support is only needed when we bump the major HTML version number and have to give clients a way to request an older version while they migrate to the newer one. In practice, we aren't likely to bump the major HTML version number any time soon. This support may still be needed in the future, and at that point we can re-institute the various pieces of https://www.mediawiki.org/wiki/API_versioning#Content_format_stability_and_negotiation that have either code-rotted or been disabled. Given that, it may be simplest to disable this support for now as well.
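The negotiation described above can be sketched roughly as follows. This is a hypothetical illustration, not Parsoid's actual logic: `parse_version`, `resolve_version`, the outcome strings, and the version numbers are all invented for the example. It assumes semver-style versions, where a request is satisfied by the current output if the major versions match and the current version is at least the requested one.

```python
def parse_version(version: str) -> tuple:
    """Split a 'major.minor.patch' string into a tuple of ints."""
    major, minor, patch = (int(part) for part in version.split("."))
    return (major, minor, patch)


def resolve_version(requested: str, current: str, downgradable_majors: set) -> str:
    """Decide how to answer a request for a given content version.

    Returns "serve" if the current output satisfies the request,
    "downgrade" if an older major is requested and a downgrade path exists,
    and "unsupported" otherwise.
    """
    req = parse_version(requested)
    cur = parse_version(current)
    if req[0] == cur[0]:
        # Same major: the current output must be at least as new as requested.
        return "serve" if req <= cur else "unsupported"
    if req[0] < cur[0] and req[0] in downgradable_majors:
        return "downgrade"
    return "unsupported"


# Hypothetical setup: current output is 3.1.0, with a downgrade path to major 2.
same_major = resolve_version("3.0.0", "3.1.0", {2})
older_major = resolve_version("2.5.0", "3.1.0", {2})
too_old = resolve_version("1.0.0", "3.1.0", {2})
```

With the downgrade support disabled, the equivalent of `downgradable_majors` is empty, so only requests matching the current major version can be served.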

Event Timeline

Change 957801 had a related patch set uploaded (by Subramanya Sastry; author: Subramanya Sastry):

[mediawiki/core@master] Disable Parsoid support for non-default output versions and offset types

https://gerrit.wikimedia.org/r/957801

Change 957801 merged by jenkins-bot:

[mediawiki/core@master] Disable Parsoid support for non-default output versions and offset types

https://gerrit.wikimedia.org/r/957801

Change 961928 had a related patch set uploaded (by Arlolra; author: Arlolra):

[mediawiki/services/parsoid@master] Update api-testing after features were disabled upstream

https://gerrit.wikimedia.org/r/961928

Ugh .. Parsoid's rt testing script is still node.js based, which means it needs ucs2 offsets to run its syntactic / semantic diff classification. So, I'll have to revert the offsetType piece of this patch for now and handle that differently.

Change 961962 had a related patch set uploaded (by Subramanya Sastry; author: Subramanya Sastry):

[mediawiki/core@master] Revert offsetType disabling from 1aa71cf5: Parsoid's rt-testing needs it

https://gerrit.wikimedia.org/r/961962

Change 961962 merged by jenkins-bot:

[mediawiki/core@master] Revert offsetType disabling from 1aa71cf5: Parsoid's rt-testing needs it

https://gerrit.wikimedia.org/r/961962

Change 961928 merged by jenkins-bot:

[mediawiki/services/parsoid@master] Update api-testing after features were disabled upstream

https://gerrit.wikimedia.org/r/961928

Change 962706 had a related patch set uploaded (by C. Scott Ananian; author: C. Scott Ananian):

[mediawiki/vendor@master] Bump wikimedia/parsoid to 0.18.0-a26

https://gerrit.wikimedia.org/r/962706

Change 962706 merged by jenkins-bot:

[mediawiki/vendor@master] Bump wikimedia/parsoid to 0.18.0-a26

https://gerrit.wikimedia.org/r/962706