Page MenuHomePhabricator

Refactor both directions of ConvertOffsets to be consistent
Open, MediumPublic

Description

See comments on https://gerrit.wikimedia.org/r/563597:

The issue is that currently we do the ContentUtils helper "manually" in the prologue code which sets up the input DOM (around the same place we move the data-* attributes into the bag, etc).

But then ConvertOffsets::run() is executed "automatically" as part of the DOMPostProcessor framework.

In the html2html case we're doing "some parts of DOMPostProcessor" manually. We we manually set up the input DOM (that looks the same as before) but we also have to manually run the part of the DOMPostProcessor pipeline where we convert the output offsets. That looks asymmetric.

One option would be to futz with DOMPostProcessor so that it is clearer that we are executing a subsection of the DOMPostProcessor pipeline. That would expose the underlying abstraction more clearly, and make it easier/cleaner to expose different parts of the DOMPostProcessor pipeline to html2html, if we ever wanted to do that.

The other option is just to encapsulate both the input and output sides of ConvertOffsets better, either in ContentUtils or a standalone class. Then this code would look symmetric: ConvertOffsets::convertFromInputType() / ConvertOffsets::convertToOutputType() and we'd probably change DOMPostProcessor so that instead of invoking a standalone Parsoid\Wt2Html\PP\Processors class (we should really have an interface marking these, with a declaration of the run() method) it had an action thunk (like the cleanup handler does) which just invokes the appropriate ConvertOffsets::convertToOutputType() method).

I think I favor the latter, although wrt the former it *would* be nice to better structure the DOMPostProcessor pipelines: there are already quite lengthy comments there about some of the subtleties wrt ordering and non-obvious prerequisites and non-idempotence, etc. There's a bunch of different types and invocation methods and optional properties and other things in that longish list as well that could use some cleanup ... but I don't feel like right this second is the time to do it.

@ssastry responded:

DOMPostProcessor already has the ability to accept a custom list of processors. See registerProcessors in there and its use in PHPUnit tests.

So either the first approach or the second approach could use used to clean up what is now a manual invocation of one piece of a the post processor pipeline in Parsoid::html2html.