Move HtmlHolder from Parsoid to independent library
Open, Needs Triage, Public

Description

We decided to prototype the HtmlHolder interface in Parsoid rather than in a separate library because, for reasons discussed in https://www.mediawiki.org/wiki/Parsoid/OutputTransform/HtmlHolder, the initial implementation will be tightly coupled to Parsoid for serialization and deserialization. There are two main mechanisms we can use to decouple HtmlHolder from Parsoid:

  • Create a hook mechanism allowing the use of a "Parsoid-aware" HtmlHolder without including it in the generic library
  • Further work on T348165: Parsoid Rich Attributes phase 3, allowing proper serialization/deserialization of structured-data attributes without any Parsoid-specific knowledge

The relevant code for this serialization and deserialization is in the DOMDataUtils class in Parsoid.

In addition, a subset of the DOMCompat and DOMUtils classes from Parsoid will be moved to the library to make it a generally useful "HTML DOM manipulation in PHP" library. Some code from HtmlFormatter (T258964, T255586, T217360, T185726) might eventually be included as well.