Serializer should use Frame, not SelserData
Open, Medium, Public

Description

If the serializer ever becomes recursive (that is, if it can recurse into an extension tag or template that has a different source text, as <gallery> or <poem> would), then the html2wt path needs the same Frame abstraction that wt2html currently uses in both the core parser and Parsoid. The Frame holds the information about the current "recursion": in particular the source text being used (for parsing or for selser) and, in the core parser, some information about the arguments and miscellaneous parser state. See includes/parser/PPFrame in core; the frame is also used as a cache key when caching expansions.

This "should" be easy: we just need to either add a reference to the frame to SelserData or replace SelserData with a reference to the frame. Eventually, SerializerState::getOrigSrc() can be brought closer to WTUtils::getWTSource() by getting the original text from the frame instead of from SelserData.
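
A rough sketch of the first option (SelserData keeps a reference to the frame) might look like the following. It is written in Python purely for illustration and is not Parsoid code: the field names, new_child(), and the (start, end) offset signature of get_orig_src() are all assumptions made for this example, and only the class names echo the ones mentioned above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Frame:
    """Hypothetical stand-in for a parser frame: one level of "recursion"."""
    src_text: str                     # source text used at this level
    parent: Optional["Frame"] = None  # enclosing frame; None at the top level

    def new_child(self, src_text: str) -> "Frame":
        # Recursing into e.g. <poem> creates a child frame with its own source.
        return Frame(src_text=src_text, parent=self)


@dataclass
class SelserData:
    """Hypothetical: selser data that points at a frame instead of owning text."""
    frame: Frame


class SerializerState:
    """Hypothetical, heavily reduced serializer state."""

    def __init__(self, selser_data: SelserData) -> None:
        self.selser_data = selser_data

    def get_orig_src(self, start: int, end: int) -> Optional[str]:
        # The original text comes from the frame, so offsets are always
        # interpreted against the source of the current recursion level.
        src = self.selser_data.frame.src_text
        if 0 <= start <= end <= len(src):
            return src[start:end]
        return None


# Usage: the same offsets resolve differently depending on the active frame.
top = Frame("Intro\n<poem>roses are red</poem>\n")
poem = top.new_child("roses are red")
top_state = SerializerState(SelserData(top))
poem_state = SerializerState(SelserData(poem))
```

With this shape, recursing into an extension tag only requires swapping the frame, and the lookup in get_orig_src() stays unchanged.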

Event Timeline

Change 540620 had a related patch set uploaded (by C. Scott Ananian; owner: C. Scott Ananian):
[mediawiki/services/parsoid@master] Use SourceRange objects in Html2Wt serializer code

https://gerrit.wikimedia.org/r/540620

ssastry triaged this task as Medium priority. Mar 8 2020, 1:01 AM