Generalize foster parented content detection in early DOM postprocessor pass
Many DOM passes depend on accurate identification of foster-parented content. We have already implemented some detection in dom.markFosteredContent.js, but we also still depend on a hack (formerly a convenient bug) in the HTML5 treebuilder that disables fostering for meta tags.

It would be great if we could generalize and improve the existing algorithm so that

  • it can be run as a first pass on the DOM,
  • its marking of fostered content can be relied upon by all other DOM passes,
  • it properly detects fostering of pure text, and
  • we can remove the no-fostering-for-metas hack from the HTML5 treebuilder.

Fostered text detection can probably be addressed with this trick:

For each <table> TagTk, we can prepend a <meta typeof="mw:FosterMarker"> SelfclosingTagTk just before adding tagId sequence numbers and feeding the tokens to the treebuilder. This will then create a 'fostering box' in the DOM:

<meta typeof="mw:FosterBox" data-parsoid="{tagId: 3}"/>
potentially fostered content
<table data-parsoid="{tagId: 4}">..</table>

Fostered element content will have higher tagIds than both the meta and the table.
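
A minimal sketch of the token-stream step described above, to make the tagId ordering concrete. It assumes a simplified, hypothetical token shape ({ type, name, attribs, dataAttribs }) rather than Parsoid's real TagTk / SelfclosingTagTk classes, and the function name addFosterMarkers is made up for illustration. The marker type string is also an assumption: the description above uses both mw:FosterMarker and mw:FosterBox, so it is kept in one constant.

```javascript
// Assumed marker name; the task text uses both mw:FosterMarker and mw:FosterBox.
const FOSTER_MARKER_TYPE = 'mw:FosterBox';

// Hypothetical helper: inserts a marker meta before every <table> start tag
// and then numbers all start / self-closing tags in stream order.
function addFosterMarkers(tokens) {
  let nextTagId = 1;
  const out = [];
  for (const tk of tokens) {
    const isTag = tk.type === 'TagTk' || tk.type === 'SelfclosingTagTk';
    if (tk.type === 'TagTk' && tk.name === 'table') {
      // The marker is numbered first, so its tagId is one below the table's.
      out.push({
        type: 'SelfclosingTagTk',
        name: 'meta',
        attribs: { typeof: FOSTER_MARKER_TYPE },
        dataAttribs: { tagId: nextTagId++ },
      });
    }
    if (isTag) {
      // Copy the token instead of mutating the caller's object.
      out.push({ ...tk, dataAttribs: { ...tk.dataAttribs, tagId: nextTagId++ } });
    } else {
      // Text and end-tag tokens pass through without a tagId.
      out.push(tk);
    }
  }
  return out;
}
```

Any element token that appears inside the table in the source is numbered after both the marker and the table, which is the property the detection pass would rely on.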

A complication we should ignore for now is cases like <table><meta><table>..; let's tackle those rare edge cases later.

The goal is to mark all fostered content with data.parsoid.fostered. Fostered text nodes need to be wrapped in a span for this, since a text node cannot carry attributes itself. The extra meta tags for fostering detection should be stripped so that they don't interfere with later passes.
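
The marking step could look roughly like the sketch below. This is not the code in dom.markFosteredContent.js; it runs on a hypothetical plain-object tree ({ nodeName, attrs, dataParsoid, children }, with text nodes as { nodeName: '#text', value }) so that it is self-contained, and the names markFostered, toFostered and MARKER_TYPE are invented for illustration. It assumes that every node sitting between a marker meta and the next <table> sibling is fostered, and it leaves nodes unmarked when a marker has no matching table sibling (the nested edge cases set aside above).

```javascript
const MARKER_TYPE = 'mw:FosterBox'; // assumed marker name

function isMarker(node) {
  return node.nodeName === 'meta' && !!node.attrs && node.attrs.typeof === MARKER_TYPE;
}

// Mark an element as fostered, or wrap a text node in a fostered span.
function toFostered(node) {
  if (node.nodeName === '#text') {
    return { nodeName: 'span', attrs: {}, dataParsoid: { fostered: true }, children: [node] };
  }
  node.dataParsoid = { ...node.dataParsoid, fostered: true };
  return node;
}

// Rewrites parent.children in place: strips marker metas and marks the
// contents of each "fostering box" (the siblings between marker and table).
function markFostered(parent) {
  const result = [];
  let box = null; // siblings collected since the last marker, or null
  for (const child of parent.children || []) {
    if (isMarker(child)) {
      if (box) result.push(...box); // earlier marker had no table: keep unmarked
      box = [];
      continue; // the marker itself is dropped
    }
    if (child.children) markFostered(child); // handle nested content first
    if (box === null) {
      result.push(child);
    } else if (child.nodeName === 'table') {
      result.push(...box.map(toFostered), child);
      box = null;
    } else {
      box.push(child);
    }
  }
  if (box) result.push(...box); // marker without a following table sibling
  parent.children = result;
  return parent;
}
```

A real pass would work on DOM nodes and could additionally compare tagIds of boxed elements against the table's tagId as a sanity check; text nodes carry no tagId, so their position inside the box is the only evidence available for them.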

Event Timeline

bzimport raised the priority of this task to High. Nov 22 2014, 2:03 AM
bzimport added a project: Parsoid-DOM.
bzimport set Reference to bz53110.

A clarification: the hack (convenient bug) in the HTML5 treebuilder that disables fostering of meta tags is used by a different pass (markTreeBuilderFixups) and is independent of the task in this bug, which is the accurate detection of fostered tags. Consequently, step 4 (removing the no-fostering-for-metas hack) can be implemented separately from the task here. We can create a new bug for it and outline the problems and requirements there.

The fourth step (re-enabling foster-parenting) will definitely require more work than just implementing fostering detection, but it should be significantly easier once reliable fostering info is available. Let's create a separate bug for that once we get close to tackling it.

Change 80675 had a related patch set uploaded by Arlolra:
WIP: Generalize foster parented content detection

Change 80675 merged by jenkins-bot:
Generalize foster parented content detection