Line based p-wrapping can't match Remex
Closed, Duplicate · Public

Description

The PHP parser does p-wrapping in two ways: BlockLevelPass does line-based wrapping, and then Remex p-wraps on SAX events to catch unwrapped text that the PHP parser skipped because of the idiosyncrasies of the block level pass (see T134469: doBlockLevels() inserts <p> and </p> randomly with no regard for HTML validity).

Parsoid's p-wrapper matches the line-based wrapping pretty faithfully, but also tries to do Remex's top-level pass on the line, using firstBlockTokenType as a heuristic. The latter should be moved to a DOM pass, where there's better insight into when we're in a block.

See the FIXME added in https://gerrit.wikimedia.org/r/#/c/mediawiki/services/parsoid/+/436847/ for T194806
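
To make the proposal concrete, here is a rough sketch of what a top-level p-wrapping DOM pass could look like. It is not Parsoid's or Remex's actual code: the `Element` class, the `BLOCK_TAGS` set, and the `p_wrap_children` function are all hypothetical names made up for illustration, and the block tag list is only a small subset. The idea is that once the tree is built, block-ness can be read off each child directly instead of being guessed from the first block token on a line.

```python
from dataclasses import dataclass, field
from typing import List, Union

# Illustrative subset of block-level tag names (an assumption, not the real list).
BLOCK_TAGS = {"div", "p", "ul", "ol", "li", "table", "blockquote", "pre", "h1", "h2"}


@dataclass
class Element:
    """Hypothetical minimal DOM element; text nodes are modeled as plain str."""
    tag: str
    children: List["Node"] = field(default_factory=list)


Node = Union[Element, str]


def is_block(node: Node) -> bool:
    return isinstance(node, Element) and node.tag in BLOCK_TAGS


def p_wrap_children(parent: Element) -> None:
    """Wrap each maximal run of non-block children of `parent` in a <p>."""
    wrapped: List[Node] = []
    run: List[Node] = []

    def flush() -> None:
        # Only emit a <p> if the run holds something besides whitespace text;
        # whitespace-only runs are passed through unwrapped.
        if any(not (isinstance(n, str) and n.strip() == "") for n in run):
            wrapped.append(Element("p", list(run)))
        else:
            wrapped.extend(run)
        run.clear()

    for child in parent.children:
        if is_block(child):
            flush()  # a block child ends the current inline run
            wrapped.append(child)
        else:
            run.append(child)
    flush()
    parent.children = wrapped


def serialize(node: Node) -> str:
    """Serialize the toy tree without any escaping, for inspection only."""
    if isinstance(node, str):
        return node
    inner = "".join(serialize(c) for c in node.children)
    return f"<{node.tag}>{inner}</{node.tag}>"


body = Element("body", ["foo ", Element("b", ["bar"]), Element("div", ["baz"]), "qux"])
p_wrap_children(body)
print(serialize(body))
# prints <body><p>foo <b>bar</b></p><div>baz</div><p>qux</p></body>
```

This only handles direct children of one container. A real pass would also have to deal with block content nested inside inline elements, which this sketch ignores.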

Event Timeline

Arlolra created this task. Jun 26 2018, 2:26 PM
Restricted Application added a subscriber: Aklapper. Jun 26 2018, 2:26 PM
Arlolra triaged this task as Normal priority. Jun 26 2018, 2:26 PM
ssastry updated the task description. Jun 26 2018, 3:36 PM
ssastry added a subscriber: ssastry.

This ticket might actually be a duplicate of the older one.

Indeed it is.