Page MenuHomePhabricator

mw-empty-elt class isn't always added to empty p-tags in Parsoid
Closed, ResolvedPublic

Description

In some cases, Parsoid's mw-empty-elt tagging of empty p-tags runs too soon. This seems to show up a lot on frwiki pages. Look at Legacy vs Parosid There is an extra newline in the Parsoid output. If you inspect the relevant HTML, you will see that on the legacy side, you have the "mw-empty-elt" class added to the empty p-tag but not in Parsoid.

Turns out this is because Parsoid's 'handleEmptyElt' and 'finalCleanup' handlers run in the same DOM pass which runs as an in-order traversal. So, the handleEmptyElt runs on a p-tag before finalCleanup deletes elements in a p-tag making it empty. Even if you reorder the handlers, it doesn't work because of the in-order traversal. The p-tag is processed (and detected not-empty) before the finalCleanup processes the p-tag's children.

So, two solutions:
(a) Introduce post-order traversal for some DOM passes -- needs careful thinking how this works in the face of mutations
(b) Split finalCleanup and handleEmptyElt into two passes so the ordering works out

This is a source of a lot of minor whitespace diffs and visual diffing noise on frwiki pages (and not sure which other wikis)

Event Timeline

Change 984628 had a related patch set uploaded (by Subramanya Sastry; author: Subramanya Sastry):

[integration/visualdiff@master] Remove empty-p-tag adaptor code; Parsoid will fix this in T353821

https://gerrit.wikimedia.org/r/984628

Change 984628 merged by Arlolra:

[integration/visualdiff@master] Remove empty-p-tag adaptor code; Parsoid will fix this in T353821

https://gerrit.wikimedia.org/r/984628

The analysis in the description is not entirely right. Turns out that the handle empty elements pass was not setting mw-empty-elt class for template output wrappers!

ssastry triaged this task as Medium priority.

Change 1003581 had a related patch set uploaded (by Subramanya Sastry; author: Subramanya Sastry):

[mediawiki/services/parsoid@master] Ensure template wrappers get mw-empty-elt class

https://gerrit.wikimedia.org/r/1003581

Change 1003581 merged by jenkins-bot:

[mediawiki/services/parsoid@master] Ensure template wrappers get mw-empty-elt class

https://gerrit.wikimedia.org/r/1003581

Change 1004623 had a related patch set uploaded (by Isabelle Hurbain-Palatin; author: Isabelle Hurbain-Palatin):

[mediawiki/vendor@master] Bump wikimedia/parsoid to 0.19.0-a18

https://gerrit.wikimedia.org/r/1004623

Change 1004623 merged by jenkins-bot:

[mediawiki/vendor@master] Bump wikimedia/parsoid to 0.19.0-a18

https://gerrit.wikimedia.org/r/1004623