
Parsoid sometimes nests the annotation tags when round-tripping wikitext
Closed, Resolved | Public | BUG REPORT

Description

Small reproducer:

$ echo -e '<translate>\ntext1</translate><translate>\ntext2</translate>' |php bin/parse.php --wt2wt
<translate>
text1<translate>
text2</translate>
</translate>

We'd expect

<translate>
text1</translate><translate>
text2</translate>

as a result of this parsing.

For the <translate> annotation in particular, nesting annotations is not allowed; trying to save the page after modifying it would (depending on the edit) display error messages.
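To check a round-trip result for this kind of nesting, a minimal sketch along the following lines may help. It is not part of Parsoid: the helper names `max_translate_depth` and `has_nested_translate` are hypothetical, and the sketch assumes bare `<translate>` and `</translate>` tags only, ignoring `<nowiki>`, comments, and any other context in which the tags would not act as annotations.

```python
import re

# Hypothetical helper, not Parsoid code: matches bare opening and closing
# <translate> tags only.
_TRANSLATE_TAG = re.compile(r"<(/?)translate>")


def max_translate_depth(wikitext: str) -> int:
    """Return the deepest nesting level of <translate> tags in wikitext."""
    depth = 0
    deepest = 0
    for match in _TRANSLATE_TAG.finditer(wikitext):
        if match.group(1):
            # Closing tag: step out one level, never going below zero.
            depth = max(depth - 1, 0)
        else:
            # Opening tag: step in one level and record the maximum seen.
            depth += 1
            deepest = max(deepest, depth)
    return deepest


def has_nested_translate(wikitext: str) -> bool:
    """True if any <translate> range opens inside another one."""
    return max_translate_depth(wikitext) > 1


# Output reported above versus the expected output.
actual = "<translate>\ntext1<translate>\ntext2</translate>\n</translate>"
expected = "<translate>\ntext1</translate><translate>\ntext2</translate>"

print(has_nested_translate(actual))    # True
print(has_nested_translate(expected))  # False
```

Piping the output of `parse.php --wt2wt` into a check like this would flag the nested result from the reproducer while accepting the expected one.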

This issue can also be shown by the following example:

$ echo "a ''<translate>b'' c ''d </translate> e <translate>f</translate>'' g" |/usr/bin/php /home/isa-wmf/gitrepo/parsoid/bin/parse.php --wt2wt
a <translate>''b'' c ''d  e <translate>f</translate>''</translate> g

which may be useful as a reproducer that doesn't get confusing around paragraph boundaries.

Event Timeline

Arlolra triaged this task as Medium priority. (Nov 22 2021, 6:02 PM)
Arlolra moved this task from Needs Triage to Bugs & Crashers on the Parsoid board.

Change 740914 had a related patch set uploaded (by Isabelle Hurbain-Palatin; author: Isabelle Hurbain-Palatin):

[mediawiki/services/parsoid@master] Don't generate nested annotated ranges in HTML output

https://gerrit.wikimedia.org/r/740914

Change 740914 merged by jenkins-bot:

[mediawiki/services/parsoid@master] Don't generate nested annotated ranges in HTML output

https://gerrit.wikimedia.org/r/740914

Change 742570 had a related patch set uploaded (by Sbailey; author: Sbailey):

[mediawiki/vendor@master] Bump Parsoid to 0.15.0-a11

https://gerrit.wikimedia.org/r/742570

Change 742570 merged by jenkins-bot:

[mediawiki/vendor@master] Bump Parsoid to 0.15.0-a11

https://gerrit.wikimedia.org/r/742570