
Handling extension tags in HTML attributes (edge cases or otherwise)
Open, Low, Public

Description

T259676 is a production crasher. While https://gerrit.wikimedia.org/r/618565 fixes it, @Arlolra raised the question of why the extsrc property is missing in the first place.

Turns out that the wikitext in question is:

<i <ref>a</ref>>...

...</i>

The HTML i-tag is split across a newline, which breaks it across paragraph boundaries. The tree builder then fixes this up by duplicating the tag along with its HTML attribute, which happens to contain the ref-tag.
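To make the duplication concrete, here is a minimal toy sketch in Python. It is not Parsoid's code and not a real HTML5 tree builder; it only imitates the fixup described above. The names Element, reopen_across_paragraphs, and count_ref_attrs are hypothetical, and representing the ref-tag as an attribute key is an assumption made purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Element:
    """A toy DOM node: a tag name, its attributes, and child nodes."""
    tag: str
    attrs: dict = field(default_factory=dict)
    children: list = field(default_factory=list)


def reopen_across_paragraphs(open_tag: Element, text_blocks: list) -> list:
    """Wrap each text block in its own <p>, reopening open_tag inside it.

    Loosely imitates tree builder fixup: a formatting element that is
    still open at a paragraph boundary is cloned, attributes included,
    into the next paragraph.
    """
    paragraphs = []
    for text in text_blocks:
        # Copy the attribute dict so every clone owns its attributes.
        clone = Element(open_tag.tag, dict(open_tag.attrs), [text])
        paragraphs.append(Element("p", children=[clone]))
    return paragraphs


def count_ref_attrs(paragraphs: list) -> int:
    """Count cloned elements with an attribute name that mentions <ref>."""
    return sum(
        1
        for p in paragraphs
        for child in p.children
        if isinstance(child, Element) and any("<ref>" in k for k in child.attrs)
    )


# Assumption: the ref-tag ends up as an attribute name on the i-tag.
italic = Element("i", {"<ref>a</ref>": ""})
paragraphs = reopen_across_paragraphs(italic, ["...", "..."])
```

In this toy model, one i-tag in the source becomes two elements in the output, each carrying its own copy of the attribute with the ref-tag in it, so any code that expects exactly one occurrence per source ref-tag sees two.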

To be clear, this looks like broken wikitext and so doesn't merit a lot of attention on its own. But in terms of consistent handling of scenarios like these, there are two questions to answer here:

  1. What is a sensible way to handle extension tags in HTML attribute positions? Typed templates / typed wikitext offer a clear strategy in the future (i.e. enforce output constraints based on embedding context), but we need a solution before we get there.
  2. How do we handle tree builder fixup and HTML attributes of this nature?

I'll include a transcript of the IRC conversation in a comment below, but that conversation effectively raises the above 2 questions.

Event Timeline

ssastry renamed this task from Handling extension tags in HTML attributes (edge cases or oherwise) to Handling extension tags in HTML attributes (edge cases or otherwise). Aug 10 2020, 6:41 PM
ssastry triaged this task as Low priority.
ssastry moved this task from Needs Triage to Tech Debt / Big changes on the Parsoid board.

Change 654717 had a related patch set uploaded (by Arlolra; owner: Arlolra):
[mediawiki/services/parsoid@master] More papering over in References.php

https://gerrit.wikimedia.org/r/654717

Change 654717 merged by jenkins-bot:
[mediawiki/services/parsoid@master] More papering over in References.php

https://gerrit.wikimedia.org/r/654717

Change 655482 had a related patch set uploaded (by C. Scott Ananian; owner: C. Scott Ananian):
[mediawiki/vendor@master] Bump wikimedia/parsoid to 0.13.0-a22

https://gerrit.wikimedia.org/r/655482

Change 655482 merged by jenkins-bot:
[mediawiki/vendor@master] Bump wikimedia/parsoid to 0.13.0-a22

https://gerrit.wikimedia.org/r/655482