Content duplicated when editing when a block template is used in a wikilink
Closed, Duplicate (Public)

Description

Spotted by @ssastry when reviewing reply tool edits on dtcheck. Copying from Slack:

FYI: https://fr.wikipedia.org/w/index.php?title=Wikip%C3%A9dia%3ALe_Bistro%2F20_f%C3%A9vrier_2021&type=revision&diff=180175011&oldid=180167416

https://fr.wikipedia.org/w/index.php?title=Wikip%C3%A9dia%3ALe_Bistro%2F20_f%C3%A9vrier_2021&type=revision&diff=180178038&oldid=180176058 ... there is no obvious fix for this beyond linting and fixing the wikitext, fyi. I may go ahead and tweak that wikitext.

https://fr.wikipedia.org/w/index.php?title=Wikip%C3%A9dia%3ALe_Bistro%2F20_f%C3%A9vrier_2021&type=revision&diff=180178513&oldid=180178397

We might need to add more linting. But if someone feels bold and removes the {{center..}} from the wikilink, that will stop the corruption until we figure out how to handle this in Parsoid. The net effect is that the HTML5 tree builder splits the wikilink into 2 pieces because of the div-in-a nesting, and selser then dutifully serializes both pieces, causing the duplication.

Event Timeline

HTML5 tree builder actually allows <div> inside <a> – but it does not allow <div> inside <p>:

Input: <p>X<div>Y</div></p>
Output: <p>X</p><div>Y</div><p></p>

Input: <p><a href="http://example.com">X<div>Y</div></a></p>
Output: <p><a href="http://example.com">X</a></p><div><a href="http://example.com">Y</a></div><p></p>

This feels very similar to the link-inside-link scenario (T150196), and I know that Parsoid is able to roundtrip that one (although I'm not sure how it does it). Is the trick you use for that not applicable here?

Subbu wrote in IRC:

[[Bar|<div>x</div>]] is the test case.

It is tricky though because ... as bartosz pointed out, the problem is not div-in-link ... it is the p-wrapper around the link that causes the paragraph to split before the div.
so, that syntax is valid in a table cell, for ex.
or valid inside a <div>
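
For illustration, the splitting described above can be reproduced with a small toy model. This is only a sketch under stated assumptions, not Parsoid code and not a conforming HTML5 parser: the names ToyTreeBuilder and toy_parse are hypothetical, only <p>, <div> and <a> are modeled, and "a <p> is open anywhere on the stack" stands in for the spec's scope checks. It implements three rules: a <div> start tag closes any open <p>, formatting elements such as <a> that were closed implicitly are re-created before the next text node, and a stray </p> produces an empty <p></p>.

```python
from html.parser import HTMLParser


class Node:
    def __init__(self, tag, attrs=()):
        self.tag, self.attrs, self.children = tag, list(attrs), []

    def serialize(self):
        attrs = "".join(f' {k}="{v}"' for k, v in self.attrs)
        inner = "".join(
            c if isinstance(c, str) else c.serialize() for c in self.children
        )
        return f"<{self.tag}{attrs}>{inner}</{self.tag}>"


class ToyTreeBuilder(HTMLParser):
    """Toy model of three tree-builder rules; handles p, div and a only."""

    FORMATTING = {"a"}      # elements tracked as "active formatting elements"
    CLOSES_P = {"div", "p"}  # start tags that implicitly close an open <p>

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.root = Node("body")
        self.stack = [self.root]  # stack of open elements
        self.active = []          # active formatting elements

    def _insert(self, tag, attrs=()):
        node = Node(tag, attrs)
        self.stack[-1].children.append(node)
        self.stack.append(node)
        return node

    def _has_open(self, tag):
        # Simplification: ignores the spec's "button scope" boundaries.
        return any(n.tag == tag for n in self.stack[1:])

    def _pop_until(self, tag):
        while self.stack.pop().tag != tag:
            pass

    def _reconstruct(self):
        # Re-open formatting elements that were closed only implicitly.
        for i, node in enumerate(self.active):
            if node not in self.stack:
                self.active[i] = self._insert(node.tag, node.attrs)

    def handle_starttag(self, tag, attrs):
        if tag in self.CLOSES_P and self._has_open("p"):
            self._pop_until("p")  # this is where the paragraph gets split
        if tag in self.FORMATTING:
            self._reconstruct()
            self.active.append(self._insert(tag, attrs))
        else:
            self._insert(tag, attrs)

    def handle_data(self, data):
        self._reconstruct()
        self.stack[-1].children.append(data)

    def handle_endtag(self, tag):
        if tag in self.FORMATTING:
            match = next((n for n in reversed(self.active) if n.tag == tag), None)
            if match is not None:
                self.active.remove(match)
                if match in self.stack:
                    while self.stack.pop() is not match:
                        pass
            return
        if tag == "p" and not self._has_open("p"):
            self._insert("p")  # stray </p> yields an empty paragraph
        if self._has_open(tag):
            self._pop_until(tag)


def toy_parse(html):
    builder = ToyTreeBuilder()
    builder.feed(html)
    builder.close()
    return "".join(
        c if isinstance(c, str) else c.serialize() for c in builder.root.children
    )
```

Feeding the two inputs from the table above to toy_parse gives the outputs listed there, with the <a> duplicated into the <p> and the <div>. Replacing the outer <p> with a <div> wrapper, as in <div><a href="http://example.com">X<div>Y</div></a></div>, round-trips unchanged in this model, which matches the point that the p-wrapper, not the div-in-link nesting itself, triggers the split.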

Sbailey added a subscriber: Sbailey.

Given that we have run into this for the very first time after so many years, I am marking it as low priority since it is an edge case. But we'll fix it.

This feels very similar to the link-inside-link scenario (T150196), and I know that Parsoid is able to roundtrip that one (although I'm not sure how it does it). Is the trick you use for that not applicable here?

That works because Parsoid explicitly handles it via this complex logic. This bug is an instance of T165098 (unpackDOMFragments doesn't handle all mandatory content model constraints, i.e. those that are enforced by the HTML5 tree builder on an HTML5 parse), into which I am going to merge this task next. :-)

Looks like this was the 4th instance of this bug actually! So, maybe we ought to consider fixing this one of these days.