Templates & wikilink-in-extlink scenarios: Parsoid breaks about-id continuity
Closed, Resolved, Public

Description

We received a bug report on German Wikipedia, starting with this diff. While the user was editing something else, VE modified some of the references. The second reference in the example diff contains the web archive template, which was used incorrectly in this case: the text that is automatically linked to the archive contained a wikilink (via the template enS), a classic URL–wikilink conflict. VE seems to stumble over this template/link nesting and essentially duplicates the references with a substed version of the web archive template, adding many span tags, all inside the refs. I have since fixed the incorrect use of the template, but why does VE behave this way? (I could easily reproduce the behaviour.) IMO, VE should not change anything here, since the fix can only come from the user, and duplicating the reference only makes things worse.

Event Timeline

JTannerWMF subscribed.

We will investigate this ticket.

matmarex subscribed.

This seems to be a Parsoid issue. Here's a reduced test case: https://de.wikipedia.org/w/index.php?title=Benutzer:Matma_Rex/sandbox&diff=184320074&oldid=184320059&diffmode=source

Parsoid rendering of the previous revision: https://de.wikipedia.org/api/rest_v1/page/html/Benutzer%3AMatma_Rex%2Fsandbox/184320059

<p id="mwAg">Heath Lowry: <a rel="mw:ExtLink" class="external text" href="https://web.archive.org/web/20050106081407/http://www.tetedeturc.com/home/rubrique.php3?id_rubrique=27#sommaire" about="#mwt1" typeof="mw:Transclusion" data-mw="{&quot;parts&quot;:[{&quot;template&quot;:{&quot;target&quot;:{&quot;wt&quot;:&quot;Webarchiv &quot;,&quot;href&quot;:&quot;./Vorlage:Webarchiv&quot;},&quot;params&quot;:{&quot;url&quot;:{&quot;wt&quot;:&quot;http://www.tetedeturc.com/home/rubrique.php3?id_rubrique=27#sommaire&quot;},&quot;wayback&quot;:{&quot;wt&quot;:&quot;20050106081407&quot;},&quot;text&quot;:{&quot;wt&quot;:&quot;{{enS|‘‘The story behind Ambassador Morgenthau’s Story‘‘}}&quot;}},&quot;i&quot;:0}}]}" id="mwAw"><span style="font-style:normal;font-weight:normal" about="#mwt1"></span></a><a rel="mw:WikiLink" href="./Englische_Sprache" title="Englische Sprache" id="mwBA">englisch</a><span about="#mwt1" id="mwBQ"> </span><span lang="en-Latn" style="font-style:italic" about="#mwt1" id="mwBg">‘‘The story behind Ambassador Morgenthau’s Story‘‘</span><span about="#mwt1" id="mwBw"> (</span><a rel="mw:WikiLink" href="./Web-Archivierung#Begriffsbestimmung" title="Web-Archivierung" about="#mwt1" id="mwCA">Memento</a><span about="#mwt1" id="mwCQ">  vom 6. Januar 2005 im </span><i about="#mwt1" id="mwCg"><a rel="mw:WikiLink" href="./Internet_Archive" title="Internet Archive" id="mwCw">Internet Archive</a></i><span about="#mwt1" id="mwDA">)</span><span style="display:none;" about="#mwt1" id="mwDQ"></span></p>

The about="#mwt1" attribute (which indicates that this element is generated by a template) is missing on some of the elements. This causes VE to treat them as if they were normal text rather than part of the template, which in turn causes them to be duplicated when saving.
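The continuity problem described above can be checked mechanically. The following is a minimal sketch, not Parsoid or VE code: it assumes that continuity is judged on the direct children of the enclosing element, and the names `TopLevelChildren` and `about_gaps` are hypothetical helpers introduced only for this illustration. It uses Python's standard `html.parser` to list the siblings that sit between the first and last element carrying a given `about` id but do not carry that id themselves.

```python
from html.parser import HTMLParser

# HTML void elements never get a closing tag, so they must not change depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class TopLevelChildren(HTMLParser):
    """Record (tag, id, about) for each direct child of the root element."""

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.children = []

    def handle_starttag(self, tag, attrs):
        if self.depth == 1:  # we are directly inside the root element
            a = dict(attrs)
            self.children.append((tag, a.get("id"), a.get("about")))
        if tag not in VOID_TAGS:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS:
            self.depth -= 1


def about_gaps(fragment, about_id):
    """Return the direct children that interrupt a run of `about_id` siblings.

    Assumption for this sketch: every element sibling between the first and
    the last element tagged with `about_id` should carry the same id.
    """
    parser = TopLevelChildren()
    parser.feed(fragment)
    parser.close()
    abouts = [about for _, _, about in parser.children]
    if about_id not in abouts:
        return []
    first = abouts.index(about_id)
    last = len(abouts) - 1 - abouts[::-1].index(about_id)
    return [c for c in parser.children[first:last + 1] if c[2] != about_id]


# Shortened fragment modelled on the Parsoid output quoted above.
sample = (
    '<p id="mwAg">Heath Lowry: '
    '<a rel="mw:ExtLink" about="#mwt1" typeof="mw:Transclusion" id="mwAw">'
    '<span about="#mwt1"></span></a>'
    '<a rel="mw:WikiLink" href="./Englische_Sprache" id="mwBA">englisch</a>'
    '<span about="#mwt1" id="mwBQ"> </span>'
    '</p>'
)

print(about_gaps(sample, "#mwt1"))
```

On the shortened fragment, the only element reported is the `mw:WikiLink` anchor with id `mwBA`, which is the element that VE would treat as ordinary text.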

Yes, using wikilinks inside extlinks can cause this. Work has begun on a new linter category to identify these usages so that editors can fix them: T202905: Outreach-17 Project: Add a new Linter Category: Links-in-Links.

Parsoid tries to avoid dirty diffs in these cases, but it looks like it is not doing a good enough job when a template produces the nested link. We'll investigate this issue.

ssastry renamed this task from VE doubles reference with URL–wikilink conflict to Templates & wikilink-in-extlink scenarios: Parsoid breaks about-id continuity. Apr 22 2019, 2:01 PM
ssastry triaged this task as Medium priority.
ssastry removed subscribers: matmarex, JTannerWMF.
ssastry claimed this task.

This may have been fixed since then: https://www.mediawiki.org/w/index.php?title=User:SSastry_(WMF)/Sandbox&type=revision&diff=3847352&oldid=3847351&diffmode=source I tried hard to break this locally on the command line and couldn't, so presumably this is no longer an issue. In any case, we also have the linter category to flag this to editors. I am going to close this; if you find a reproducible case, feel free to reopen.