Description
We got a bug report on German Wikipedia, starting off with this diff. Apparently, while the user was editing something else, VE modified some of the references. The second reference in the example diff contains the web archive template (Webarchiv), which was used incorrectly in this case: the text that the template automatically links to the archive contained a wikilink (via the enS template), a classic URL–wikilink conflict. VE seems to stumble over this template/link nesting and essentially duplicates the references with a substituted version of the web archive template, adding a lot of span tags, all inside the refs. I have since fixed the incorrect use of the template, but why does VE behave like this? (I could easily reproduce the behaviour.) IMO, VE should not change anything here, since the fix in this case can only come from the user, and duplicating the reference only makes things worse.
Related Objects
- Mentioned In
  - T242336: Using VE to add a link to text that includes a template with <ref>, adds an unnecessary HTML blob
  - T209493: VE is transforming citation templates into formatted text with "cite class" tags (when copying a reference defined within template-generated reflist)
- Mentioned Here
  - T202905: Outreach-17 Project: Add a new Linter Category: Links-in-Links
Event Timeline
This seems to be a Parsoid issue. Here's a reduced test case: https://de.wikipedia.org/w/index.php?title=Benutzer:Matma_Rex/sandbox&diff=184320074&oldid=184320059&diffmode=source
Parsoid rendering of the previous revision: https://de.wikipedia.org/api/rest_v1/page/html/Benutzer%3AMatma_Rex%2Fsandbox/184320059
<p id="mwAg">Heath Lowry: <a rel="mw:ExtLink" class="external text" href="https://web.archive.org/web/20050106081407/http://www.tetedeturc.com/home/rubrique.php3?id_rubrique=27#sommaire" about="#mwt1" typeof="mw:Transclusion" data-mw='{"parts":[{"template":{"target":{"wt":"Webarchiv ","href":"./Vorlage:Webarchiv"},"params":{"url":{"wt":"http://www.tetedeturc.com/home/rubrique.php3?id_rubrique=27#sommaire"},"wayback":{"wt":"20050106081407"},"text":{"wt":"{{enS|‘‘The story behind Ambassador Morgenthau’s Story‘‘}}"}},"i":0}}]}' id="mwAw"><span style="font-style:normal;font-weight:normal" about="#mwt1"></span></a><a rel="mw:WikiLink" href="./Englische_Sprache" title="Englische Sprache" id="mwBA">englisch</a><span about="#mwt1" id="mwBQ"> </span><span lang="en-Latn" style="font-style:italic" about="#mwt1" id="mwBg">‘‘The story behind Ambassador Morgenthau’s Story‘‘</span><span about="#mwt1" id="mwBw"> (</span><a rel="mw:WikiLink" href="./Web-Archivierung#Begriffsbestimmung" title="Web-Archivierung" about="#mwt1" id="mwCA">Memento</a><span about="#mwt1" id="mwCQ"> vom 6. Januar 2005 im </span><i about="#mwt1" id="mwCg"><a rel="mw:WikiLink" href="./Internet_Archive" title="Internet Archive" id="mwCw">Internet Archive</a></i><span about="#mwt1" id="mwDA">)</span><span style="display:none;" about="#mwt1" id="mwDQ"></span></p>
The about="#mwt1" attribute (which indicates that this element is generated by a template) is missing on some of the elements. This causes VE to treat them as if they were normal text rather than part of the template, which in turn causes them to be duplicated when saving.
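To make the gap easier to spot, here is a minimal Python sketch (not part of Parsoid or VE) that lists the direct children of the paragraph that sit between template-generated siblings but lack the expected about attribute. The names AboutGapFinder and find_about_gaps are hypothetical, and SAMPLE is a shortened version of the markup above with data-mw and most attributes dropped; it assumes well-formed input and only looks at the top-level children of the root element.

```python
from html.parser import HTMLParser

# Void elements have no closing tag, so they must not change the depth.
VOID = {"br", "img", "hr", "meta", "link", "input", "wbr"}


class AboutGapFinder(HTMLParser):
    """Records (id, has_about) for every direct child of the root element."""

    def __init__(self, about_id):
        super().__init__()
        self.about_id = about_id
        self.depth = 0
        self.children = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if self.depth == 1:  # direct child of the root (<p> here)
            has_about = attrs.get("about") == self.about_id
            self.children.append((attrs.get("id"), has_about))
        if tag not in VOID:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag not in VOID:
            self.depth -= 1


def find_about_gaps(html, about_id):
    """Return ids of siblings inside the template's range that lack about_id."""
    finder = AboutGapFinder(about_id)
    finder.feed(html)
    flags = [has for _, has in finder.children]
    if True not in flags:
        return []
    first = flags.index(True)
    last = len(flags) - 1 - flags[::-1].index(True)
    return [eid for eid, has in finder.children[first:last + 1] if not has]


# Shortened, hypothetical reduction of the Parsoid output quoted above.
SAMPLE = (
    '<p id="mwAg">Heath Lowry: '
    '<a rel="mw:ExtLink" about="#mwt1" typeof="mw:Transclusion" id="mwAw">'
    '<span about="#mwt1"></span></a>'
    '<a rel="mw:WikiLink" href="./Englische_Sprache" id="mwBA">englisch</a>'
    '<span about="#mwt1" id="mwBQ"> </span>'
    '<i about="#mwt1" id="mwCg">'
    '<a rel="mw:WikiLink" href="./Internet_Archive" id="mwCw">Internet Archive</a></i>'
    '<span about="#mwt1" id="mwDA">)</span>'
    '</p>'
)

print(find_about_gaps(SAMPLE, "#mwt1"))  # ['mwBA']
```

On the reduced sample this reports only mwBA, the "englisch" wikilink. The nested mwCw link is not reported because its parent element already carries the attribute.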
Yes, using wikilinks inside extlinks can cause this. Work has begun on a new linter category (T202905: Outreach-17 Project: Add a new Linter Category: Links-in-Links) to identify these usages so that editors can fix them.
Parsoid tries to avoid dirty diffs for such links, but in this case, where a template produces the nesting, it appears that Parsoid is not doing a good enough job. We'll investigate this issue.
This may have been fixed since then: https://www.mediawiki.org/w/index.php?title=User:SSastry_(WMF)/Sandbox&type=revision&diff=3847352&oldid=3847351&diffmode=source I tried hard to break this locally on the command line and couldn't, so presumably this is no longer an issue. In any case, we also have the linter category to flag this to editors. I am going to close this; if you find a reproducible case, feel free to reopen.