
References incorrectly nested in an internal link are lost during publishing
Open, Needs Triage | Public | BUG REPORT

Description

What happens?: After hitting "Publish", some or all of the references on the page are deleted. This occurs when an editor has placed a reference inside a wikilink (e.g., [[dogs<ref=123/>]]), as can be seen on a historic version of the FIM Long Track Youth World Cup page.

Steps to replicate the issue (include links if applicable):

  1. Using VisualEditor, remove the wikilink but leave the inline citation untouched. After the wikilink is removed, the footnote should still be visible in the editor.
  2. Publish the page.
  3. At this point, the reference near the wikilink may have disappeared.

What should have happened instead?: References should remain on the page after publishing. If they are going to be deleted, they should at least not still be visible in the editor before publishing.

Event Timeline

matmarex renamed this task from References deleted during publishing to References incorrectly nested in an internal link are lost during publishing. Dec 4 2023, 6:54 PM
matmarex added a project: Parsoid.
matmarex subscribed.

It's not only removing the wikilink that causes the issue, but any change to the same table cell (like adding text or changing the link label).

Minimal test case is just [[a|b<ref>c</ref>]]: https://en.wikipedia.beta.wmflabs.org/wiki/T352624 – just about any edit to that page causes the <ref> to be lost.

(on that page, I also see an exposed strip marker in the old parser, which isn't the case in the original Wikipedia example – interesting)

$  echo "[[a|b<ref>c</ref>]]" | php bin/parse.php --wt2wt
[[a|b]]<ref>c</ref>

<references />

But yes, this is just fallout from trying to embed A-tags (introduced by the reference) inside A-tags (introduced by the wikilink). Parsoid has some code in place to detect link-in-link scenarios, but that analysis runs before ref-tags are processed. So, unless this and T289491 are *not* edge cases, my preference is to just have T303409 flag them as broken wikitext that should be fixed up instead of trying to add complexity inside Parsoid to detect and recover from them.
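
For anyone who wants to find affected pages before they are edited, here is a rough sketch of the kind of check described above: scan raw wikitext and report every <ref> tag that is opened while a [[...]] link is still open. This is not Parsoid code and not what T303409 implements. The function name find_refs_inside_wikilinks is hypothetical, and the scanner assumes simplified wikitext: it ignores <nowiki>, HTML comments, and templates, and it only knows the English "File:" and "Image:" prefixes, which it skips because references in image captions are legitimate.

```python
import re

# Hypothetical helper for illustration only; not part of Parsoid or MediaWiki.
# Matches an opening or self-closing <ref ...> tag, but not </ref> or <references />.
REF_OPEN = re.compile(r"<ref\b[^>]*>", re.IGNORECASE)

# Assumption: only the English media namespace prefixes are recognised.
MEDIA_PREFIXES = ("file:", "image:")


def find_refs_inside_wikilinks(wikitext):
    """Return (link_start, ref_start) offsets for each <ref> opened inside [[...]]."""
    hits = []
    link_starts = []  # stack of offsets of currently open "[["
    i = 0
    while i < len(wikitext):
        if wikitext.startswith("[[", i):
            link_starts.append(i)
            i += 2
        elif wikitext.startswith("]]", i) and link_starts:
            link_starts.pop()
            i += 2
        else:
            match = REF_OPEN.match(wikitext, i)
            if match is None:
                i += 1
                continue
            if link_starts:
                innermost = link_starts[-1]
                target = wikitext[innermost + 2:].lstrip().lower()
                # Captions of media links may contain references, so skip those.
                if not target.startswith(MEDIA_PREFIXES):
                    hits.append((innermost, i))
            i = match.end()
    return hits
```

On the minimal test case, find_refs_inside_wikilinks("[[a|b<ref>c</ref>]]") returns [(0, 5)], while the round-tripped form [[a|b]]<ref>c</ref> yields an empty list. A real lint rule would have to work on the parsed document rather than on raw text, so treat this only as a quick way to grep a dump for candidates.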