From visualdiff testing:
- Pages
Screenshots:
- Legacy:
- Parsoid:
- Diff
From a quick look, this comes from the Header template Lua module:
https://fr.wikisource.org/wiki/Module:Header_template#L-56
Jgiannelos, Oct 31 2025, 12:41 PM
Files (each Oct 31 2025, 12:41 PM): F68830400: image.png, F68829943: image.png, F68829863: image.png
From what I understand, Parsoid is escaping <link> elements, as expected. Those links are injected by the Module:Header_template Lua script to add schema.org semantics.
I think ideally the microdata shouldn't be injected as hidden links but instead added to existing <div>/<span> elements.
I disagree here: most websites do include this data in link tags. Beyond that, a fair number of scripts rely on these tags being present to find Wikidata IDs and similar data, so updating everything to use <div> and <span> tags would be a much more significant undertaking.
Change #1215585 had a related patch set uploaded (by Jgiannelos; author: Jgiannelos):
[mediawiki/services/parsoid@master] Implement sanitization of meta/link tags with schema.org microdata
@Jgiannelos are there any blockers on this? This affects a reader-facing component of Wikisource (the book download functionality), and having it broken for multiple weeks is less than ideal.
Change #1225613 had a related patch set uploaded (by Jgiannelos; author: Jgiannelos):
[operations/mediawiki-config@master] ProofreadPage: Disable flag to render using parsoid temporarily
Change #1225613 merged by jenkins-bot:
[operations/mediawiki-config@master] ProofreadPage: Disable flag to render using parsoid temporarily
Mentioned in SAL (#wikimedia-operations) [2026-01-13T14:08:23Z] <zabe@deploy2002> Started scap sync-world: Backport for [[gerrit:1225613|ProofreadPage: Disable flag to render using parsoid temporarily (T408915)]]
Mentioned in SAL (#wikimedia-operations) [2026-01-13T14:12:42Z] <zabe@deploy2002> jgiannelos, zabe: Backport for [[gerrit:1225613|ProofreadPage: Disable flag to render using parsoid temporarily (T408915)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
Mentioned in SAL (#wikimedia-operations) [2026-01-13T14:22:07Z] <zabe@deploy2002> Finished scap sync-world: Backport for [[gerrit:1225613|ProofreadPage: Disable flag to render using parsoid temporarily (T408915)]] (duration: 13m 44s)
https://en.wikipedia.org/w/index.php?title=User:Cscott/T408915&useparsoid=0 vs https://en.wikipedia.org/w/index.php?title=User:Cscott/T408915&useparsoid=1 is a good test case.
Parsoid never allows <link> or <meta> as wikitext; the core sanitizer does, but only in certain cases.
See also T78055: Add parsertests for Parsoid <meta>/<link> tags. (which may explain why it is hard to write parser tests for this), and T25932: Allow use of semantic HTML5 elements in wikitext (which seems to credit Parsoid with handling <meta>/<link> correctly, when we apparently don't).
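To make "only in certain cases" concrete, here is a rough Python sketch of the kind of rule involved. It assumes the condition is that the tag carries `itemprop`, plus `href` for <link> or `content` for <meta>. That condition is my reading of the core sanitizer and is not stated in this thread, so it should be checked against the source. The function names are hypothetical and are neither Parsoid nor core API.

```python
from html import escape

# Assumption: attribute each tag needs in addition to itemprop.
REQUIRED_ATTR = {"meta": "content", "link": "href"}


def is_allowed_microdata_tag(name, attrs):
    """Return True if a <meta>/<link> tag looks like schema.org microdata.

    Hypothetical rule: the tag must have itemprop, and also href (link)
    or content (meta). Any other tag name is rejected by this check.
    """
    name = name.lower()
    if name not in REQUIRED_ATTR:
        return False
    return "itemprop" in attrs and REQUIRED_ATTR[name] in attrs


def render(name, attrs):
    """Emit the tag as HTML if allowed, otherwise as escaped text."""
    attr_text = "".join(f' {k}="{escape(v)}"' for k, v in attrs.items())
    tag = f"<{name}{attr_text}/>"
    return tag if is_allowed_microdata_tag(name, attrs) else escape(tag)


print(render("link", {"itemprop": "url", "href": "https://example.org/"}))
print(render("link", {"rel": "stylesheet", "href": "x.css"}))
```

Under this assumed rule, the first call keeps the tag as markup and the second comes out as escaped text, which is the visible symptom reported on the affected pages.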
Apparently Parsoid escapes LINK tags sometimes, and sometimes it doesn't. Compare two very similar books:
I think this is a matter of caching. After https://phabricator.wikimedia.org/T408915#11516718 I reverted the flag that enables Parsoid for ProofreadPage rendering until we have a proper solution. I purged the page and it renders the links properly now.