
Rendering diff on broken link with template (visual diff testing)
Open, Medium, Public, Bug Report

Description

Rendering diff between

https://en.wikivoyage.org/w/index.php?title=Cala_Millor&useparsoid=0
https://en.wikivoyage.org/w/index.php?title=Cala_Millor&useparsoid=1

Legacy

image.png (129×522 px, 22 KB)

Parsoid

image.png (135×524 px, 25 KB)

The corresponding wikitext is broken (missing a closing ] ).

[https://www.tib.org{{Dead link|date=January 2023 |bot=InternetArchiveBot }} TIB Line 412 runs to Cala Millor from [[Palma de Mallorca]] (several times a day, €9.80), and Manacor (€3.10); and from Porto Cristo, S'Illot, Sa Coma, Son Servera, and Costa dels Pins (€1.85). {{phone|+34 97 1177 777}}

Event Timeline

(in See, the text below Notodden church). Rendering is broken there anyway.

Same issue, yeah, but fixed now,
https://en.wikivoyage.org/w/index.php?title=Notodden&diff=4902481&oldid=4772239

(Note that this actually renders ok in Parsoid because the entire extlink is in the expanded output)

The corresponding wikitext is broken (missing a closing ] ).

Yes, but even if you were to try to fix it as [https://www.tib.org{{Dead link|date=January 2023 |bot=InternetArchiveBot }} TIB Line 412] you'd still have rendering differences in Parsoid. Parsoid tokenizes the template as part of the href and doesn't recover when the template expands to URL-terminating characters. Something similar happens with free external links as well.

This is described in T222328#5152953 and probably T207710.
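To illustrate the ordering problem described above, here is a toy Python sketch. It is not Parsoid's or the legacy parser's code. The regexes, the `TEMPLATES` table, and both function names are assumptions made up for this example, and the expansion of the template is a stand-in that only mimics the fact that it starts with a <sup>. What Parsoid actually emits for the href may differ; the sketch only shows why the order of "expand" and "find the end of the URL" matters.

```python
import re

# Toy rule: a URL ends at whitespace, quotes, square brackets, or angle brackets.
URL_RE = re.compile(r'https?://[^\s\[\]<>"]+')

# Toy tokenizer rule: a {{...}} directly attached to the URL is swallowed into it.
RAW_URL_RE = re.compile(r'https?://(?:\{\{[^{}]*\}\}|[^\s\[\]<>"{}])+')

# Hypothetical expansion; the real template output is more involved.
TEMPLATES = {"Dead link": "<sup>[dead link]</sup>"}
TEMPLATE_RE = re.compile(r"\{\{([^{}|]+)(?:\|[^{}]*)?\}\}")


def expand(text):
    """Replace each {{Name|...}} with its toy expansion (unknown names become empty)."""
    return TEMPLATE_RE.sub(lambda m: TEMPLATES.get(m.group(1).strip(), ""), text)


def href_expand_first(text):
    """Expand templates, then look for the end of the URL in the expanded text."""
    return URL_RE.search(expand(text)).group(0)


def href_tokenize_first(text):
    """Decide the URL extent on the raw wikitext, then expand without re-scanning."""
    token = RAW_URL_RE.search(text).group(0)
    return expand(token)


src = ("[https://www.tib.org{{Dead link|date=January 2023 "
       "|bot=InternetArchiveBot }} TIB Line 412]")
print(href_expand_first(src))    # https://www.tib.org
print(href_tokenize_first(src))  # https://www.tib.org<sup>[dead link]</sup>
```

In the first ordering the `<` from the expansion ends the URL. In the second, the URL extent was fixed before expansion, so the terminating character arrives too late to be noticed.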

The caveats for the template suggest using it in a way that would render the same in Parsoid,
https://en.wikivoyage.org/wiki/Template:Dead_link#Caveats

I imagine the dead link template is popular and maybe we should grep to see how strictly that's followed.
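A rough sketch of what such a grep could look like, in Python. The pattern and the function name `find_glued_dead_links` are assumptions for illustration: it flags only the shape seen in this task, a URL running straight into {{Dead link with no separator, and it approximates URL characters rather than following the parser's exact rules. The sample lines are made up, and the second one is just an assumed example of a separated placement, not a quote from the caveats page.

```python
import re

# A URL (toy approximation of its characters) immediately followed by {{Dead link
GLUED_DEAD_LINK = re.compile(r'https?://[^\s\[\]<>"{}]+\{\{\s*[Dd]ead link\b')


def find_glued_dead_links(wikitext):
    """Return the lines where a URL runs directly into {{Dead link}}."""
    return [line for line in wikitext.splitlines() if GLUED_DEAD_LINK.search(line)]


sample = (
    "[https://www.tib.org{{Dead link|date=January 2023}} TIB Line 412]\n"
    "[https://example.com Example]{{Dead link|date=January 2023}}\n"
)
for hit in find_glued_dead_links(sample):
    print(hit)
```

Run over a dump of page wikitext, this would give a first estimate of how often the template is glued to a URL; redirects or aliases of the template name would need their own patterns.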

ABreault-WMF triaged this task as Medium priority.
ABreault-WMF moved this task from Backlog to In Progress on the Content-Transform-Team-WIP board.

I don’t think T369997 is the same as this: there, the wikitext isn’t broken IMO (link after link without any separation), while here it’s clearly broken (unclosed link).

I don’t think T369997 is the same as this

From T368724#9957753,

even if you were to try to fix it ... you'd still have rendering differences in Parsoid. Parsoid tokenizes the template as part of the href and doesn't recover when it expands url terminating chars.

Change #1055313 had a related patch set uploaded (by Arlolra; author: Arlolra):

[mediawiki/services/parsoid@master] [WIP] Extlink with templated url terminating chars

https://gerrit.wikimedia.org/r/1055313

I imagine the dead link template is popular and maybe we should grep to see how strictly that's followed.

This is broken wikitext (for the dead link template). It should at a minimum have a space after the URL as with every other expectation of how external links work. Better, it shouldn't be in the link at all in this form, as the template emits a link itself (assuming this is a copy-paste of the English Wikipedia template). If Parsoid can see through the template expansion here, it will/should perhaps register this issue as a link-in-link error. (Or maybe that category wouldn't catch this case....)

So, categorically, the source wikitext is broken. From that perspective, I think I prefer the Parsoid version of the final rendering if we want to ignore the issue here.

The broader issue of [https://example.com{{template}} example text] is still perhaps worth fixing in the meantime. I think it's plausible {{template}} could hold text that should be part of the URL, but I can't think of many such cases outside of pages already intended to be transcluded.

I don’t think T369997 is the same as this

From T368724#9957753,

even if you were to try to fix it ... you'd still have rendering differences in Parsoid. Parsoid tokenizes the template as part of the href and doesn't recover when it expands url terminating chars.

First, this wasn’t in the description, so it’s unclear if it’s in scope. Second,

Yes, but even if you were to try to fix it as [https://www.tib.org{{Dead link|date=January 2023 |bot=InternetArchiveBot }} TIB Line 412] […]

if you fix it that way, it’s still clearly broken (link-in-link). What I described in T369997 is link-after-link, not link-in-link.

It should at a minimum have a space after the URL as with every other expectation of how external links work.

The template starts with a <sup>. I think it’s sensible to say that unescaped HTML/wikitext syntax terminates the URL, and is interpreted as HTML/wikitext. In fact, for free external links, it’s the only way to put something right after the link, without a space in between (which I think looks much better for dead link templates).

The unclosed link in the description means that you have an autourl followed by a tag,

https://www.tib.org{{Dead link|date=January 2023 |bot=InternetArchiveBot }}

which results in the rendering differences. The unclosed syntax is a red herring. My point was that Parsoid tokenizes the template as part of the href and doesn't recover.

The test in T368724#9997424 captures what we want fixed,
https://gerrit.wikimedia.org/r/c/mediawiki/services/parsoid/+/1055313/1/tests/parser/extLinks.txt

Hopefully you agree. The tests above that for T2289 are the same non-templated cases.