Parsoid automatically changes pseudo-interlanguage interwiki link text to match link prefix + full target page name
Open, Normal, Public

Description

See m:Special:Diff/11792372 (didn't attempt to change or remove any links). Meta does not support "In Wikipedia" sidebar interlanguage links and displays [[en:[…]|[…]]] as text instead, which apparently causes problems in the VisualEditor. (T71822 is maybe related.)

Event Timeline

FDMS created this task. · Apr 13 2015, 6:52 PM
FDMS raised the priority of this task from to Needs Triage.
FDMS updated the task description.
FDMS added a project: VisualEditor.
FDMS added a subscriber: FDMS.
Restricted Application added a subscriber: Aklapper. · Apr 13 2015, 6:52 PM
FDMS renamed this task from VisualEditor automatically changes interwiki link text to match link prefix + full target page name to VisualEditor automatically changes pseudo-interlanguage interwiki link text to match link prefix + full target page name. · Apr 13 2015, 6:59 PM
FDMS updated the task description.
FDMS set Security to None.
Jdforrester-WMF renamed this task from VisualEditor automatically changes pseudo-interlanguage interwiki link text to match link prefix + full target page name to Parsoid automatically changes pseudo-interlanguage interwiki link text to match link prefix + full target page name. · Apr 17 2015, 4:49 PM
Jdforrester-WMF edited projects, added Parsoid; removed VisualEditor.
ssastry triaged this task as Normal priority. · Apr 17 2015, 5:22 PM
ssastry assigned this task to cscott. · Apr 17 2015, 5:54 PM
ssastry moved this task from Backlog to In Progress on the Parsoid board.
ssastry moved this task from In Progress to Backlog on the Parsoid board. · Dec 17 2015, 5:40 PM
cscott reassigned this task from cscott to Sbailey. · Dec 1 2017, 9:29 PM
cscott added a subscriber: cscott.

Passing the torch on this one. Interlanguage links are mostly deprecated at this point, so I doubt this is a high-priority bug.

cscott added a comment. · Edited · Dec 19 2017, 6:30 PM

Context: https://meta.wikimedia.org/wiki/Help:Interwiki_linking#Interlanguage_links

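For projects like Meta, a missing leading colon has no effect, because Meta doesn't support interlanguage links. For Wikipedia and similar projects, it makes a major difference: without the colon, the link becomes an invisible interlanguage link rather than a visible link in the page text.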

This might have already been fixed? This output seems reasonable:

$ echo '[[en:WP:WikiLove|WikiLove]]' | bin/parse.js --wt2html --domain de.wikipedia.org --normalize=parsoid
<link rel="mw:PageProp/Language" href="https://en.wikipedia.org/wiki/WP:WikiLove"/>
$ echo '[[en:WP:WikiLove|WikiLove]]' | bin/parse.js --wt2html --domain meta.wikimedia.org --normalize=parsoid
<p><a rel="mw:WikiLink/Interwiki" href="https://en.wikipedia.org/wiki/WP:WikiLove" title="en:WP:WikiLove">WikiLove</a></p>

And in the html2wt direction:

$ echo '[[en:WP:WikiLove|WikiLove]]' | bin/parse.js --wt2html --domain de.wikipedia.org --normalize=parsoid | bin/parse.js --html2wt --domain de.wikipedia.org
[[en:WP:WikiLove]]
$ echo '[[en:WP:WikiLove|WikiLove]]' | bin/parse.js --wt2html --domain meta.wikimedia.org --normalize=parsoid | bin/parse.js --html2wt --domain meta.wikimedia.org
[[:en:WP:WikiLove|WikiLove]]

(Note that you have to test on de.wikipedia.org, not en.wikipedia.org, because en: is not an interlanguage prefix when it points to the wiki itself.)
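
The conditions described so far (leading colon, whether the wiki supports interlanguage links, and whether the prefix points back at the wiki itself) can be summarized in a small sketch. This is not Parsoid's actual code: `isInterlanguageLink`, the `wiki` object, and its fields `lang`, `interlanguageEnabled`, and `languagePrefixes` are hypothetical names used only for illustration.

```javascript
// Hypothetical sketch, NOT Parsoid's implementation.
// Returns true when [[target]] would be treated as an (invisible)
// interlanguage link on the given wiki, under the rules discussed above.
function isInterlanguageLink(target, wiki) {
  // A leading colon always forces an ordinary, visible link.
  if (target.startsWith(':')) return false;

  const idx = target.indexOf(':');
  if (idx === -1) return false; // no prefix at all

  const prefix = target.slice(0, idx).toLowerCase();
  // The prefix must be a known language prefix (assumed list).
  if (!wiki.languagePrefixes.includes(prefix)) return false;
  // Wikis like Meta do not support interlanguage links.
  if (!wiki.interlanguageEnabled) return false;
  // A prefix pointing at the wiki itself is not an interlanguage link.
  return prefix !== wiki.lang;
}

// Assumed, simplified wiki descriptions for illustration only.
const dewiki = { lang: 'de', interlanguageEnabled: true, languagePrefixes: ['de', 'en'] };
const metawiki = { lang: 'meta', interlanguageEnabled: false, languagePrefixes: ['de', 'en'] };

console.log(isInterlanguageLink('en:WP:WikiLove', dewiki));   // true
console.log(isInterlanguageLink(':en:WP:WikiLove', dewiki));  // false
console.log(isInterlanguageLink('en:WP:WikiLove', metawiki)); // false
```

Under these assumptions, the same wikitext is an interlanguage link on dewiki and a plain interwiki link on metawiki, which is why the two domains produce different HTML above.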

The first output there (on de.wikipedia.org) looks like a bug, since [[en:WP:WikiLove]] would seem to display en:WP:WikiLove as the link text instead of just WikiLove. In fact it's an interlanguage link, so the whole thing is invisible and the link text doesn't matter; see https://de.wikipedia.org/w/index.php?title=Benutzer:Cscott/task95931&oldid=172115487

You could also argue that the leading colon in the --html2wt direction on meta.wikimedia.org isn't strictly needed, since Meta does not have interlanguage links enabled. In any case, it round-trips fine without the colon if you include the data-parsoid attribute:

$ echo '[[en:WP:WikiLove|WikiLove]]' | bin/parse.js --wt2wt --domain meta.wikimedia.org
[[en:WP:WikiLove|WikiLove]]

...although this is a little strange (and probably the originally reported bug):

$ echo '[[en:WP:WikiLove|WikiLove]]' | bin/parse.js --wt2wt --domain de.wikipedia.org
[[en:WP:WikiLove|en:WP:WikiLove]]

But again, the en:WP:WikiLove on dewiki is ultimately invisible, so this bug doesn't have any practical effect any more (probably because we fixed the metawiki case).

You also get interesting results if you try to serialize an interlanguage link on metawiki (which doesn't support them):

$ echo '[[en:WP:WikiLove|WikiLove]]' | bin/parse.js --wt2html --domain de.wikipedia.org | bin/parse.js --html2wt --domain meta.wikimedia.org
[[en:WP:WikiLove|en:WP:WikiLove]]

But again that's mildly expected, since there's no way to actually write an interlanguage link on a wiki which doesn't support them.

So my conclusion from the investigation is that we were once-upon-a-time emitting an interlanguage link (<link rel="mw:PageProp/Language" href="https://en.wikipedia.org/wiki/WP:WikiLove"/>) instead of an interwiki link (<a rel="mw:WikiLink/Interwiki" href="https://en.wikipedia.org/wiki/WP:WikiLove" title="en:WP:WikiLove">WikiLove</a>) on wikis where language links were not supported (like metawiki). This bug seems to have been corrected since this task was filed.

We still have an odd asymmetry where interlanguage links are serialized with vertical bar text ([[en:WP:WikiLove|en:WP:WikiLove]]) which should never really happen. Interlanguage links can't have link text. We should always be emitting [[en:WP:WikiLove]] (with no link text) for interlanguage links.

This buglet doesn't have any visible effect, since interlanguage links are rendered as invisible anyway, but it might be worth addressing for internal consistency at least.
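
The consistent behaviour proposed above could look roughly like the following sketch. It is not Parsoid's serializer: `serializeLink` and the plain `node` object (with assumed fields `rel`, `prefix`, `title`, and `text`) are hypothetical stand-ins for the real DOM handling. Only the two `rel` values are taken from the output shown earlier.

```javascript
// Hypothetical sketch, NOT Parsoid's serializer.
// `node` is a plain object standing in for the parsed HTML element:
//   { rel, prefix, title, text }
function serializeLink(node) {
  const target = `${node.prefix}:${node.title}`;

  if (node.rel === 'mw:PageProp/Language') {
    // Interlanguage links can't have link text, so never emit "|text".
    return `[[${target}]]`;
  }
  if (node.rel === 'mw:WikiLink/Interwiki') {
    // The leading colon keeps this a visible link even on wikis
    // where interlanguage links are enabled.
    return `[[:${target}|${node.text}]]`;
  }
  throw new Error(`unsupported rel: ${node.rel}`);
}

console.log(serializeLink({ rel: 'mw:PageProp/Language', prefix: 'en', title: 'WP:WikiLove' }));
// [[en:WP:WikiLove]]
console.log(serializeLink({
  rel: 'mw:WikiLink/Interwiki', prefix: 'en', title: 'WP:WikiLove', text: 'WikiLove'
}));
// [[:en:WP:WikiLove|WikiLove]]
```

With this shape, the piped form [[en:WP:WikiLove|en:WP:WikiLove]] can never be produced for an interlanguage link, regardless of which wiki the HTML is serialized for.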