
Internal links pointing to interwikis are not encoded at all
Open, Medium, Public

Description

An internal link inside Flow that points to an interwiki (that is, not to the wiki itself but to another wiki that can be referenced using internal link syntax with the corresponding prefix) is not encoded at all.

Example:

[[w:en:Wikipedia:FAQ#How do I change my username/delete my account?]]

Points to:

https://en.wikipedia.org/wiki/Wikipedia:FAQ#How do I change my username/delete my account?

But it should point to:

https://en.wikipedia.org/wiki/Wikipedia:FAQ#How_do_I_change_my_username.2Fdelete_my_account.3F

If the link points to the local wiki, the anchor is encoded properly. See Topic:Tesyta6rspc1pysm for a test.
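The difference between the two URLs above can be reproduced with a minimal sketch of the dot-escaped anchor style. It assumes the scheme is "spaces become underscores, every other unsafe byte becomes a dot plus two hex digits", which matches the expected URL in this report but does not claim to cover everything the wiki software does. The function name `legacy_anchor` is hypothetical.

```python
def legacy_anchor(fragment: str) -> str:
    """Encode a section fragment in the dot-escaped style (illustrative sketch)."""
    out = []
    # Spaces become underscores before any escaping happens.
    for byte in fragment.replace(" ", "_").encode("utf-8"):
        ch = chr(byte)
        if ch.isascii() and (ch.isalnum() or ch in "-_.:"):
            out.append(ch)
        else:
            # Every other byte is written as a dot plus two uppercase hex digits.
            out.append(".%02X" % byte)
    return "".join(out)


print(legacy_anchor("How do I change my username/delete my account?"))
# prints "How_do_I_change_my_username.2Fdelete_my_account.3F"
```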

Note also that the link is rendered as class="external" instead of class="extiw" as it would otherwise on normal wiki pages. Normal wiki pages render this correctly.


Another interesting case:

[[w:Wikipedia:Sandbox?action=edit]]

Points to:

https://en.wikipedia.org/wiki/Wikipedia:Sandbox?action=edit

But when editing the message, the edit text is now:

[w:Wikipedia:Sandbox?action=edit w:Wikipedia:Sandbox?action=edit]

Event Timeline

Restricted Application added a subscriber: Aklapper.
Ciencia_Al_Poder renamed this task from "The anchor of internal links for interwikis are not encoded" to "Internal links pointing to interwikis are not encoded at all". (Nov 8 2016, 1:41 AM)
Ciencia_Al_Poder updated the task description.

I am surprised that the "Internal: [[Manual:Errors and symptoms#Image Thumbnails not working and/or appearing]]" link got encoded properly. But https://gerrit.wikimedia.org/r/#/c/320929/ starts generating section anchor ids identical to what the core parser produces, with the right encoding. We now need to apply the same encoding to links that point to sections. I'll take a look once that other patch is reviewed.

ssastry triaged this task as Medium priority.

Change 324564 had a related patch set uploaded (by Arlolra):
Munge link fragments and element ids as in the php parser

https://gerrit.wikimedia.org/r/324564

Change 324564 merged by jenkins-bot:
Munge link fragments and element ids as in the php parser

https://gerrit.wikimedia.org/r/324564

If/when all is done on Parsoid's side, please edit or file a report against Flow to address the issue originally reported in T94949: Interwiki links to other MediaWiki wikis in the same cluster don't encode section fragment, so that the imported threads get fixed.

Current output for echo "[[w:en:Wikipedia:FAQ#How do I change my username/delete my account?]]" | node bin/parse --prefix frwiki is,

<a rel="mw:ExtLink" href="https://en.wikipedia.org/wiki/en:Wikipedia:FAQ#How_do_I_change_my_username.2Fdelete_my_account.3F" title="w:en:Wikipedia:FAQ" data-parsoid='{"stx":"simple","a":{"href":"https://en.wikipedia.org/wiki/en:Wikipedia:FAQ#How_do_I_change_my_username.2Fdelete_my_account.3F"},"sa":{"href":"w:en:Wikipedia:FAQ#How do I change my username/delete my account?"},"isIW":true}'>w:en:Wikipedia:FAQ#How do I change my username/delete my account?</a>

ie. https://en.wikipedia.org/wiki/en:Wikipedia:FAQ#How_do_I_change_my_username.2Fdelete_my_account.3F
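As a rough illustration of how an href like the one above could be assembled, here is a sketch that resolves only the first prefix, keeps the rest (including "en:") as the title, and dot-escapes the fragment. This is not Parsoid's code: `INTERWIKI_MAP`, `dot_escape` and `interwiki_href` are hypothetical names, and the single map entry is inferred from the output shown in this comment.

```python
# Hypothetical interwiki table; "$1" marks where the target title goes.
INTERWIKI_MAP = {"w": "https://en.wikipedia.org/wiki/$1"}


def dot_escape(fragment: str) -> str:
    """Dot-escape a fragment: spaces to underscores, unsafe bytes to .XX."""
    out = []
    for byte in fragment.replace(" ", "_").encode("utf-8"):
        ch = chr(byte)
        keep = ch.isascii() and (ch.isalnum() or ch in "-_.:")
        out.append(ch if keep else ".%02X" % byte)
    return "".join(out)


def interwiki_href(target: str) -> str:
    """Build an href for a link target such as 'w:en:Page#Section'."""
    # Only the first prefix is resolved here; the rest stays in the title.
    prefix, _, rest = target.partition(":")
    pattern = INTERWIKI_MAP[prefix]
    title, has_fragment, fragment = rest.partition("#")
    href = pattern.replace("$1", title.replace(" ", "_"))
    if has_fragment:
        href += "#" + dot_escape(fragment)
    return href


print(interwiki_href(
    "w:en:Wikipedia:FAQ#How do I change my username/delete my account?"
))
```

For the example link, this sketch yields the same URL as the href in the output above.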

Note also that the link is rendered as class="external" instead of class="extiw" as it would otherwise on normal wiki pages. Normal wiki pages render this correctly.

Parsoid doesn't add these classes, so I'm guessing that's a Flow issue? It does have mw:ExtLink and "isIW":true though.

Another interesting case

echo "[[w:Wikipedia:Sandbox?action=edit]]" | node bin/parse --prefix frwiki --wt2wt

[w:Wikipedia:Sandbox?action=edit w:Wikipedia:Sandbox?action=edit]

but

echo "[[w:Wikipedia:Sandbox?action=edit]]" | node bin/parse --wt2wt

[[w:Wikipedia:Sandbox?action=edit]]

Probably related to T102556 or https://gerrit.wikimedia.org/r/#/c/293799/

@ssastry wanna take a look?

Flow adds class="external" when it postprocesses the Parsoid output. See https://phabricator.wikimedia.org/diffusion/EFLW/browse/master/includes/Parsoid/Fixer/ExtLinkFixer.php

@Catrope Thanks for the pointer. I guess this is a dupe of T97093 then.

Is it? I mean this task is also about serialization/round-trip issues.

Sorry, I didn't mean the entire task, just the specific part we're discussing (ie. the link is rendered as class="external" instead of class="extiw").

Ah, I see. Yes, partly, although the other part of it is that the specific class name is needed for MW-specific CSS to kick in.
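To make the class discussion concrete, here is a small sketch of how a postprocessing step might choose between "extiw" and "external" from the attributes Parsoid emits. It is not the code in ExtLinkFixer.php. The function `link_class` is hypothetical, and the rule it applies (interwiki rel, or mw:ExtLink with "isIW":true, means "extiw") is an assumption based only on the Parsoid output quoted in this task.

```python
import json


def link_class(rel: str, data_parsoid: str) -> str:
    """Pick a CSS class for a link from its rel and data-parsoid attributes."""
    info = json.loads(data_parsoid) if data_parsoid else {}
    rels = rel.split()
    # Interwiki links: either the dedicated rel value or the isIW flag.
    if "mw:WikiLink/Interwiki" in rels:
        return "extiw"
    if "mw:ExtLink" in rels:
        return "extiw" if info.get("isIW") else "external"
    # Anything else is left without either class in this sketch.
    return ""


print(link_class("mw:ExtLink", '{"stx":"simple","isIW":true}'))  # extiw
print(link_class("mw:ExtLink", '{"stx":"simple"}'))              # external
```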

Arlolra added a subscriber: cscott.

Nowadays,

> echo "[[w:en:Wikipedia:FAQ#How do I change my username/delete my account?]]" | php bin/parse.php --domain fr.wikipedia.org
<p data-parsoid='{"dsr":[0,69,0,0]}'><a rel="mw:WikiLink/Interwiki" href="https://en.wikipedia.org/wiki/en:Wikipedia:FAQ#How_do_I_change_my_username/delete_my_account?" title="w:en:Wikipedia:FAQ" data-parsoid='{"stx":"simple","a":{"href":"https://en.wikipedia.org/wiki/en:Wikipedia:FAQ#How_do_I_change_my_username/delete_my_account?"},"sa":{"href":"w:en:Wikipedia:FAQ#How do I change my username/delete my account?"},"isIW":true,"dsr":[0,69,2,2]}'>w:en:Wikipedia:FAQ#How do I change my username/delete my account?</a></p>

> echo "[[w:en:Wikipedia:FAQ#How do I change my username/delete my account?]]" | php bin/parse.php --domain fr.wikipedia.org --wt2wt
[w:en:Wikipedia:FAQ#How do I change my username/delete my account? w:en:Wikipedia:FAQ#How do I change my username/delete my account?]

so roundtripping isn't doing so great.

The other case,

> echo "[[w:Wikipedia:Sandbox?action=edit]]" | php bin/parse.php --domain fr.wikipedia.org
<p data-parsoid='{"dsr":[0,35,0,0]}'><a rel="mw:WikiLink/Interwiki" href="https://en.wikipedia.org/wiki/Wikipedia:Sandbox%3Faction=edit" title="w:Wikipedia:Sandbox?action=edit" data-parsoid='{"stx":"simple","a":{"href":"https://en.wikipedia.org/wiki/Wikipedia:Sandbox%3Faction=edit"},"sa":{"href":"w:Wikipedia:Sandbox?action=edit"},"isIW":true,"dsr":[0,35,2,2]}'>w:Wikipedia:Sandbox?action=edit</a></p>

> echo "[[w:Wikipedia:Sandbox?action=edit]]" | php bin/parse.php --domain fr.wikipedia.org --wt2wt
[[w:Wikipedia:Sandbox?action=edit]]

seems ok.