
Parsoid: Interwiki links are halfway converted to external links, and completely broken
Closed, Duplicate (Public)

Description

http://en.wikipedia.beta.wmflabs.org/w/index.php?diff=118689&oldid=101159

This took a perfectly normal link like:

[[mw:User talk:Whatamidoing (WMF)|on Mediawiki]]

and turned it into something completely broken (single brackets instead of double, and space instead of pipe):

[mw:User talk:Whatamidoing (WMF) on Mediawiki]


Version: unspecified
Severity: major

Event Timeline

bzimport raised the priority of this task from to Medium. Nov 22 2014, 3:40 AM
bzimport set Reference to bz69207.

Also, I didn't touch any of those links. I added a space in one paragraph so I could look at the Save dialog, and it broke all of these interwiki links on the entire page.

Ick. Regression?

I cannot reproduce this on http://en.wikipedia.beta.wmflabs.org/w/index.php?title=User:Whatamidoing_(WMF)/Sandbox&oldid=101159 when I try to edit again.

However, I can reproduce it locally on the command line if the isIW flag is removed from data-parsoid.

[subbu@earth lib] echo '[[mw:User talk:Whatamidoing (WMF)|on Mediawiki]]' | node parse | sed 's/"isIW":true,//g;' | node parse --html2wt
[mw:User talk:Whatamidoing (WMF) on Mediawiki]

So it is unclear how this happened. The original and modified HTML must have differed on all these links for them to be re-serialized, and because the attribute was missing from data-parsoid, the serialization broke.
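
To make the sed step in the reproduction concrete, here is a minimal Python sketch of the same substitution applied to a JSON blob. The sample payload is hypothetical: only the "isIW" key comes from this thread, and the "placeholder" key is invented so the example stays valid JSON. It is not real Parsoid output.

```python
import json
import re

# Hypothetical data-parsoid payload. Only "isIW" is taken from the thread;
# "placeholder" is an invented key used purely for illustration.
sample = '{"isIW":true,"placeholder":"value"}'

# Same substitution as the sed expression s/"isIW":true,//g
stripped = re.sub(r'"isIW":true,', "", sample)

before = json.loads(sample)
after = json.loads(stripped)

# The pattern includes the trailing comma, so it only matches when the flag
# is followed by another key. A payload ending in the flag is left unchanged.
flag_last = '{"placeholder":"value","isIW":true}'
unchanged = re.sub(r'"isIW":true,', "", flag_last)

print(before)
print(after)
```

The stripped payload is still valid JSON, so nothing fails at parse time. The serializer just never sees the flag.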

Could this have been a transient VE bug that clobbered data-parsoid somehow?

Unrelated to the bug report: we'll probably make our link handling code more robust since we can recover most information without the flag being present. This is part of the general link handling cleanup that needs to happen in Parsoid.
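
As a rough illustration of what recovering the information without the flag could look like, here is a minimal Python sketch. It is not Parsoid's implementation. The function names, the known_prefixes set, and the decision to work on the wikitext target rather than on an HTML href are all assumptions made for this example.

```python
def split_interwiki(target, known_prefixes):
    """Return (prefix, title) if target starts with a known prefix, else None.

    known_prefixes is a hypothetical stand-in for a wiki's interwiki map.
    """
    prefix, sep, title = target.partition(":")
    if sep and prefix.lower() in known_prefixes:
        return prefix, title
    return None


def serialize_link(target, text, is_iw_flag, known_prefixes):
    """Pick wikilink or external-link syntax for a link.

    is_iw_flag models the data-parsoid flag. When it is missing, fall back
    to checking the target's prefix instead of assuming an external link.
    """
    is_interwiki = is_iw_flag or split_interwiki(target, known_prefixes) is not None
    if is_interwiki:
        return f"[[{target}|{text}]]"
    return f"[{target} {text}]"


target = "mw:User talk:Whatamidoing (WMF)"

# Flag missing, but "mw" is recognised as a prefix: wikilink syntax survives.
print(serialize_link(target, "on Mediawiki", False, {"mw"}))
# prints "[[mw:User talk:Whatamidoing (WMF)|on Mediawiki]]"

# Flag missing and no prefix knowledge: the broken output from this report.
print(serialize_link(target, "on Mediawiki", False, set()))
# prints "[mw:User talk:Whatamidoing (WMF) on Mediawiki]"
```

The second call models the serializer having only the flag to go on, which is the situation the sed reproduction creates.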

Arlolra set Security to None.

Note that I can still reproduce this on the command line:

vagrant@mediawiki-vagrant:/vagrant/srv/parsoid/src/bin$ echo '[[mw:User talk:Whatamidoing (WMF)|on Mediawiki]]' | node parse | sed 's/"isIW":true,//g;' | node parse --html2wt
[mw:User talk:Whatamidoing (WMF) on Mediawiki]

Change 292078 had a related patch set uploaded (by Subramanya Sastry):
WIP: Auto-detect interwiki links without needing data-parsoid info

https://gerrit.wikimedia.org/r/292078

ssastry moved this task from Needs Triage to In Progress on the Parsoid board. Jul 15 2016, 9:29 PM

Change 292078 merged by jenkins-bot:
Auto-detect interwiki links without needing data-parsoid info

https://gerrit.wikimedia.org/r/292078

ssastry closed this task as Resolved. Aug 15 2016, 9:26 PM
ssastry claimed this task.

Note that I can still reproduce this on the command line:

vagrant@mediawiki-vagrant:/vagrant/srv/parsoid/src/bin$ echo '[[mw:User talk:Whatamidoing (WMF)|on Mediawiki]]' | node parse | sed 's/"isIW":true,//g;' | node parse --html2wt
[mw:User talk:Whatamidoing (WMF) on Mediawiki]

On latest master now:

[subbu@earth parsoid] echo '[[mw:User talk:Whatamidoing (WMF)|on Mediawiki]]' | bin/parse.js | sed 's/"isIW":true,//g;' | bin/parse.js --html2wt
[[mw:User talk:Whatamidoing (WMF)|on Mediawiki]]

ssastry reopened this task as Open. Aug 19 2016, 8:51 PM

I had to revert the patch because of T102556#2557410

Re-tested just now; still an issue. Not found in the wild, though, AFAIAA.

ssastry moved this task from In Progress to Needs Triage on the Parsoid board. Sep 11 2017, 7:09 PM
ssastry changed the task status from Open to Stalled. Dec 15 2017, 8:45 PM

Getting answers to T102556#3841672 (which I added just now) is a blocker here.

ssastry changed the task status from Stalled to Open. Dec 15 2017, 9:40 PM
ssastry reassigned this task from ssastry to Sbailey.

Looks like we have a potential solution in T102556#3841768.

I retried James's example on the current Parsoid and it seems to work properly. Is there another failing case I can look at?

... sbailey$ echo '[[mw:User talk:Whatamidoing (WMF)|on Mediawiki]]' | node bin/parse > text

... sbailey$ node bin/parse --inputfile text --html2wt
[[mw:User talk:Whatamidoing (WMF)|on Mediawiki]]

Aklapper edited projects, added Parsoid; removed Parsoid-Edit-Support. Feb 29 2020, 5:15 PM
Aklapper added a subscriber: Aklapper.

@Sbailey: Hi! This task has been assigned to you a while ago. Could you maybe share an update? Do you still plan to work on this task?
If you do not plan to work on this task anymore: Please consider removing yourself as assignee (via Add Action...Assign / Claim in the dropdown menu): That would allow others to work on this (in theory), as others won't think that someone is already working on this. Thanks! :)

Aklapper removed Sbailey as the assignee of this task. Fri, Jun 19, 4:26 PM
Aklapper added a subscriber: Sbailey.

This task has been assigned to the same task owner for more than two years. Resetting task assignee due to inactivity, to decrease task cookie-licking and to get a slightly more realistic overview of plans. Please feel free to assign this task to yourself again if you still realistically work or plan to work on this task - it would be welcome!

For tips on how to manage individual work in Phabricator (noisy notifications, lists of tasks, etc.), see https://phabricator.wikimedia.org/T228575#6237124 for available options.
(For the record, two emails were sent to assignee addresses before resetting assignees. See T228575 for more info and for potential feedback. Thanks!)