Parsoid should be able to understand HTML entities in links
Closed, Resolved · Public · 1 Story Points

Description

Wikitext: Breakfast and "[http:// http://www.librarieswithoutborders.org/ Librairies without borders]" presentation
In VE this becomes: Breakfast and "[http:// http://www.librarieswithoutborders.org/ Librairies without borders]" presentation, shown as plain text without any link
But MediaWiki renders it as: Breakfast and "Librairies without borders" presentation, linking to http://+http//www.librarieswithoutborders.org/

Found at: https://www.mediawiki.org/w/index.php?title=Wikimedia_Hackathon_2015/Program&oldid=1636908

JanZerebecki added a subscriber: JanZerebecki.
Restricted Application added a subscriber: Aklapper. May 13 2015, 2:26 PM

Change 223384 had a related patch set uploaded (by Arlolra):
T98960: Accept entities in extlink href

https://gerrit.wikimedia.org/r/223384

Arlolra triaged this task as "Normal" priority. Jul 7 2015, 7:43 PM
Arlolra added a subscriber: Arlolra.
Jdforrester-WMF changed the title from "VE can't handle [http:// http://...]" to "Parsoid should be able to understand HTML entities in links". Aug 6 2015, 11:18 PM
Jdforrester-WMF set Security to None.
Jdforrester-WMF edited a custom field.
Jdforrester-WMF moved this task from To Triage to TR1: Releases on the VisualEditor board.
Arlolra claimed this task. Aug 27 2015, 8:51 PM

Change 223384 had a related patch set uploaded (by Arlolra):
WIP: Accept entities in extlink href

https://gerrit.wikimedia.org/r/223384

ssastry moved this task from Backlog to In Progress on the Parsoid board. Oct 6 2015, 3:25 AM
cscott added a subscriber: cscott. Oct 6 2015, 2:07 PM

This one is weird. That's a totally bogus link, right? It doesn't actually work?

We do a sanity-checking pass after template expansion on the link contents to try to ensure the result actually parses as a URL. I'm guessing embedding a space fails that check -- as well it should.

I'm calling this a bug in the PHP parser for allowing such a thing in the first place.
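
To make the check described above concrete, here is a minimal sketch in Python, not Parsoid's actual code. It assumes the entities in the href are decoded first and the result is then tested for a known scheme and for characters that cannot appear in a URL. The names `ALLOWED_SCHEMES`, `looks_like_url`, and `check_extlink_href` are hypothetical, and the real list of accepted protocols is configured per wiki.

```python
import html
import re

# Hypothetical allow-list for illustration; not the real wiki configuration.
ALLOWED_SCHEMES = ("http://", "https://", "//")


def looks_like_url(href: str) -> bool:
    """Return True if href starts with an allowed scheme and contains
    no whitespace, brackets, angle brackets, or double quotes."""
    if not href.lower().startswith(ALLOWED_SCHEMES):
        return False
    return re.search(r'[\s\[\]<>"]', href) is None


def check_extlink_href(raw_href: str) -> bool:
    """Decode HTML entities in the raw href, then sanity-check the result."""
    decoded = html.unescape(raw_href)
    return looks_like_url(decoded)
```

Under these assumptions an entity that decodes to an ordinary URL character (for example `&amp;`) passes, while one that decodes to a space fails, which matches the guess above about why the original link is rejected.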

ssastry moved this task from In Progress to Next Up on the Parsoid board. Dec 17 2015, 5:36 PM
ssastry moved this task from Next Up to Backlog on the Parsoid board. Dec 17 2015, 5:44 PM

Change 223384 abandoned by Arlolra:
WIP: Accept entities in extlink href

Reason:
For now ... until I pick it up again.

https://gerrit.wikimedia.org/r/223384

Neil_P._Quinn_WMF added a comment. Edited Jul 13 2016, 12:41 AM

> This one is weird. That's a totally bogus link, right? It doesn't actually work?
>
> We do a sanity-checking pass after template expansion on the link contents to try to ensure the result actually parses as a URL. I'm guessing embedding a space fails that check -- as well it should.
>
> I'm calling this a bug in the PHP parser for allowing such a thing in the first place.

@cscott, I don't know about the original test case, but if you look at the duplicate I just created and then merged in, the link I'm using is a valid one that works properly in the read view for its intended function (linking to a prepopulated Phabricator form). Of course, it works just as well if you use percent-encoding in place of character entities, which Parsoid handles fine, but my point is that it can happen with valid links :)
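
To illustrate why the entity form and the percent-encoded form can point at the same place, here is a minimal sketch in Python. It is not taken from Parsoid or the PHP parser. It assumes the href is entity-decoded and that any character not allowed in a URL is then percent-encoded; `normalize_href` and the example.org URLs are hypothetical.

```python
import html
from urllib.parse import quote


def normalize_href(raw_href: str) -> str:
    """Decode HTML entities, then percent-encode characters that are not
    URL delimiters. Existing percent escapes are left untouched because
    '%' is in the safe set. Square brackets are deliberately not in the
    safe set, so they are encoded."""
    decoded = html.unescape(raw_href)
    return quote(decoded, safe=":/?#@!$&'()*+,;=%")


# Both spellings of the same link normalize to one URL.
entity_form = "https://example.org/form?title=A&#32;B"
percent_form = "https://example.org/form?title=A%20B"
same = normalize_href(entity_form) == normalize_href(percent_form)
```

With this normalization the two spellings compare equal, so a link written with character entities is as valid as its percent-encoded counterpart.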

Another example from https://www.mediawiki.org/w/index.php?title=Parsoid/DumpGrepper&oldid=1054779

[https://bugzilla.wikimedia.org/sho w_bug.cgi?id=43652 Bug]

I should pick this up again ...

Change 223384 restored by Arlolra:
WIP: Accept entities in extlink href

https://gerrit.wikimedia.org/r/223384

Elitre added a subscriber: Elitre. Edited Jan 9 2017, 6:10 PM

See this diff, in which an unrelated edit also changed the formatting of wikilinks (as reported on mediawiki.org). Is it related? This is undesirable, especially as the result is a bit more difficult to read.

Change 223384 had a related patch set uploaded (by Arlolra):
T98960: Accept entities in extlink href

https://gerrit.wikimedia.org/r/223384

Change 223384 merged by jenkins-bot:
T98960: Accept entities in extlink href and url links

https://gerrit.wikimedia.org/r/223384

Arlolra closed this task as "Resolved". Jan 19 2017, 5:42 PM

Mentioned in SAL (#wikimedia-operations) [2017-01-31T18:58:46Z] <arlolra> Updated Parsoid to version 734dc996 (T98960)

Jdforrester-WMF changed the point value for this task from 0 to 1. Feb 2 2017, 7:03 PM