
&nbsp; should terminate a free external link
Closed, Resolved (Public)


The EXT_LINK_URL_CLASS regexp in Parser.php allows the Zs Unicode class to delimit autolinked URLs. This class covers all Unicode "separator, space" characters, including the non-breaking space (aka \u00A0).

However, it is very common to represent a non-breaking space as &nbsp; in wikitext, because the Unicode character is hard to type directly. But &nbsp; doesn't delimit a URL: this is my website

parses as the url this.
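The mismatch can be reproduced with a small Python sketch. This is not the Parser.php code: the pattern below is a simplified stand-in for EXT_LINK_URL_CLASS, the Zs members are listed by hand because Python's `re` module has no `\p{Zs}`, and `http://example.com` and `first_free_link` are hypothetical names chosen for illustration.

```python
import re

# Members of the Unicode "separator, space" (Zs) class, written out
# explicitly because Python's `re` does not support \p{Zs}.
ZS = "\u0020\u00a0\u1680\u2000-\u200a\u202f\u205f\u3000"

# Simplified stand-in for a URL character class (an assumption, not the
# real EXT_LINK_URL_CLASS): any character that is not a control
# whitespace character, an angle bracket, a double quote, or a Zs member.
URL_CHARS = r'[^\t\n\r<>"' + ZS + "]"
FREE_LINK = re.compile(r"https?://" + URL_CHARS + "+")


def first_free_link(text):
    """Return the first autolinked URL found in text, or None."""
    match = FREE_LINK.search(text)
    return match.group(0) if match else None


# A literal U+00A0 ends the URL, because it is a member of Zs.
literal = first_free_link("http://example.com\u00a0is my website")

# The entity &nbsp; is plain ASCII at this stage, so it is absorbed
# into the URL until the next real space.
entity = first_free_link("http://example.com&nbsp;is my website")
```

Under these assumptions, the literal character yields `http://example.com`, while the entity form yields `http://example.com&nbsp;is`.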

Event Timeline

cscott raised the priority of this task to Needs Triage.
cscott updated the task description.
cscott added a project: MediaWiki-Parser.
cscott changed Security from none to None.
cscott subscribed.

Change 180982 had a related patch set uploaded (by Cscott):
Terminate free external link on &nbsp; (and numeric versions of <>)


Aklapper triaged this task as Medium priority (Dec 19 2014, 4:02 PM).

My concern here is mostly relating to VE. I was worried that VE would generate &nbsp; in the wikitext if the user manually inserted a non-breaking space, which would then require addition of <nowiki/> to separate it from the URL. If it generates \u00A0 instead, then it looks better (no <nowiki/>) but someone editing in "source mode" can't tell that it's a non-breaking space at all (not so good).

So I think it's better if VE generates &nbsp; and we make the treatment of &nbsp; and \u00A0 consistent in the core parser (and in parsoid).
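One way to picture the consistent treatment is a post-processing step that cuts a candidate URL at the first entity encoding of a non-breaking space or an angle bracket. The sketch below is an illustration under that assumption, not the code from the merged changes; `ENTITY_TERMINATOR`, `trim_free_link`, and the `http://example.com` inputs are hypothetical.

```python
import re

# Entities that should end a free external link in this sketch:
#   &nbsp;                 named non-breaking space
#   &#160;  / &#xA0;       numeric non-breaking space (decimal / hex)
#   &#60;   / &#x3C;       numeric "<"
#   &#62;   / &#x3E;       numeric ">"
# Leading zeros are tolerated in the numeric forms.
ENTITY_TERMINATOR = re.compile(
    r"&(?:nbsp|#0*(?:60|62|160)|#[xX]0*(?:3[cCeE]|[aA]0));"
)


def trim_free_link(url):
    """Cut url just before the first terminating entity, if there is one."""
    match = ENTITY_TERMINATOR.search(url)
    return url[: match.start()] if match else url


trimmed_named = trim_free_link("http://example.com&nbsp;is")
trimmed_decimal = trim_free_link("http://example.com&#160;is")
trimmed_hex_lt = trim_free_link("http://example.com&#x3C;b")
untouched = trim_free_link("http://example.com?a=1&b=2")
```

An ordinary query-string ampersand, as in the last call, is left alone because it is not followed by one of the listed entity names and a semicolon.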

Change 240568 had a related patch set uploaded (by Cscott):
Terminate autolinks on &nbsp; and numeric entity encodings of <>

Change 180982 merged by jenkins-bot:
Terminate free external link on &nbsp; (and numeric versions of <>)

Change 240568 merged by jenkins-bot:
Terminate autolinks on &nbsp; and numeric entity encodings of <>

ssastry claimed this task.