Full stop in link target causes link to not be parsed, rendered as wikitext
Closed, Resolved (Public)

Description

Without the full stop it works fine:

[//foo.org/bar#baz bang]

->

<p><a href="//foo.org/bar#baz">bang</a></p>

… but:

[//foo.org/bar#baz. bang]

->

<p>[//foo.org/bar#baz. bang]</p>

Version: unspecified
Severity: normal

Details

Reference
bz63947

Event Timeline

bzimport raised the priority of this task to Medium. (Nov 22 2014, 3:25 AM)
bzimport added a project: Parsoid.
bzimport set Reference to bz63947.

Also:

[//foo.org/bar bang]

->

<p><a href="//foo.org/bar">bang</a></p>

… but:

[//foo.org/bar. bang]

->

<p>[//foo.org/bar. bang]</p>

The tokenizer parses the link fine, but the LinkHandler seems to think it's invalid & converts it back to text.

The issue is that the tokenizer's url production is used both for urllinks and, via tokenizeURL, for validating general links. URL links are supposed to avoid eating trailing commas, colons and full stops. In other hrefs those characters are fine, though.

We should probably split the urllink production from the more general url production.
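
To illustrate the proposed split, here is a minimal Python sketch. It is not Parsoid's actual PEG grammar. The regular expression, the function names (match_general_url, match_urllink) and the TRAILING_PUNCTUATION set are hypothetical simplifications, and it assumes "urllink" means a URL matched in running text. The point is only that the shared rule stays permissive while the urllink rule alone gives back trailing commas, colons and full stops.

```python
import re
from typing import Optional

# Hypothetical: characters a urllink should leave to the surrounding prose.
TRAILING_PUNCTUATION = ",:."

# Hypothetical, heavily simplified "general url" rule: an optional http(s)
# scheme, "//", then any run of characters that are not whitespace,
# brackets, angle brackets or double quotes.
GENERAL_URL = re.compile(r'(?:https?:)?//[^\s\[\]<>"]+')


def match_general_url(href: str) -> Optional[str]:
    """Validate a complete href, such as the target of a bracketed link.

    Trailing punctuation is part of the href and is kept.
    """
    m = GENERAL_URL.fullmatch(href)
    return m.group(0) if m else None


def match_urllink(text: str) -> Optional[str]:
    """Match a URL at the start of running text.

    Uses the same general rule, then strips trailing commas, colons and
    full stops so they stay in the prose instead of the link.
    """
    m = GENERAL_URL.match(text)
    if m is None:
        return None
    return m.group(0).rstrip(TRAILING_PUNCTUATION)


if __name__ == "__main__":
    # Bracketed link target: the full stop is kept, so validation passes.
    print(match_general_url("//foo.org/bar#baz."))
    # URL in running text: the full stop is left out of the link.
    print(match_urllink("http://foo.org/bar. More text"))
```

With the two rules separated like this, validating the target of [//foo.org/bar#baz. bang] no longer inherits the punctuation-trimming behaviour that only makes sense for URLs in running text.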

A single quote produces the same issues as a period.

Change 126853 had a related patch set uploaded by Cscott:
WIP: Fix a number of link-parsing and serialization issues.

https://gerrit.wikimedia.org/r/126853

Change 126853 merged by jenkins-bot:
Fix a number of link-parsing and serialization issues.

https://gerrit.wikimedia.org/r/126853