
Parsoid auto-links news: URL when it shouldn't, and eats end-of-bold triple quotes
Closed, Resolved · Public

Description

$ echo "'''News:''' Stuff here" | node tests/parse.js --normalize

<p><b><a href="News:'''">News:'''</a> Stuff here</b></p>

The PHP parser produces <p><b>News:</b> Stuff here</p>.

Originally reported by @gpaumier when trying to edit https://office.wikimedia.org/wiki/User:Guillaume/Goals#Retrospective in VE.

Event Timeline

Catrope raised the priority of this task from to Needs Triage.
Catrope updated the task description.
Catrope added projects: Parsoid, Parsoid-DOM.
Catrope added subscribers: Catrope, gpaumier.
Catrope renamed this task from Parsoid auto-links news: URLs in cases where the PHP parser doesn't (involving triple quotes) to Parsoid auto-links news: URL when it shouldn't, and eats end-of-bold triple quotes. Sep 24 2015, 9:34 PM
Catrope updated the task description.
Catrope set Security to None.

Also note that this causes round-trip failures, both in wt2wt and html2html:

$ echo "'''News:''' Stuff here" | node tests/parse.js --wt2wt
'''News:''' Stuff here'''

$ echo "<p><b>News:</b> Stuff here</p>" | node tests/parse.js --html2html --normalize
<p><b><a href="News:'''">News:'''</a> Stuff here</b></p>
cscott subscribed.

Hm, this should be falling into the (recently-added) "autolinks must have at least one character after protocol" test. But it's not -- presumably because the quotes are being sucked into the link. Probably a tokenizer precedence issue, gah. I'll take a look.
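
To make the suspected failure mode concrete, here is a minimal Python sketch of an autolink scanner. It is not Parsoid's tokenizer: the function name find_autolink, the PROTOCOLS tuple, and the stop_at_quotes flag are all hypothetical and exist only for illustration. It shows how letting the URL run up to the next whitespace swallows the closing ''' (so the "at least one character after protocol" check passes), while ending the URL at two or more consecutive apostrophes leaves nothing after the protocol and no link is produced.

```python
# Hypothetical protocol list for illustration; not Parsoid's real configuration.
PROTOCOLS = ("news:", "http://", "https://")


def find_autolink(text, start, stop_at_quotes):
    """Return the autolink beginning at index `start`, or None if there is none.

    With stop_at_quotes=False the URL runs to the next whitespace, which is the
    behaviour suspected in this report. With stop_at_quotes=True, a run of two
    or more apostrophes (italic/bold markup) ends the URL.
    """
    lowered = text[start:].lower()
    proto = next((p for p in PROTOCOLS if lowered.startswith(p)), None)
    if proto is None:
        return None

    end = start + len(proto)
    while end < len(text) and not text[end].isspace():
        # '' or ''' is quote markup, not part of the URL.
        if stop_at_quotes and text.startswith("''", end):
            break
        end += 1

    # An autolink needs at least one character after the protocol.
    if end == start + len(proto):
        return None
    return text[start:end]


if __name__ == "__main__":
    wikitext = "'''News:''' Stuff here"
    print(find_autolink(wikitext, 3, stop_at_quotes=False))  # News:'''
    print(find_autolink(wikitext, 3, stop_at_quotes=True))   # None
```

A single apostrophe is still allowed inside a URL in this sketch; only a run of two or more ends it.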

Change 240915 had a related patch set uploaded (by Cscott):
Double or triple quotes terminate autolinks

https://gerrit.wikimedia.org/r/240915

Change 240915 merged by jenkins-bot:
Double or triple quotes terminate autolinks

https://gerrit.wikimedia.org/r/240915

Arlolra subscribed.

Seems fixed by the above:

λ (master) echo "'''News:''' Stuff here" | node bin/parse.js --normalize

<p><b>News:</b> Stuff here</p>

λ (master) echo "'''News:''' Stuff here" | node bin/parse.js --wt2wt
'''News:''' Stuff here

λ (master) echo "<p><b>News:</b> Stuff here</p>" | node bin/parse.js --html2html --normalize

<p><b>News:</b> Stuff here</p>