
Magically strip line feeds from page titles when they're a mistake, but don't when they're not a mistake, when importing citation using Citoid
Closed, Duplicate · Public

Description

In the VisualEditor 'Add a citation' tool, if the title of the URL has a | in it, an error message shows up when the citation is automatically generated.

E.g.

http://www.unesco.org/new/en/natural-sciences/science-technology/single-view-sc-policy/news/regional_economic_communities_a_conduit_for_southsouth_co

Event Timeline

Restricted Application added a subscriber: Aklapper.
Esanders subscribed.

The URL doesn't have a pipe in it but I do get an error:

pasted_file (244×404 px, 29 KB)

Yes, sorry, that's what I meant, thanks

Deskana subscribed.

The problem here is the site in question, not Citoid or VisualEditor. The page's title really does have a line feed character in position 79, and the template in question has been written to return this error message if that happens. It wouldn't be correct for VisualEditor to not display this message. I guess Citoid could strip characters like line feeds from a page's title, but that's a dangerous path to start down in case things are stripped that shouldn't be.
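To illustrate the trade-off described above, here is a minimal Python sketch, not Citoid's actual code. The function names `find_control_chars` and `collapse_whitespace` and the sample title are hypothetical. The first function only reports where control characters such as a line feed sit in a title; the second is the "indiscriminate" kind of stripping, and it cannot tell a mistaken line feed from an intentional one.

```python
import unicodedata


def find_control_chars(title: str) -> list[tuple[int, str]]:
    """Return (index, code point) for every control character in the title."""
    hits = []
    for index, char in enumerate(title):
        # Unicode category "Cc" covers line feed, carriage return, tab, etc.
        if unicodedata.category(char) == "Cc":
            hits.append((index, f"U+{ord(char):04X}"))
    return hits


def collapse_whitespace(title: str) -> str:
    """Replace every run of whitespace, line feeds included, with one space.

    This is blunt: it also rewrites whitespace that was intentional.
    """
    return " ".join(title.split())


# Hypothetical title with a stray line feed in the middle.
sample = "Regional economic communities\na conduit"
print(find_control_chars(sample))
print(collapse_whitespace(sample))
```

Reporting the position first, rather than stripping silently, would leave the decision to the editor, which matches the concern about removing things that shouldn't be removed.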

It's actually more obvious what's wrong when you're looking at it in VisualEditor than the wikitext editor. If you try to change the title of the reference in VisualEditor then you can quite clearly see the line feed character, which is not visible at all in a wikitext editor (see screenshot 1). If you delete that line feed, everything is fine (see screenshot 2).

The best course of action would be to contact the site and tell them they really shouldn't be putting things like line feeds in their page titles.

Screen Shot 2017-08-25 at 14.49.51.png (246×840 px, 33 KB)

Screen Shot 2017-08-25 at 14.50.00.png (368×864 px, 91 KB)

Deskana renamed this task from In Visual Editor 'Add a citation tool' if the title of the URL has a | in it is causes an error message to show up to Consider stripping characters like line feeds from page titles when importing citation using Citoid. Aug 25 2017, 2:09 PM
Deskana triaged this task as Low priority.

Thanks for the explanation @Deskana. It would be interesting to know how widely this is done by websites, or if this is a weird problem caused by funny URLs from a couple of sites.

It's very rare. This is the first time I've ever seen a web page that has a line feed in its title. Generating hard statistics on this is difficult unless you want to use or write a web crawler to do it.
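For anyone who does want rough numbers, the counting step could look something like the sketch below. It assumes the HTML of each page has already been fetched by some crawler, which is the hard part and is not shown. `TitleParser`, `extract_title` and `count_titles_with_line_feeds` are hypothetical names; only the standard-library `html.parser` module is used.

```python
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collect the text of the first <title> element in a document."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self._parts = []
        self.title = None

    def handle_starttag(self, tag, attrs):
        if tag == "title" and self.title is None:
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self._in_title:
            self._in_title = False
            self.title = "".join(self._parts)

    def handle_data(self, data):
        if self._in_title:
            self._parts.append(data)


def extract_title(html):
    """Return the raw title text, or None if the page has no <title>."""
    parser = TitleParser()
    parser.feed(html)
    parser.close()
    return parser.title


def count_titles_with_line_feeds(pages):
    """Return (flagged, total) over an iterable of HTML strings.

    Leading and trailing whitespace is ignored, since only a line feed
    inside the title text is of interest here.
    """
    flagged = 0
    total = 0
    for html in pages:
        title = extract_title(html)
        if title is None:
            continue
        total += 1
        inner = title.strip()
        if "\n" in inner or "\r" in inner:
            flagged += 1
    return flagged, total
```

Called on a list of HTML strings, `count_titles_with_line_feeds` returns how many titles contain an inner line feed out of how many pages had a title at all.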

Thanks. Well, I think if it's the first time you've seen it, it must be pretty rare.

Deskana lowered the priority of this task from Low to Lowest. Aug 21 2018, 1:12 PM

Indiscriminately stripping things doesn't seem like a good path to go down, and the team has no plans to investigate other solutions at this time.

Deskana renamed this task from Consider stripping characters like line feeds from page titles when importing citation using Citoid to Magically strip line feeds from page titles when they're a mistake, but don't when they're not a mistake, when importing citation using Citoid. Aug 21 2018, 1:12 PM
Deskana moved this task from External and Administrivia to Freezer on the VisualEditor board.