
Add link trail on Slavic "ů" character
Closed, ResolvedPublic

Description

Author: mike

Description:
At the moment, the Slavic diacritic "ů" terminates link trails.

For example, "[[śilńik]]ům" produces a clickable "śilńik" followed by an unlinked "ům" ending.

The other Slavic diacritics used in Czech work fine; only the letter ů causes the problem.


Version: unspecified
Severity: enhancement
URL: http://szl.wikipedia.org/w/index.php?title=%C5%A0trasbana&diff=17575&oldid=17571

Details

Reference
bz14512

Event Timeline

bzimport raised the priority of this task to Medium. (Nov 21 2014, 10:14 PM)
bzimport set Reference to bz14512.
bzimport added a subscriber: Unknown Object (MLST).

I suggest adding this to languages/messages/MessagesSzl.php:

$linkTrail = '/^([a-zů]+)(.*)$/sDu';

However, I'm not applying it myself, because I think it may need retouching (for example, other characters may need to be added too).
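To illustrate what the proposed pattern changes, here is a minimal Python sketch of the trail split. It is not MediaWiki's parser code: `split_trail`, `OLD_TRAIL` and `NEW_TRAIL` are hypothetical names, the "old" pattern is assumed to be a plain `[a-z]` trail, and the PCRE flags are only approximated (`s` as `re.DOTALL`, `D` as `\Z`).

```python
import re

# Assumed previous behaviour: only ASCII lowercase letters extend the link.
OLD_TRAIL = re.compile(r"^([a-z]+)(.*)\Z", re.DOTALL)

# Python rendering of the proposed '/^([a-zů]+)(.*)$/sDu'.
# \u016f is the letter ů; Python str patterns are already Unicode,
# so the PCRE 'u' flag has no separate equivalent here.
NEW_TRAIL = re.compile(r"^([a-z\u016f]+)(.*)\Z", re.DOTALL)

def split_trail(pattern, text_after_link):
    """Return (trail, rest): trail joins the link, rest stays plain text."""
    m = pattern.match(text_after_link)
    if m is None:
        return "", text_after_link
    return m.group(1), m.group(2)
```

With the assumed old pattern, the text after `[[śilńik]]` yields an empty trail because its first character is not in `[a-z]`; with the proposed pattern the whole "ům" ending becomes part of the link.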

Amended in r36250.

I've fixed the base default to accept all Unicode alphabetic characters. Rather than specifying a few characters for individual languages, this should mean that all characters work in all languages (pending deletion of legacy $linkTrail definitions from some other locale files).
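The exact pattern committed is not quoted in this thread, so the following is only a sketch of the idea of a "match any Unicode letter" default, written in Python. `UNICODE_TRAIL` and `split_trail` are hypothetical names, and `[^\W\d_]` is used as a standard-library approximation of a Unicode letter class.

```python
import re

# [^\W\d_] = any Unicode word character except decimal digits and underscore.
# This approximates "all Unicode letters" without third-party libraries.
UNICODE_TRAIL = re.compile(r"^([^\W\d_]+)(.*)\Z", re.DOTALL)

def split_trail(text_after_link):
    """Return (trail, rest) using the locale-independent letter pattern."""
    m = UNICODE_TRAIL.match(text_after_link)
    return (m.group(1), m.group(2)) if m else ("", text_after_link)
```

Under this sketch "ům." splits into the trail "ům" and the rest ".", with no per-language character list. The same pattern also matches CJK ideographs, so a link followed directly by Chinese text would have that text pulled into the trail.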

Oh right... You can see this over at:
http://dev.wiki-tools.com/purge/Link_Codes

I have a demo setup; to set the content language, just use:
http://langcode.dev.wiki-tools.com/purge/Link_Codes

So for this locale here:
http://szl.dev.wiki-tools.com/purge/Link_Codes

Ah crap... Sorry, bad paste... My amended revisions are r36253 and r36254.

Actually, Daniel, I'm not sure that what you did has no side effects; I recall an old discussion with Brion, where he told me about side effects of having all this linktrail stuff handled by En, so I'm adding him to the CC list. He knows better than I do.

Well, a conversation on IRC between me, Splarka, and Tim yielded the view that it's best if linktrail is handled in a locale-independent manner (apparently word diff already is), except for the few languages that would be overridden per-language if the default caused some sort of unlikely fatal error.

I've been hunting through the various languages. 90% of the characters added in $linkTrail overrides are covered by the default. However, I'll likely have to put some of that inside a constant or two and use it to build an override for the few languages that use things like » inside their linktrail.

The main trick with locale-independent linktrails is that some languages don't use word spacing (or don't use it consistently, or don't use it the way we do), meaning that they shouldn't have trails extended. Luckily they're usually in their own writing systems -- Chinese, Thai, and such -- allowing us to treat them as such in a mixed-language environment.

Considering mixed-language text, we need to make sure that trails don't extend unexpectedly when a link abuts text in another language.
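The two constraints above (no trails for scripts written without word spacing, and no extension across a script boundary) could be sketched roughly as follows. This is only an illustration in Python, not an existing MediaWiki mechanism: `NO_TRAIL_SCRIPTS`, `script_of` and `split_trail` are hypothetical, and the script is guessed from the first word of the Unicode character name, which is a crude heuristic rather than a real script property.

```python
import unicodedata

# Hypothetical list of scripts that do not rely on word spacing.
NO_TRAIL_SCRIPTS = {"CJK", "THAI", "HIRAGANA", "KATAKANA", "LAO", "KHMER"}

def script_of(ch):
    """Rough script guess: first word of the Unicode name, letters only."""
    if not ch.isalpha():
        return None
    try:
        return unicodedata.name(ch).split()[0]
    except ValueError:
        return None  # unnamed code point

def split_trail(link_text, text_after_link):
    """Extend the link only over letters in the same script as its last letter."""
    link_script = script_of(link_text[-1]) if link_text else None
    if link_script is None or link_script in NO_TRAIL_SCRIPTS:
        return "", text_after_link
    i = 0
    while i < len(text_after_link) and script_of(text_after_link[i]) == link_script:
        i += 1
    return text_after_link[:i], text_after_link[i:]
```

In this sketch a Latin link followed by "ům" picks up the ending, a Chinese link never gets a trail, and a Latin link that abuts Chinese or Cyrillic text stops at the script boundary.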

The other oddity is that a couple of languages currently specify some quote characters for their linkprefix and linkTrail (such as cv), so a link that appears in quotes will expand to include them. I'm not sure how proper that is.
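A small Python sketch of that quote behaviour, using a made-up pattern rather than the actual cv definition (which is not quoted in this thread): `QUOTE_TRAIL`, `PLAIN_TRAIL` and `split_trail` are hypothetical names.

```python
import re

# Hypothetical trail that lists the closing quote » (\u00bb) alongside letters.
QUOTE_TRAIL = re.compile(r"^([a-z\u00bb]+)(.*)\Z", re.DOTALL)

# Letters-only trail for comparison.
PLAIN_TRAIL = re.compile(r"^([a-z]+)(.*)\Z", re.DOTALL)

def split_trail(pattern, text_after_link):
    """Return (trail, rest) for the text that follows a closing ]]."""
    m = pattern.match(text_after_link)
    return (m.group(1), m.group(2)) if m else ("", text_after_link)
```

With the quote-aware pattern, a closing » right after the link becomes part of the link text; with the letters-only pattern it stays outside.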