
Special characters not handled properly in properties of datatype date (1.8 alpha)
Closed, Resolved (Public)

Description

On a French wiki, some French month names are not parsed correctly. For instance, "janvier" (January) works well, as do the other months without any special character, but "février" (February) and "décembre" (December) do not. A small warning sign is displayed to the right of the date.

[[Date de sortie::21 mars 2012]] -> ok

[[Date de sortie::21 février 2011]] -> not ok

[[Date de sortie::21 fév 2011]] (short version) -> not ok

[[Date de sortie::21 août 2012]] -> not ok

This problem was already present in SMW 1.6, but not in earlier versions.


Version: master
Severity: normal

Details

Reference
bz39342

Event Timeline

bzimport raised the priority of this task to Needs Triage. (Nov 22 2014, 12:48 AM)
bzimport set Reference to bz39342.

I have found the culprit, in includes/datavalues/SMW_DV_Time.php:202 (SMW 1.8.0.4):

$matches = preg_split( "/([T]?[0-2]?[0-9]:[\:0-9]+[+\-]?[0-2]?[0-9\:]+|[a-z,A-Z]+|[0-9]+|[ ])/u", $parsevalue, -1, PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY );

For instance, with the date "8 décembre 1999" (December), $matches is:
Array
(
    [0] => 8
    [1] =>  
    [2] => d
    [3] => \xc3\xa9
    [4] => cembre
    [5] =>  
    [6] => 1999
)

It should be something like:
Array
(
    [0] => 8
    [1] =>  
    [2] => d\xc3\xa9cembre
    [3] =>  
    [4] => 1999
)

I replaced [a-z,A-Z] with [\p{L}], which matches any Unicode letter (see http://www.php.net/manual/en/regexp.reference.unicode.php), and it works fine.
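To make the difference easy to reproduce outside SMW, here is a minimal standalone sketch that runs the same regex with both character classes. The helper name splitDateTokens is hypothetical and only exists for this comparison; it is not part of SMW.

```php
<?php
// Hypothetical helper (not part of SMW): runs the tokenizing regex from
// SMW_DV_Time.php with a configurable character class for month names.
function splitDateTokens( string $parsevalue, string $wordClass ): array {
    $pattern = '/([T]?[0-2]?[0-9]:[\:0-9]+[+\-]?[0-2]?[0-9\:]+|' . $wordClass . '+|[0-9]+|[ ])/u';
    return preg_split( $pattern, $parsevalue, -1, PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY );
}

// "\xc3\xa9" is the UTF-8 encoding of "é".
$date = "8 d\xc3\xa9cembre 1999";

$before = splitDateTokens( $date, '[a-z,A-Z]' ); // original class: ASCII letters and comma
$after  = splitDateTokens( $date, '[\p{L}]' );   // proposed class: any Unicode letter

print_r( $before ); // month name is split around the accented letter
print_r( $after );  // month name stays in one token
```

One difference that may be worth checking: the original class also matches a literal comma, so "mars," was kept as a single token, whereas with [\p{L}] the comma comes out as a separate token. Whether the rest of the parser cares about that is an assumption I have not verified.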

I think it's a harmless change. Any opinion?

That looks good, thanks for reporting, investigating and providing a patch :) I will try it out now.