mw.ustring.lower doesn't affect hypogegrammene
Closed, Duplicate (Public)

Description

I don't know what other characters or text functions this affects, but, well:

mw.ustring.lower("ΑΆἈἌἎἉἍἏᾼᾈᾌᾎᾉᾍᾏ")

αάἀἄἆἁἅἇᾼᾈᾌᾎᾉᾍᾏ

mw.ustring.upper("αάἀἄἆἁἅἇᾳᾀᾄᾆᾁᾅᾇ")

ΑΆἈἌἎἉἍἏᾼᾈᾌᾎᾉᾍᾏ

Event Timeline

ObsequiousNewt raised the priority of this task to Needs Triage.
ObsequiousNewt updated the task description.
ObsequiousNewt added a subscriber: ObsequiousNewt.
Restricted Application added a subscriber: Aklapper. Mar 18 2015, 12:14 AM
Aklapper triaged this task as Low priority. Mar 18 2015, 10:56 AM
Anomie added a subscriber: Anomie.

The behavior of Scribunto's ustring upper and lower methods depends on the behavior of PHP's mb_strtoupper and mb_strtolower, which also exhibit this behavior.

The problem seems to be that PHP's mb_strtolower() ignores any character that doesn't have the "uppercase" Unicode property, and these characters are flagged as "titlecase". Whether that's the correct behavior for mb_strtolower() or whether it should be checking for "uppercase or titlecase", I have no idea. But even if it is incorrect, it's not something that we're going to be able to fix here. You'll need to take it to https://bugs.php.net/; please comment here with the upstream bug number once you find/create it.
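The property distinction described above can be checked outside the wiki. The following is a minimal sketch in Python rather than PHP or Lua, so it does not call mb_strtolower() or mw.ustring at all. It only prints the Unicode general category of each character from the task description and models the described behavior with a hypothetical helper, lower_uppercase_only(), which lowercases characters in category "Lu" (uppercase) and leaves everything else, including "Lt" (titlecase), untouched. That helper is an assumption made for illustration, not PHP's actual implementation.

```python
import unicodedata

# Input string from the task description.
SAMPLE = "ΑΆἈἌἎἉἍἏᾼᾈᾌᾎᾉᾍᾏ"


def category_report(text):
    """Return (character, code point, general category) for each character."""
    return [(ch, f"U+{ord(ch):04X}", unicodedata.category(ch)) for ch in text]


def lower_uppercase_only(text):
    """Hypothetical model of the reported behavior (not PHP's real code):
    lowercase only characters whose general category is 'Lu'."""
    return "".join(
        ch.lower() if unicodedata.category(ch) == "Lu" else ch for ch in text
    )


# Show which characters are uppercase (Lu) and which are titlecase (Lt).
for ch, code_point, category in category_report(SAMPLE):
    print(ch, code_point, category)

# Under this model the titlecase characters are left as they are.
print(lower_uppercase_only(SAMPLE))
```

In this sketch the precomposed capitals with prosgegrammeni, such as U+1FBC, report category Lt, while plain capitals such as U+0391 report Lu, which is consistent with the explanation that only "uppercase" characters are being converted.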

Anomie changed the task status from Open to Stalled. Mar 18 2015, 4:18 PM
Aklapper lowered the priority of this task from Low to Lowest. Mar 20 2015, 11:30 AM
Krenair moved this task from Backlog to Reported Upstream on the Upstream board. Mar 29 2015, 8:17 PM

This is in the process of being fixed by T176370: Migrate to PHP 7 in WMF production. I'm going to close this as a duplicate of that task.