Add custom uppercase collation for Inari Sámi Wikipedia
Closed, Resolved / Public / Feature

Description

I am at an event with most of the active contributors to the Inari Sámi Wikipedia (saying this to establish that consensus is there), and they asked if it's possible to make categories sort in the proper Inari Sámi alphabetical order.

Inari Sámi is already in ICU, but its collation is missing the ordering of two characters, Ä and Á, so until that is fixed upstream, we should add a custom uppercase collation instead.
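To illustrate the general idea of a custom-alphabet collation, here is a minimal Python sketch. It is not MediaWiki's implementation: `make_sort_key` and `DEMO_ALPHABET` are hypothetical names, and the four-letter order is a placeholder chosen only to show the mechanism, not a statement of the real Inari Sámi alphabet. It assumes titles are in NFC, so that Ä and Á are single code points.

```python
def make_sort_key(alphabet):
    """Return a key function that orders strings by a custom alphabet."""
    # Rank of each letter in the custom alphabet, compared in uppercase.
    rank = {letter.upper(): i for i, letter in enumerate(alphabet)}

    def sort_key(text):
        # Letters in the alphabet sort by their rank; any other character
        # sorts after all alphabet letters, ordered by code point.
        return [
            (0, rank[ch]) if ch in rank else (1, ord(ch))
            for ch in text.upper()
        ]

    return sort_key


# Placeholder order for demonstration only (an assumption, not the
# actual Inari Sámi alphabet).
DEMO_ALPHABET = ["A", "Á", "Ä", "B"]

titles = ["Bb", "Äb", "Ab", "Áb"]

# Plain code point order puts the accented letters after "B".
codepoint_sorted = sorted(titles)

# The custom key places them where the alphabet says they belong.
custom_sorted = sorted(titles, key=make_sort_key(DEMO_ALPHABET))

print(codepoint_sorted)
print(custom_sorted)
```

Comparing in uppercase means "äb" and "Äb" land in the same place, which is what an "uppercase" collation implies; a full ICU collation would additionally handle secondary and tertiary differences.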

Event Timeline

jhsoby-WMNO renamed this task from Use (adapted) ICU collation for Inari Sámi Wikipedia to Add custom uppercase collation for Inari Sámi Wikipedia. Mar 15 2025, 4:28 PM
jhsoby-WMNO updated the task description.

Change #1128008 had a related patch set uploaded (by Jon Harald Søby; author: Jon Harald Søby):

[mediawiki/core@master] Add uppercase collation for Inari Sámi

https://gerrit.wikimedia.org/r/1128008

Huh, weirdly, they didn't just forget those characters: ICU intentionally put them in the wrong place, at the end. https://github.com/unicode-org/cldr/blob/main/common/collation/smn.xml#L27

Oh, that is weird indeed! It seems like their collation is based on the order that was given in the English Wikipedia article at the time it was added, so it's probably technically our fault. 😅

I've filed an upstream issue to fix that, and added a second task as the parent of this one as a reminder to switch once the fix lands, so this task can be about the temporary solution of using a custom uppercase collation.

Change #1128008 merged by jenkins-bot:

[mediawiki/core@master] Add uppercase collation for Inari Sámi

https://gerrit.wikimedia.org/r/1128008

@jhsoby Is there anything remaining in this task before it can be closed?