
Set $wgCategoryCollation to 'uca-hsb' on Upper Sorbian Wikipedia (hsb.wp) and rebuild category sort keys
Closed, Resolved (Public)


Consensus of all three users ;)

Event Timeline

J_budissin assigned this task to Glaisher.
J_budissin raised the priority of this task from to Medium.
J_budissin updated the task description.
J_budissin added a subscriber: J_budissin.

A link to the discussion would still be helpful, in case this ever comes up in the future. Thanks :-)

Se4598 removed a project: Patch-For-Review.

Change 192803 had a related patch set uploaded (by Odder):
Set $wgCategoryCollation to 'uca-hsb' on hsbwiki

The patch is currently on hold waiting for someone to provide us with a link to some sort of community consensus for this change.

Scheduled for deployment during today's morning SWAT window, i.e. at around 16:00 UTC.

Change 192803 merged by jenkins-bot:
Set $wgCategoryCollation to 'uca-hsb' on hsbwiki

tomasz added a subscriber: Krenair.

This has now been deployed on the production cluster by @Krenair.

How long will it take to rebuild the category sort keys?

I checked one random category, and the keys seem to have been rebuilt already. Let us know if that is not the case :-)


The maintenance script to rebuild the category sort keys was run shortly after deploying the above patch. Is it still not appearing correctly?

It is working differently, but not correctly: it sorts Ł as plain L, Š as plain S and so forth. That should not be the case. Is it possible that the Unicode sorting scheme is incorrect? The order should be A, B, C, Č, Ć, D, Dź, E, F, G, H, Ch, I, J, K, Ł, L...
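To make the expected order concrete, here is a minimal Python sketch of a sort key that follows the letter sequence quoted above. It is only an illustration, not how MediaWiki or ICU implement collation. The names `ORDER`, `DIGRAPHS` and `sort_key` are hypothetical, the sample words are arbitrary, and only the letters listed in the comment (up to L) are ranked; anything else falls back to code point order after them.

```python
import unicodedata

# Letter order as quoted in the comment above; the comment stops at "L",
# so later letters are deliberately not ranked here.
ORDER = ["a", "b", "c", "č", "ć", "d", "dź", "e", "f", "g", "h", "ch",
         "i", "j", "k", "ł", "l"]
RANK = {letter: i for i, letter in enumerate(ORDER)}
DIGRAPHS = ("dź", "ch")  # two-character letters that sort as one unit


def sort_key(word):
    """Return a list of (group, rank) tuples, one per letter of the word."""
    s = unicodedata.normalize("NFC", word).lower()
    key = []
    i = 0
    while i < len(s):
        pair = s[i:i + 2]
        if pair in DIGRAPHS:
            # Treat "dź" and "ch" as single letters.
            key.append((0, RANK[pair]))
            i += 2
        elif s[i] in RANK:
            key.append((0, RANK[s[i]]))
            i += 1
        else:
            # Assumption: letters not listed above sort after the listed
            # ones, by code point.
            key.append((1, ord(s[i])))
            i += 1
    return key


words = ["Lipa", "Łuka", "Chata", "Hala", "Cyrkej", "Čas", "Dźeń", "Dom", "Ćma"]
print(sorted(words, key=sort_key))
# ['Cyrkej', 'Čas', 'Ćma', 'Dom', 'Dźeń', 'Hala', 'Chata', 'Łuka', 'Lipa']
```

With this key, Ł sorts as its own letter before L and Ch sorts after H, which is the behaviour the comment asks for.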

If it is not working as expected, I think you should file a new task for that. @matmarex might be able to help, I think. ;)

Yes, it seems that the ICU library, which MediaWiki uses for collation, doesn't actually support Upper Sorbian in the version used on WMF wikis. Whoops :( I've been able to reproduce the incorrect behavior locally using ICU 52 (the one that comes with Ubuntu 14).

Judging by the dates on and, it will probably be supported in ICU 54. WMF will surely upgrade to that version at some point, but I have no idea when that will happen.

Hm. I didn't know about that. Would it be theoretically possible to at least set the collation order of the letters manually, or is that too much to ask for?

It's sadly impossible, as far as I know. We use ICU via the Collator class in PHP, and while ICU supports custom collations, PHP's Collator doesn't.

Upgrade to libicu52 is

Ubuntu 15.10 only has libicu52, so a backport of libicu54, although not impossible, might be more complex. However, might doing the upgrade once now be better than upgrading to 52 and then to 54 in the near future?