
Set $wgCategoryCollation to 'uca-mk-u-kn' on Macedonian wikis and rebuild category sort keys
Closed, Resolved | Public | 1 Story Point

Description

Author: misoss

Description:
Pages are wrongly sorted on the Macedonian Wikipedia (as well as all other Macedonian Wiki projects). For instance, see: http://mk.wikipedia.org/wiki/%D0%9A%D0%B0%D1%82%D0%B5%D0%B3%D0%BE%D1%80%D0%B8%D1%98%D0%B0:%D0%9A%D0%BB%D0%B0%D1%81%D0%B8%D1%84%D0%B8%D0%BA%D0%B0%D1%86%D0%B8%D1%98%D0%B0_%D0%BD%D0%B0_%D0%B3%D0%BB%D0%B0%D0%B2%D0%BD%D0%B8%D1%82%D0%B5_%D1%82%D0%B5%D0%BC%D0%B8

Screenshot:

The letter „Ј“ is found at the beginning of the list, instead of after „И“. I believe that this is a database problem, namely in the collation scheme used. For Macedonian, utf8_unicode_ci would be appropriate instead of utf8_general_ci.
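The misplacement described above can be reproduced with a minimal sketch, assuming the sort order in use compares raw Unicode code points of the uppercased title. "Ј" is U+0408, which precedes "А" (U+0410), so it sorts ahead of the whole basic Cyrillic block. The sample titles and the helper names `codepoint_key` and `alphabet_key` are illustrative only, not taken from MediaWiki.

```python
# Macedonian Cyrillic alphabet in its conventional order (31 letters).
MK_ALPHABET = "АБВГДЃЕЖЗЅИЈКЛЉМНЊОПРСТЌУФХЦЧЏШ"
MK_RANK = {letter: rank for rank, letter in enumerate(MK_ALPHABET)}


def codepoint_key(title):
    """Sort key that compares uppercased characters by code point."""
    return [ord(ch) for ch in title.upper()]


def alphabet_key(title):
    """Sort key that follows the Macedonian alphabet.

    Characters outside the alphabet sort after all letters, by code point.
    """
    return [
        (0, MK_RANK[ch]) if ch in MK_RANK else (1, ord(ch))
        for ch in title.upper()
    ]


# Hypothetical page titles, chosen to start with Г, И, Ј and К.
titles = ["Историја", "Јазик", "Култура", "Географија"]

by_codepoint = sorted(titles, key=codepoint_key)
by_alphabet = sorted(titles, key=alphabet_key)

print(by_codepoint)  # ['Јазик', 'Географија', 'Историја', 'Култура']
print(by_alphabet)   # ['Географија', 'Историја', 'Јазик', 'Култура']
```

The same effect applies to Ѓ, Ѕ, Љ, Њ, Ќ and Џ, which also have code points below U+0410.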


Version: unspecified
Severity: enhancement
URL: http://mk.wikipedia.org

Details

Reference
bz24953

Event Timeline

bzimport raised the priority of this task to Normal. Nov 21 2014, 11:07 PM
bzimport set Reference to bz24953.
bzimport added a subscriber: Unknown Object (MLST).

ayg wrote:

*** This bug has been marked as a duplicate of bug 164 ***

kaldari renamed this task from "Invalid sorting (collation problem)" to "Set $wgCategoryCollation to 'uca-mk' on Macedonian wikis and rebuild category sort keys". Aug 7 2016, 6:27 PM
kaldari reopened this task as Open.

mk is already listed in $tailoringFirstLetters with an empty array, but https://ssl.icu-project.org/trac/browser/icu/trunk/source/data/coll/mk.txt seems to suggest that it may need some tailorings. @Bawolff: Can you take a look at that file and let me know if any first letter tailorings are needed? Also, it would be nice to document how these first letter tailorings are generated, as I still don't understand that part.

I'm pretty sure Ѓ and Ќ should be. Usually a first letter tailoring is the capitalized version of whatever comes after a single < in the ICU rules.
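The rule of thumb above can be sketched in a few lines. This is only an illustration of the heuristic, not how MediaWiki generates $tailoringFirstLetters: the function name `first_letter_candidates` is made up, and the rule string is a sample written in ICU rule syntax, not a verbatim copy of mk.txt.

```python
import re

# A single "<" marks a primary difference in ICU collation rules.
# "<<" and "<<<" mark secondary and tertiary differences and are skipped.
SINGLE_LT = re.compile(r"(?<!<)<(?!<)\s*([^\s<&])")


def first_letter_candidates(rules):
    """Return the capitalized character following each single '<'."""
    seen = []
    for match in SINGLE_LT.finditer(rules):
        letter = match.group(1).upper()
        if letter not in seen:
            seen.append(letter)
    return seen


# Sample rules (an assumption for illustration): ѓ and ќ each get a
# primary position of their own, and their capitals differ only at the
# tertiary level.
sample_rules = "&ԃ<ѓ<<<Ѓ&ћ<ќ<<<Ќ"

print(first_letter_candidates(sample_rules))  # ['Ѓ', 'Ќ']
```

With this sample input the result matches the two letters named in the comment above. The real rule file should still be checked by hand.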

Change 303690 had a related patch set uploaded (by Kaldari):
Updating $tailoringFirstLetters for Macedonian

https://gerrit.wikimedia.org/r/303690

kaldari updated the task description. Aug 9 2016, 4:49 PM

Change 303690 merged by jenkins-bot:
Updating $tailoringFirstLetters for Macedonian, per https://ssl.icu-project.org/trac/browser/icu/trunk/source/data/coll/mk.txt

https://gerrit.wikimedia.org/r/303690

DannyH set the point value for this task to 1. Aug 9 2016, 5:14 PM
DannyH moved this task from To be estimated/discussed to Estimated on the Community-Tech board.
kaldari renamed this task from "Set $wgCategoryCollation to 'uca-mk' on Macedonian wikis and rebuild category sort keys" to "Set $wgCategoryCollation to 'uca-mk-u-kn' on Macedonian wikis and rebuild category sort keys". Aug 9 2016, 7:49 PM

Just talked to the Macedonian editor who is requesting this change, and he says that they want to switch to numeric sorting while they are at it. Adjusted the task title accordingly.
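For readers unfamiliar with the "-u-kn" suffix: it requests numeric ordering, where runs of digits are compared by value instead of character by character. The sketch below only imitates that behaviour in plain Python to show the difference. `numeric_key` and the page titles are hypothetical, and the real collation is done by ICU, not by code like this.

```python
import re

DIGIT_RUN = re.compile(r"(\d+)")


def numeric_key(title):
    """Split a title into text and digit runs; compare digit runs as integers.

    re.split with a capturing group always alternates text, digits, text, ...
    so odd positions hold digit runs and even positions hold text.
    """
    parts = DIGIT_RUN.split(title)
    return [int(part) if i % 2 else part for i, part in enumerate(parts)]


# Hypothetical titles ("Страница" means "Page").
titles = ["Страница 10", "Страница 2", "Страница 1"]

plain = sorted(titles)
numeric = sorted(titles, key=numeric_key)

print(plain)    # ['Страница 1', 'Страница 10', 'Страница 2']
print(numeric)  # ['Страница 1', 'Страница 2', 'Страница 10']
```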

The first letter tailorings made the deployment train, so they should be deployed to Macedonian Wikipedia this Thursday. Hopefully we can switch the collation and rebuild the sort keys next week.

Change 304851 had a related patch set uploaded (by Kaldari):
Change sorting for mkwiki from uppercase to uca-mk-u-kn

https://gerrit.wikimedia.org/r/304851

Announcement posted to Macedonian Village Pump.

Change 304851 merged by jenkins-bot:
Change sorting for mkwiki from uppercase to uca-mk-u-kn

https://gerrit.wikimedia.org/r/304851

Mentioned in SAL [2016-08-15T23:08:21Z] <dereckson@tin> Synchronized wmf-config/InitialiseSettings.php: Set collation to uca-mk-u-kn on mk.wikipedia (T26953) (duration: 01m 00s)

kaldari closed this task as Resolved. Aug 15 2016, 11:31 PM
kaldari claimed this task.

All done. The updateCollation.php script took 14 minutes to run.

DannyH moved this task from Estimated to Archive on the Community-Tech board.