Set $wgCategoryCollation to 'uca-mk-u-kn' on Macedonian wikis and rebuild category sort keys
Closed, Resolved | Public | 1 Estimated Story Points

Description

Author: misoss

Description:
Pages are wrongly sorted on the Macedonian Wikipedia (as well as all other Macedonian Wiki projects). For instance, see: http://mk.wikipedia.org/wiki/%D0%9A%D0%B0%D1%82%D0%B5%D0%B3%D0%BE%D1%80%D0%B8%D1%98%D0%B0:%D0%9A%D0%BB%D0%B0%D1%81%D0%B8%D1%84%D0%B8%D0%BA%D0%B0%D1%86%D0%B8%D1%98%D0%B0_%D0%BD%D0%B0_%D0%B3%D0%BB%D0%B0%D0%B2%D0%BD%D0%B8%D1%82%D0%B5_%D1%82%D0%B5%D0%BC%D0%B8

Screenshot: Screen Shot 2016-08-09 at 9.48.46 AM.png (538×1 px, 124 KB)

The letter „Ј“ is found at the beginning of the list, instead of after „И“. I believe that this is a database problem, namely in the collation scheme used. For Macedonian, utf8_unicode_ci would be appropriate instead of utf8_general_ci.
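A minimal sketch of why „Ј“ can end up first, assuming titles are compared by Unicode code point rather than by a Macedonian-aware collation. The helper `mk_sort_key` and the sample titles are hypothetical and only illustrate the effect; this is not how MediaWiki or the database computes sort keys.

```python
# Macedonian alphabet in its conventional order (31 letters).
MK_ALPHABET = "АБВГДЃЕЖЗЅИЈКЛЉМНЊОПРСТЌУФХЦЧЏШ"
MK_RANK = {letter: i for i, letter in enumerate(MK_ALPHABET)}


def mk_sort_key(title):
    """Hypothetical key: rank each character by its alphabet position.

    Characters outside the alphabet (spaces, digits, Latin) are ranked
    after all Macedonian letters, in code point order.
    """
    return [MK_RANK.get(ch, len(MK_ALPHABET) + ord(ch)) for ch in title.upper()]


titles = ["Историја", "Јазик", "Култура", "Азија"]

# Ј is U+0408, which is below А (U+0410), so it sorts first by code point.
by_code_point = sorted(titles)

# With an alphabet-aware key, Ј falls between И and К.
by_alphabet = sorted(titles, key=mk_sort_key)

print(by_code_point)
print(by_alphabet)
```

The same applies to Ѓ, Ѕ, Љ, Њ, Ќ and Џ, whose code points are all below that of А.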


Version: unspecified
Severity: enhancement
URL: http://mk.wikipedia.org

Event Timeline

bzimport raised the priority of this task to Medium. (Nov 21 2014, 11:07 PM)
bzimport set Reference to bz24953.
bzimport added a subscriber: Unknown Object (MLST).

ayg wrote:

*** This bug has been marked as a duplicate of bug 164 ***

kaldari renamed this task from "Invalid sorting (collation problem)" to "Set $wgCategoryCollation to 'uca-mk' on Macedonian wikis and rebuild category sort keys". (Aug 7 2016, 6:27 PM)
kaldari reopened this task as Open.

mk is already listed in $tailoringFirstLetters with an empty array, but https://ssl.icu-project.org/trac/browser/icu/trunk/source/data/coll/mk.txt seems to suggest that it may need some tailorings. @Bawolff: Can you take a look at that file and let me know if any first-letter tailorings are needed? Also, it would be nice to document how these first-letter tailorings are generated, as I still don't understand that part.

I'm pretty sure Ѓ and Ќ should be included. Usually a first letter should be the capitalized version of whatever comes after a single < in the collation rules.
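The rule of thumb above can be sketched roughly as follows. This is an illustration under assumptions, not the way MediaWiki generates $tailoringFirstLetters: the function `first_letters_from_rules` is hypothetical, and `SAMPLE_RULES` is an illustrative string written in ICU rule syntax, not a copy of mk.txt. It also ignores contractions, quoting, and other rule syntax.

```python
import re

# A single '<' marks a primary-level difference; '<<' and '<<<' mark
# secondary and tertiary differences and must not match.
PRIMARY_DIFFERENCE = re.compile(r"(?<!<)<(?!<)\s*(\S)")


def first_letters_from_rules(rules):
    """Return the capitalized character that follows each single '<'."""
    return [m.group(1).upper() for m in PRIMARY_DIFFERENCE.finditer(rules)]


# Illustrative rule string (an assumption, not the real file content).
SAMPLE_RULES = "&г < ѓ <<< Ѓ &к < ќ <<< Ќ"

print(first_letters_from_rules(SAMPLE_RULES))
```

Characters that follow `<<<` are only case variants of an existing letter, which is why they are skipped as first-letter candidates here.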

Change 303690 had a related patch set uploaded (by Kaldari):
Updating $tailoringFirstLetters for Macedonian

https://gerrit.wikimedia.org/r/303690

DannyH set the point value for this task to 1. (Aug 9 2016, 5:14 PM)
DannyH moved this task from To Be Estimated/Discussed to Estimated on the Community-Tech board.
kaldari renamed this task from "Set $wgCategoryCollation to 'uca-mk' on Macedonian wikis and rebuild category sort keys" to "Set $wgCategoryCollation to 'uca-mk-u-kn' on Macedonian wikis and rebuild category sort keys". (Aug 9 2016, 7:49 PM)

Just talked to the Macedonian editor who is requesting this change, and he says that they want to switch to numeric sorting while they are at it. Adjusted the task title accordingly.
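For reference, a rough sketch of what numeric sorting changes, assuming it means that runs of digits are compared by numeric value instead of character by character. The helper `numeric_sort_key` and the sample titles are hypothetical; the real behavior comes from the collation itself, not from code like this.

```python
import re


def numeric_sort_key(title):
    """Hypothetical key: split into digit and non-digit runs.

    Digit runs are compared as integers, everything else as text.
    Each element is a tuple of the same shape so keys stay comparable.
    """
    key = []
    # With a capturing group, odd indices hold the digit runs.
    for i, part in enumerate(re.split(r"([0-9]+)", title)):
        if not part:
            continue
        if i % 2 == 1:
            key.append((0, int(part), ""))
        else:
            key.append((1, 0, part))
    return key


titles = ["Глава 10", "Глава 2", "Глава 1"]

plain = sorted(titles)                          # character by character
numeric = sorted(titles, key=numeric_sort_key)  # digits as numbers

print(plain)
print(numeric)
```

Without numeric sorting, "Глава 10" sorts before "Глава 2" because "1" is compared with "2" as a character.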

The first-letter tailorings made the deployment train, so they should be deployed to Macedonian Wikipedia this Thursday. Hopefully we can switch the collation and rebuild the sort keys next week.

Change 304851 had a related patch set uploaded (by Kaldari):
Change sorting for mkwiki from uppercase to uca-mk-u-kn

https://gerrit.wikimedia.org/r/304851

Change 304851 merged by jenkins-bot:
Change sorting for mkwiki from uppercase to uca-mk-u-kn

https://gerrit.wikimedia.org/r/304851

Mentioned in SAL [2016-08-15T23:08:21Z] <dereckson@tin> Synchronized wmf-config/InitialiseSettings.php: Set collation to uca-mk-u-kn on mk.wikipedia (T26953) (duration: 01m 00s)

kaldari claimed this task.

All done. The updateCollation.php script took 14 minutes to run.