Steps to Reproduce
Example: category "Social networks" https://lt.wikipedia.org/wiki/Kategorija:Socialiniai_tinklai has an article "Youtube" listed under I (capital i) while it should be under Y.
Hi @Nomad, thanks for taking the time to report this and welcome to Wikimedia Phabricator!
Quick summary / some background:
Looking at wgCategoryCollation in https://phabricator.wikimedia.org/source/mediawiki-config/browse/master/wmf-config/InitialiseSettings.php : 'ltwiki' => 'uca-lt', // T123627. Both T123627 and https://en.wikipedia.org/wiki/Lithuanian_orthography#Alphabet give the alphabet order as Ii Įį Yy Jj.
uca-lt is defined in https://github.com/unicode-org/cldr/blob/master/common/collation/lt.xml
@Nomad: Feel free to write a software patch (in the operations/mediawiki-config repository, see last comment) if you'd like this to happen faster. Thanks!
See https://www.mediawiki.org/wiki/Bug_management/Development_prioritization#Why_has_nobody_fixed_this_issue_yet%3F for general info.
If it's a bug, then the bug is in the ICU sort order, not in MediaWiki's first-letter identification.
> $collator = new Collator('lt');
> print $collator->compare('YouTube', 'Ixia');
-1
> print $collator->compare('YouTube', 'Instagram');
1
So a sorted list of those three words would be: Instagram, YouTube, Ixia. If we had separate headings for I and Y then the sections would be split, with duplicate headings for letter I:
== I ==
* Instagram
== Y ==
* YouTube
== I ==
* Ixia
That's why I and Y need to be combined into a single section.
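To make the heading problem concrete, here is a minimal Python sketch. It is not MediaWiki's implementation: the sort order is hard-coded from the Collator('lt') results quoted above rather than computed by ICU, and the helper names (sections, naive_heading, merged_heading) are hypothetical. It only illustrates why per-letter headings split the list and why folding Y into the I heading avoids that.

```python
from itertools import groupby

# Order implied by the Collator('lt') comparisons quoted above.
# Hard-coded as an assumption; this sketch does not call ICU.
SORTED_TITLES = ["Instagram", "YouTube", "Ixia"]


def sections(titles, heading_of):
    """Group consecutive titles that share a heading, keeping sort order."""
    return [(h, list(group)) for h, group in groupby(titles, key=heading_of)]


def naive_heading(title):
    """Hypothetical rule: one heading per literal first letter."""
    return title[0].upper()


def merged_heading(title):
    """Hypothetical rule: treat Y as part of the I section."""
    first = title[0].upper()
    return "I" if first in ("I", "Y") else first


split = sections(SORTED_TITLES, naive_heading)
merged = sections(SORTED_TITLES, merged_heading)

print([h for h, _ in split])   # ['I', 'Y', 'I']
print([h for h, _ in merged])  # ['I']
```

With separate headings the I section appears twice, once on each side of Y. With the merged rule all three titles fall under a single I heading, which matches what the category page shows.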