
Broken sorting and multi-page categories for Cyrillic wikis
Closed, Resolved · Public

Description

Sorting in categories and multi-page category management does not work in Cyrillic wikis.

Steps to reproduce:

  1. Sorting in categories: https://uk.wikipedia.org/wiki/Категорія:Померли_2015
    1. Expected: alphabetic order, Latin S, then Cyrillic А Б В ...
    2. Observed: random order К (cyr), О (cyr), S (lat), А (cyr), Б (cyr), В (cyr) etc.
  2. Multi-page category: https://uk.wikipedia.org/w/index.php?title=Категорія:Померли_2015&pagefrom=Джекі+Коллінз#mw-pages
    1. Expected: next 200 articles starting from Джекі Коллінз
    2. Observed: 200 articles starting from Отто Каріус, with Джекі Коллінз as penultimate article on the list (thus getting wrong 200 articles)

Same for other wikis:

  1. Sorting in Russian https://ru.wikipedia.org/wiki/Категория:Умершие_в_2015_году
    1. Expected: alphabetic order, Latin D, G, H, I, S, then Cyrillic А Б В ...
    2. Observed: random order, М, Р, Э (all cyr), D, G, H, I, S (all lat), А, Б, В (all cyr) etc.
  2. Multi-page categories in Serbian: https://sr.wikipedia.org/w/index.php?title=Категорија:Умрли_2014.&pagefrom=Џери+Конлон#mw-pages
    1. Expected: 200 articles from Џери Конлон
    2. Observed: 200 articles from Ana Marija Matute Auseho

This completely breaks multi-page category navigation (it is impossible to reach any page beyond the first), so this should be Unbreak Now!
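
For reference, the expected pagefrom behaviour amounts to keyset pagination over the category's sorted keys. Below is a minimal Python sketch of that expectation, not MediaWiki's actual code. It assumes that plain code-point comparison stands in for the wiki's real collation and that page titles double as sort keys; `page_from` and "Sample Latin" are illustrative names only.

```python
from bisect import bisect_left


def page_from(sorted_keys, pagefrom, limit=200):
    """Return up to `limit` keys that are >= `pagefrom`, in sorted order.

    `sorted_keys` must already be sorted under ONE consistent ordering;
    if it is not, the binary search lands in the wrong place.
    """
    start = bisect_left(sorted_keys, pagefrom)
    return sorted_keys[start:start + limit]


# Illustrative data: titles from this report plus one hypothetical Latin title.
# Code-point order puts Latin letters before Cyrillic ones.
titles = sorted([
    "Тарік Азіз",
    "Отто Каріус",
    "Джекі Коллінз",
    "Вілфрід Агбонавбаре",
    "Sample Latin",  # hypothetical entry, not from the report
])

next_page = page_from(titles, "Джекі Коллінз", limit=2)
```

With consistent keys, `next_page` starts at Джекі Коллінз, which is what the report expects; the observed behaviour suggests the stored keys are not all comparable with each other.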

Event Timeline

NickK created this task. May 26 2016, 9:43 AM
Restricted Application added subscribers: Zppix, Base, Aklapper. May 26 2016, 9:43 AM
NickK triaged this task as Unbreak Now! priority. May 26 2016, 9:43 AM

Categories become unusable because of this, hence the Unbreak Now! priority.

Restricted Application added subscribers: Luke081515, TerraCodes, Urbanecm. May 26 2016, 9:43 AM
NickK updated the task description. May 26 2016, 9:48 AM

My guess is that recently changed (or recently created) articles are not placed correctly in these categories: they are added at the beginning of the category instead of at the position given by their sort key. This guess might be wrong, though.

Did your wiki by any chance recently ask for a change of category sorting (collation)?

Observation from testing:

Removed the sortkey from Тарік Азіз.
The order in https://uk.wikipedia.org/wiki/Категорія:Померли_2015 became К (cyr), О (cyr), Т (cyr), S (lat), А (cyr), Б (cyr), В (cyr).
Restored the proper sortkey.
The order in https://uk.wikipedia.org/wiki/Категорія:Померли_2015 became А (cyr), К (cyr), О (cyr), S (lat), Б (cyr), В (cyr). However, Тарік Азіз is first in the list under "А", while with proper sorting it should come after Вілфрід Агбонавбаре.

Neither a page purge nor a blank edit forces proper recategorization.

NickK added a comment. Edited May 26 2016, 11:18 AM

Did your wiki by any chance recently ask for a change of category sorting (collation)?

We asked for it several years ago, and it was done in 2013 (T43040). There have been no recent requests for Ukrainian, so this is definitely not the cause.

Arbnos added a subscriber: Arbnos. May 26 2016, 3:43 PM

I believe Ops is running maintenance/updateCollation.php on all the wikis today. It may be a temporary problem, as the script takes anywhere from a couple of hours to a couple of days to run, depending on the size of the wiki.

Yeah, see announcement here: https://lists.wikimedia.org/pipermail/wikitech-l/2016-May/085741.html and information on T86096. This should be fixed by Saturday.

The results in the links in the description seem to be different now, although still not correct. I would suggest giving it 24 hours and checking whether it is fixed tomorrow.
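
One way to picture the intermediate state is a table in which some rows already carry keys from a new key scheme while the rest still carry old ones. The Python sketch below is an illustration under that assumption only: `old_key`, `new_key`, the one-byte prefix, and the sample titles are all hypothetical and do not reflect the real sort key format or what updateCollation.php does internally.

```python
def old_key(title: str) -> bytes:
    # Hypothetical "old" scheme: the raw UTF-8 bytes of the title.
    return title.encode("utf-8")


def new_key(title: str) -> bytes:
    # Hypothetical "new" scheme: a marker byte followed by the UTF-8 bytes.
    # Keys from the two schemes are not meaningfully comparable.
    return b"\x01" + title.encode("utf-8")


def listing(rows: dict) -> list:
    # A category listing orders pages by their STORED key.
    return sorted(rows, key=rows.get)


def migrate(rows: dict, key_func) -> dict:
    # Recompute every stored key with one scheme (the role of a bulk update).
    return {title: key_func(title) for title in rows}


titles = ["Sam", "Андрій", "Борис", "Кирило", "Олена"]  # sample data
rows = {t: old_key(t) for t in titles}

# Two pages are re-saved mid-transition and get new-scheme keys.
for t in ("Кирило", "Олена"):
    rows[t] = new_key(t)

mixed = listing(rows)                    # re-saved pages jump to the front
fixed = listing(migrate(rows, new_key))  # consistent keys restore the order
```

In this toy model the re-saved pages sort ahead of everything else until all keys are regenerated, which resembles the "К, О, S, А, Б, В" pattern in the description.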

kaldari lowered the priority of this task from Unbreak Now! to High. May 26 2016, 9:51 PM
Joe added a subscriber: Joe. May 26 2016, 10:11 PM

The script on ruwiki will, by my estimation, start to run tomorrow afternoon and finish sometime in the evening or on Saturday.

The reason is that ruwiki is on database shard s6, where the two largest wikis affected by the transition reside: frwiki and ruwiki itself.

Unfortunately, we can't run more than one wiki in parallel on the same shard, since that would risk overloading the database.

Please be patient while we conclude the transition.
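
The scheduling constraint described above (one wiki at a time per shard, with different shards able to proceed independently) can be sketched in Python as follows. This is an illustration only, not the tooling Ops actually used: `run_all`, `run_one`, and the "examplewiki"/"sX" entry are hypothetical; only the placement of frwiki and ruwiki on s6 comes from this thread.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def group_by_shard(wiki_to_shard: dict) -> dict:
    # Build one ordered queue of wikis per database shard.
    queues = defaultdict(list)
    for wiki, shard in wiki_to_shard.items():
        queues[shard].append(wiki)
    return dict(queues)


def run_shard(wikis: list, run_one) -> list:
    # Wikis on the same shard run strictly one after another.
    return [run_one(wiki) for wiki in wikis]


def run_all(wiki_to_shard: dict, run_one) -> dict:
    # One worker per shard: shards run in parallel, wikis within a shard do not.
    queues = group_by_shard(wiki_to_shard)
    with ThreadPoolExecutor(max_workers=len(queues) or 1) as pool:
        futures = {
            shard: pool.submit(run_shard, wikis, run_one)
            for shard, wikis in queues.items()
        }
        return {shard: future.result() for shard, future in futures.items()}


placement = {
    "frwiki": "s6",
    "ruwiki": "s6",
    "examplewiki": "sX",  # hypothetical wiki and shard
}
results = run_all(placement, lambda wiki: wiki + ":done")
```

Under this model ruwiki cannot start until frwiki has finished on s6, which is consistent with the estimate that ruwiki would run later than the other wikis.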

Joe edited projects, added HHVM, Operations; removed Regression. May 26 2016, 10:11 PM
Joe added a comment. May 27 2016, 7:01 AM

The script is running on ruwiki now; I was clearly too pessimistic last night. I'll report when it is done. @NickK, is the situation any better now?

Joe added a comment. May 27 2016, 2:17 PM

The script has finished running and, as far as I can see, all pages reported in this ticket are now rendered correctly.

matmarex closed this task as Resolved. May 27 2016, 4:11 PM

@NickK, please reopen if you notice any remaining issues.

Thanks, I confirm that the problem is resolved.