Category headers wrong when using uca-default collation
Closed, Resolved, Public

Description

Everything is categorized under ۞

U+06DE ARABIC START OF RUB EL HIZB


Version: 1.18.x
Severity: normal
URL: http://translatewiki.net/wiki/Category:User_fi

Details

Reference
bz28540

Event Timeline

bzimport raised the priority of this task from to Low. Nov 21 2014, 11:27 PM
bzimport set Reference to bz28540.
bzimport added a subscriber: Unknown Object (MLST).

(In reply to comment #1)

Looks like everything has no sort key...
http://translatewiki.net/w/api.php?action=query&list=categorymembers&cmprop=sortkey|title&cmtitle=category:Languages_with_a_Wikiquote_project&cmlimit=max

Never mind, that's unrelated (the API doesn't like invalid UTF-8 strings; filed separately as bug 26614).

Also probably unrelated: it looks like CategoryPage doesn't like sortkeys that have null bytes in them. Try pressing "next 500" on http://translatewiki.net/w/i.php?title=Category:User_en&pagefrom=%29%80%05%09%3C%40%40P%01%0A%01%84%8F%07%00 (note how the key is cut off directly after the %00).
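For anyone wanting to inspect that `pagefrom` value, here is a minimal Python sketch. It only percent-decodes the string from the URL above and checks two properties: whether the bytes end in a null, and whether they are valid UTF-8. The variable names are made up for illustration, and nothing here reflects how CategoryPage itself handles the key.

```python
from urllib.parse import unquote_to_bytes

# pagefrom value copied verbatim from the URL in the comment above
PAGEFROM = "%29%80%05%09%3C%40%40P%01%0A%01%84%8F%07%00"

# Percent-decode to raw bytes; the key is binary, so do not decode to str yet
raw = unquote_to_bytes(PAGEFROM)

# Does the key end in a null byte?
ends_with_null = raw.endswith(b"\x00")

# Is the key valid UTF-8? A binary sort key generally is not.
try:
    raw.decode("utf-8")
    is_valid_utf8 = True
except UnicodeDecodeError:
    is_valid_utf8 = False

print(raw.hex(" "), ends_with_null, is_valid_utf8)
```

The key is a binary collation key rather than text, which is consistent with the API complaint about invalid UTF-8 mentioned above.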

(In reply to comment #2)

Never mind, that was totally wrong. The sortkey ending in a null byte was just a coincidence.

Anyway, both the broken first-letter headers and the broken paging were caused by r83544: some parts of the code expected the human-readable sortkey but were getting the binary sortkey. That causes problems with uca-default, but wouldn't really be noticeable with the uppercase collation. Reverted that revision in r86100.
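A toy Python sketch of that mismatch, for illustration only. This is not MediaWiki code: `toy_binary_sortkey` and `first_letter_header` are hypothetical helpers, and the byte transform is an invented stand-in for a real UCA sort key, chosen only so that the result is opaque bytes rather than text.

```python
def toy_binary_sortkey(text: str) -> bytes:
    """Hypothetical stand-in for a collation key: opaque bytes, not text.

    Each character becomes a two-byte weight. Real UCA keys are far more
    complex; this only works for ASCII input and is purely illustrative.
    """
    return b"".join((ord(ch.lower()) + 0x2900).to_bytes(2, "big") for ch in text)


def first_letter_header(sortkey: str) -> str:
    """Naive header logic that assumes a human-readable sortkey."""
    return sortkey[:1].upper()


titles = ["finnish", "english", "swedish"]

# Fed the human-readable sortkey, each page gets its own letter.
good_headers = [first_letter_header(t) for t in titles]

# Fed the binary sortkey (forced into a string), every page lands
# under the same unrelated character.
bad_headers = [
    first_letter_header(toy_binary_sortkey(t).decode("latin-1")) for t in titles
]

print(good_headers)
print(bad_headers)
```

In this toy, every binary key starts with the same byte, so every page ends up under one meaningless header, which is the same shape of failure as everything being categorized under ۞.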

Actually, sortkeys with a leading space are still sorting under U+06DE. Presumably all non-graphical characters sort under that, and previously the double-encoded sortkey started with a non-graphical character. Anyway, splitting that off into a different bug (bug 28545), since it's really a separate issue from everything sorting under the wrong header.