
Sort order in Special:ListUsers should be case-insensitive
Open, Low, Public

Description

Author: gnu1742

Description:
Special:ListUsers currently sorts users in this way:
AAA, AAB, ..., ABC, ..., AZZZZZZZZZZZ,AAa,AAb,...AZZZZZZZZZZZZz,...,AZzzzzzzzz,...,Aaa,...
In short: the aaa usernames, which differ only in upper-/lowercase letters, appear in 2^3=8 different places in the log.

It should be
AAA,AAa,AaA,Aaa,...
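
The difference between the two orderings can be illustrated with a small Python sketch. This is not MediaWiki code and the sample names are made up; it only shows that a plain code-point sort puts every uppercase letter before every lowercase one, while a case-folded sort key keeps the variants of one name together.

```python
# Hypothetical sample usernames; within each group only the casing differs.
names = ["Aaa", "ABC", "AAa", "AaA", "AAA", "Abc"]

# Code-point (binary) order: 'A'-'Z' all sort before 'a'-'z'.
binary_order = sorted(names)

# Case-insensitive order: compare on the case-folded name first, then use
# the original spelling as a tie-breaker so the result is deterministic.
folded_order = sorted(names, key=lambda n: (n.casefold(), n))

print(binary_order)  # ['AAA', 'AAa', 'ABC', 'AaA', 'Aaa', 'Abc']
print(folded_order)  # ['AAA', 'AAa', 'AaA', 'Aaa', 'ABC', 'Abc']
```

In the first list "ABC" splits the aaa variants into two groups; in the second they are adjacent.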

Because the ranking depends on upper-/lowercase letters, it is nearly impossible for oversighters to search efficiently for libelous usernames: the vandal only has to create the same name over and over again with different combinations of upper- and lowercase letters, and the bureaucrat/oversighter has to search in dozens of places in the logfiles.


Version: unspecified
Severity: enhancement
See Also:
https://bugzilla.wikimedia.org/show_bug.cgi?id=26396

Details

Reference
bz24574

Event Timeline

bzimport raised the priority of this task from to Low. Nov 21 2014, 11:08 PM
bzimport set Reference to bz24574.
bzimport added a subscriber: Unknown Object (MLST).

This looks like a duplicate of the collation bug 164, because it changes the sort order of the items. The current behaviour is the same as for Categories etc.

(In reply to comment #1)

This looks like a duplicate of the collation bug 164, because it changes the
sort order of the items. The current behaviour is the same as for Categories
etc.

I would call them separate bugs, since the solution currently being worked on for category pages will not fix this bug. (afaik)

They are indeed separate bugs, but fixing this one would require disproportionate resources for marginal benefit. Suggest WONTFIX.

gnu1742 wrote:

I doubt that the benefit is marginal: it would make it a lot easier for oversighters/stewards to look for libelous/harassing usernames and get rid of them.

Could you please elaborate on why fixing this requires 'disproportionate resources'?
I do not know the actual implementation in MediaWiki, but in every programming language I recently worked in there are sort algorithms which do not make a difference between 'A' and 'a', so I guess there should be one for PHP.

Anyway: people don't expect this sorting order. They are used to the order they know from phone books or printed encyclopedias.

(In reply to comment #4)

I do not know the actual implementation in MediaWiki, but in every programming
language I recently worked in there are sort algorithms which do not make a
difference between 'A' and 'a', so I guess there should be one for PHP.

Yes, you can sort case-insensitively in PHP just fine, but you can't (efficiently) do so in MySQL. Like with the category thing, we'd have to add a new column to hold the 'normalized' username and sort by that instead. For English 'normalized' can just be all lowercase, but for other languages you'll want to sort accented characters in all sorts of interesting ways.
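
A minimal sketch of the extra-column idea, using Python's built-in `sqlite3` module as a stand-in for MySQL. The table name `users`, the column `user_name_sortkey`, and the `normalize` helper are assumptions made for illustration, not existing MediaWiki schema or code. The point is that the database can then order by an indexed, precomputed key instead of transforming every name at query time.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical schema: user_name_sortkey is an assumed extra column that
# stores the normalized form of the name.
conn.execute(
    "CREATE TABLE users ("
    " user_name TEXT PRIMARY KEY,"
    " user_name_sortkey TEXT NOT NULL)"
)
# Index on the sort key so ORDER BY can walk the index.
conn.execute(
    "CREATE INDEX users_sortkey ON users (user_name_sortkey, user_name)"
)


def normalize(name: str) -> str:
    # Lowercasing is enough for English; other languages would need a
    # proper collation key here.
    return name.lower()


for name in ["AAA", "ABC", "AAa", "Aaa", "AaA"]:
    conn.execute("INSERT INTO users VALUES (?, ?)", (name, normalize(name)))

rows = conn.execute(
    "SELECT user_name FROM users"
    " ORDER BY user_name_sortkey, user_name LIMIT 50"
).fetchall()
listed = [row[0] for row in rows]
print(listed)  # ['AAA', 'AAa', 'AaA', 'Aaa', 'ABC']
```

The cost is that the key has to be filled in for every existing row and kept up to date whenever a user is created or renamed.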

gnu1742 wrote:

OK, didn't know that the sorting was done in the database. Thanks for the update.
So probably a regexp search on the usernames could be helpful. Do you know if something like this exists (maybe even as an external tool)?

(In reply to comment #6)

OK, didn't know that the sorting was done in the database. Thanks for the
update.
So probably a regexp search on the usernames could be helpful. Do you know if
something like this exists (maybe even as an external tool)?

Regexp searches can't be done 'internally', as you might have guessed. It doesn't exist as an external tool AFAIK, but it shouldn't be too hard to write a toolserver tool that executes wildcard queries (SQL does allow those, they're just kinda slow in most cases) on the database or uses a dump of all user names to run regexes on.
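
The dump-based variant could look roughly like the following Python sketch. It assumes the usernames are already available as a list of strings (for example, read from a dump file with one name per line); the function name `find_usernames` and the sample data are hypothetical.

```python
import re


def find_usernames(usernames, pattern):
    """Return the names that match `pattern`, ignoring case."""
    regex = re.compile(pattern, re.IGNORECASE)
    return [name for name in usernames if regex.search(name)]


# Hypothetical dump contents.
dump = ["AAA", "ExampleUser", "AaA", "Aaa123", "Bob"]

matches = find_usernames(dump, r"^aaa")
print(matches)  # ['AAA', 'AaA', 'Aaa123']
```

Because the matching is case-insensitive, one pattern finds every upper-/lowercase variant of a name, wherever it appears in the sorted list.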

(In reply to comment #2)

(In reply to comment #1)

This looks like a duplicate of the collation bug 164, because it changes the
sort order of the items. The current behaviour is the same as for Categories
etc.

I would call them separate bugs, since the solution currently being worked on
for category pages will not fix this bug. (afaik)

Did it?
I also wonder how this interacts with bug 26396.