
Sort IP addresses by their numerical value in the Special:Investigate Compare results table
Closed, Resolved · Public

Description

We should sort IP addresses by their numerical value instead of their prettified string value. For example (based on T237300#6215030):

  • string sorting results in this incorrect order: 0:0:0:0:0:0:0:1 < 2a06:f500:1714:e8ac:e97:5d42:de10:989e < 192.168.121.1
  • numerical sorting (with IPv6 prefixing) results in this correct order: 192.168.121.1 < 0:0:0:0:0:0:0:1 < 2a06:f500:1714:e8ac:e97:5d42:de10:989e

The tablesorter will sort table cells by their data-sort-value attribute if present, so we can set that attribute to contain a hexadecimal representation of the IP address. We can use IPUtils::toHex, which prefixes IPv6 addresses so they sort after IPv4 addresses.
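As a rough illustration of the idea above, here is a minimal Python sketch of a hex sort key. It is not the CheckUser or IPUtils code: the helper name ip_sort_key is hypothetical, and the output format (8 uppercase hex digits for IPv4, a "v6-" prefix plus 32 hex digits for IPv6) is an assumption about what IPUtils::toHex produces.

```python
import ipaddress


def ip_sort_key(ip: str) -> str:
    """Return a fixed-width hex key for an IP address (hypothetical helper).

    Assumed format, modelled on IPUtils::toHex:
      IPv4 -> 8 uppercase hex digits
      IPv6 -> "v6-" followed by 32 uppercase hex digits
    """
    addr = ipaddress.ip_address(ip)
    if addr.version == 4:
        return format(int(addr), "08X")
    return "v6-" + format(int(addr), "032X")


ips = [
    "0:0:0:0:0:0:0:1",
    "2a06:f500:1714:e8ac:e97:5d42:de10:989e",
    "192.168.121.1",
]

# Plain string comparison of the keys: hex digits sort before "v",
# so every IPv4 key sorts before every IPv6 key.
sorted_ips = sorted(ips, key=ip_sort_key)
for ip in sorted_ips:
    print(ip_sort_key(ip), ip)
```

Because the keys are fixed width within each address family, comparing them as plain strings gives the same order as comparing the addresses numerically, provided the sorter really does treat them as strings.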

Acceptance criteria:

  • IPv4 addresses sort as smaller than IPv6 addresses

Note that IP addresses will not sort correctly until after T255693 is solved.

Event Timeline

Change 605330 had a related patch set uploaded (by Tchanders; owner: Tchanders):
[mediawiki/extensions/CheckUser@master] ComparePager: Add sort values to the results for the tablesorter

https://gerrit.wikimedia.org/r/605330

As discussed with @dbarratt, this is replacing T255694 in our sprint. The work is the same, but filing it as this task is easier to follow.

Change 605330 merged by jenkins-bot:
[mediawiki/extensions/CheckUser@master] ComparePager: Add sort values to the results for the tablesorter

https://gerrit.wikimedia.org/r/605330

Note that IPv6 addresses will not sort correctly until after T255693 is solved.

@Tchanders I think this also applies to IPv4, e.g.:

ip_sorting_now.png (272×431 px, 6 KB)

For the same reasons as T255693? C0A9A8E3 < C0A9A331 because 8 < 331?
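The suspected behaviour can be reproduced with a small Python sketch. This assumes a "natural sort" style comparison that splits each value into runs of digits and non-digits; that is only a guess at what is happening and not the tablesorter's actual code. The helper name natural_key is hypothetical.

```python
import re


def natural_key(value: str):
    """Split into digit and non-digit runs; digit runs compare as integers."""
    parts = re.findall(r"\d+|\D+", value)
    # Tag each part so integers and strings are never compared directly.
    return [(0, int(p), "") if p.isdigit() else (1, 0, p) for p in parts]


hex_values = ["C0A9A331", "C0A9A8E3"]

# Natural sort sees "...A", 8 versus "...A", 331 and puts 8 first.
natural_order = sorted(hex_values, key=natural_key)

# Treating the whole value as one hexadecimal number gives the other order.
numeric_order = sorted(hex_values, key=lambda v: int(v, 16))

print(natural_order)
print(numeric_order)
```

Under that assumption, the two orders disagree for exactly this pair, which is consistent with the screenshot.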

I am also not sure about user agent sorting:

ua_sorting_now.png (192×1 px, 16 KB)

Other sorting functions I have tried sort those strings the other way round.

But at this stage I'm so confused I'm not sure what is real anymore :)

@dom_walden You're right: of course, IPv4 addresses are converted to hex too. I've updated the task description accordingly.

As for the user agent sorting, I wonder if that's less of a concern, since it's less obvious what the correct way round is? I.e. numbers have a clear order, whereas there are more ways to define the order of strings of letters, numbers and punctuation...

Cells in the Compare table now have their data-sort-value set to the username, IP, user agent or timestamp (as appropriate for the column).

For the IP column, this is set to the hex value of the IP, so it may sort weirdly, as Thalia notes in T257349#6289600.

I will wait for T255693 to test this properly.

As for the user agent sorting, I wonder if that's less of a concern, since it's less obvious what the correct way round is? I.e. numbers have a clear order, whereas there are more ways to define the order of strings of letters, numbers and punctuation...

This sounds reasonable to me. For IPs it would seem desirable to have them in a "proper" order (e.g. so you can find common IP ranges). This doesn't seem necessary for user agents.