Page MenuHomePhabricator

Investigate wrong entry order in the user contributions list
Closed, ResolvedPublic5 Story Points

Description

This task is to investigate the wrong order of entries in the contributions list. See parent ticket. Might be related to T46841 .

Event Timeline

Aklapper changed the edit policy from "Custom Policy" to "All Users".Sep 17 2018, 5:47 PM
Aklapper changed Risk Rating from N/A to default.

The reason for the bad order can be seen here: https://phabricator.wikimedia.org/source/mediawiki/browse/master/includes/specials/pagers/ContribsPager.php$358.

Special:Contributions uses the ContribsPager mentioned above. The SQL query ends doing a trivial ORDER BY rev_timestamp (or ORDER BY ipc_rev_timestamp for edits made by IP contributors). Since the timestamps are not guaranteed to be unique, the order of two entries with the exact same timestamp is effectively random. This is what we see in the example https://commons.wikimedia.org/w/index.php?limit=50&title=Special%3AContributions&contribs=user&target=4nn1l2&namespace=6&start=2018-06-26&end=2018-06-26 (the description page of "File:Modarres Expressway, Tehran 20110912.JPG" is edited first, and then imported).

Possible ways forward:

  • ContribsPager is technically badly implemented. The documentation of IndexPager::getIndexField states "it's really not a good idea to use a non-unique index for this! That won't page right." But this is what ContribsPager does. Timestamps aren't unique.
    • IndexPager::getIndexField can not return more than one field. The code mentions an array feature, but this is for different sort orders.
    • The only truly unique field on the revision table is rev_id (and ipc_rev_id on the ip_changes table). However, I don't think it's possible to ORDER BY this ID as it is not really guaranteed to be in increasing order, as far as I know.
  • ContribsPager currently does not implement IndexPager::getExtraSortFields. It might be possible to use this to provide rev_id as a fall-back ORDER BY key.
    • I wondered if the same if-else as in getIndexField is needed, choosing between ipc_rev_id and rev_id. Answer: No. The rev_id field is always available. The ipc_rev_id field is only available when a JOIN with ip_changes is made. This join is performed on 'LEFT JOIN', [ 'ipc_rev_id = rev_id' ], which means the two fields behave entirely identical within this query.

Change 480081 had a related patch set uploaded (by Thiemo Kreuz (WMDE); owner: Thiemo Kreuz (WMDE)):
[mediawiki/core@master] Fix order on Special:Contributions when timestamps are identical

https://gerrit.wikimedia.org/r/480081

thiemowmde triaged this task as Low priority.
thiemowmde moved this task from Sprint Backlog to Review on the WMDE-QWERTY-Sprint-2018-12-04 board.
Anomie added a subscriber: Anomie.Dec 18 2018, 3:09 PM
  • I wondered if the same if-else as in getIndexField is needed, choosing between ipc_rev_id and rev_id. Answer: No. The rev_id field is always available. The ipc_rev_id field is only available when a JOIN with ip_changes is made. This join is performed on 'LEFT JOIN', [ 'ipc_rev_id = rev_id' ], which means the two fields behave entirely identical within this query.

They don't behave identically, although I'm not sure whether the difference actually matters much since it filesorts either way. Ordering by ipc_rev_id rather than rev_id avoids using a temporary and touches fewer rows.

There's no avoiding the filesort, BTW, unless you want to sort the results by IP then timestamp instead of only by timestamp.

ORDER BY rev_id
wikiadmin@10.64.0.92(enwiki)> explain SELECT  rev_id,rev_page,rev_timestamp,rev_minor_edit,rev_deleted,rev_len,rev_parent_id,rev_sha1,comment_rev_comment.comment_text AS `rev_comment_text`,comment_rev_comment.comment_data AS `rev_comment_data`,comment_rev_comment.comment_id AS `rev_comment_cid`,rev_user,rev_user_text,NULL AS `rev_actor`,page_namespace,page_title,page_id,page_latest,page_is_redirect,page_len,user_name,page_is_new,ipc_rev_timestamp,(SELECT  GROUP_CONCAT(ctd_name SEPARATOR ',')  FROM `change_tag` INNER JOIN `change_tag_def` ON ((ct_tag_id=ctd_id))   WHERE ct_rev_id=rev_id  ) AS `ts_tags`  FROM `revision` JOIN `revision_comment_temp` `temp_rev_comment` ON ((temp_rev_comment.revcomment_rev = rev_id)) JOIN `comment` `comment_rev_comment` ON ((comment_rev_comment.comment_id = temp_rev_comment.revcomment_comment_id)) INNER JOIN `page` ON ((page_id = rev_page)) LEFT JOIN `user` ON ((rev_user != 0) AND (user_id = rev_user)) LEFT JOIN `ip_changes` ON ((ipc_rev_id = rev_id))   WHERE (ipc_hex BETWEEN '7F000000' AND '7F0000FF') AND ((rev_deleted & 12) != 12)  ORDER BY ipc_rev_timestamp DESC, rev_id DESC LIMIT 51;
+------+--------------------+---------------------+--------+-----------------------------------------------------------------------+-----------------------+---------+-----------------------------------------------+------+-----------------------------------------------------------+
| id   | select_type        | table               | type   | possible_keys                                                         | key                   | key_len | ref                                           | rows | Extra                                                     |
+------+--------------------+---------------------+--------+-----------------------------------------------------------------------+-----------------------+---------+-----------------------------------------------+------+-----------------------------------------------------------+
|    1 | PRIMARY            | ip_changes          | range  | PRIMARY,ipc_hex_time                                                  | ipc_hex_time          | 37      | NULL                                          |  144 | Using where; Using index; Using temporary; Using filesort |
|    1 | PRIMARY            | revision            | eq_ref | PRIMARY,page_timestamp,page_user_timestamp,rev_page_id                | PRIMARY               | 4       | enwiki.ip_changes.ipc_rev_id                  |    1 | Using where                                               |
|    1 | PRIMARY            | page                | eq_ref | PRIMARY                                                               | PRIMARY               | 4       | enwiki.revision.rev_page                      |    1 |                                                           |
|    1 | PRIMARY            | user                | eq_ref | PRIMARY                                                               | PRIMARY               | 4       | enwiki.revision.rev_user                      |    1 | Using where                                               |
|    1 | PRIMARY            | temp_rev_comment    | ref    | PRIMARY,revcomment_rev                                                | PRIMARY               | 4       | enwiki.ip_changes.ipc_rev_id                  |    1 | Using index                                               |
|    1 | PRIMARY            | comment_rev_comment | eq_ref | PRIMARY                                                               | PRIMARY               | 8       | enwiki.temp_rev_comment.revcomment_comment_id |    1 |                                                           |
|    2 | DEPENDENT SUBQUERY | change_tag          | ref    | change_tag_rev_tag_id,change_tag_tag_id_id,change_tag_rev_tag_nonuniq | change_tag_rev_tag_id | 5       | enwiki.revision.rev_id                        |    1 | Using where; Using index                                  |
|    2 | DEPENDENT SUBQUERY | change_tag_def      | eq_ref | PRIMARY                                                               | PRIMARY               | 4       | enwiki.change_tag.ct_tag_id                   |    1 |                                                           |
+------+--------------------+---------------------+--------+-----------------------------------------------------------------------+-----------------------+---------+-----------------------------------------------+------+-----------------------------------------------------------+
8 rows in set (0.00 sec)

wikiadmin@10.64.0.92(enwiki)> pager > /dev/null; SELECT  rev_id,rev_page,rev_timestamp,rev_minor_edit,rev_deleted,rev_len,rev_parent_id,rev_sha1,comment_rev_comment.comment_text AS `rev_comment_text`,comment_rev_comment.comment_data AS `rev_comment_data`,comment_rev_comment.comment_id AS `rev_comment_cid`,rev_user,rev_user_text,NULL AS `rev_actor`,page_namespace,page_title,page_id,page_latest,page_is_redirect,page_len,user_name,page_is_new,ipc_rev_timestamp,(SELECT  GROUP_CONCAT(ctd_name SEPARATOR ',')  FROM `change_tag` INNER JOIN `change_tag_def` ON ((ct_tag_id=ctd_id))   WHERE ct_rev_id=rev_id  ) AS `ts_tags`  FROM `revision` JOIN `revision_comment_temp` `temp_rev_comment` ON ((temp_rev_comment.revcomment_rev = rev_id)) JOIN `comment` `comment_rev_comment` ON ((comment_rev_comment.comment_id = temp_rev_comment.revcomment_comment_id)) INNER JOIN `page` ON ((page_id = rev_page)) LEFT JOIN `user` ON ((rev_user != 0) AND (user_id = rev_user)) LEFT JOIN `ip_changes` ON ((ipc_rev_id = rev_id))   WHERE (ipc_hex BETWEEN '7F000000' AND '7F0000FF') AND ((rev_deleted & 12) != 12)  ORDER BY ipc_rev_timestamp DESC, rev_id DESC LIMIT 51; nopager; SHOW STATUS LIKE 'Hand%';
PAGER set to '> /dev/null'
51 rows in set (0.01 sec)

PAGER set to stdout
+----------------------------+-------+
| Variable_name              | Value |
+----------------------------+-------+
| Handler_commit             | 2     |
| Handler_delete             | 0     |
| Handler_discover           | 0     |
| Handler_external_lock      | 0     |
| Handler_icp_attempts       | 0     |
| Handler_icp_match          | 0     |
| Handler_mrr_init           | 0     |
| Handler_mrr_key_refills    | 0     |
| Handler_mrr_rowid_refills  | 0     |
| Handler_prepare            | 0     |
| Handler_read_first         | 0     |
| Handler_read_key           | 714   |
| Handler_read_last          | 0     |
| Handler_read_next          | 322   |
| Handler_read_prev          | 0     |
| Handler_read_retry         | 0     |
| Handler_read_rnd           | 51    |
| Handler_read_rnd_deleted   | 0     |
| Handler_read_rnd_next      | 144   |
| Handler_rollback           | 0     |
| Handler_savepoint          | 0     |
| Handler_savepoint_rollback | 0     |
| Handler_tmp_update         | 0     |
| Handler_tmp_write          | 143   |
| Handler_update             | 0     |
| Handler_write              | 0     |
+----------------------------+-------+
26 rows in set (0.16 sec)
ORDER BY ipc_rev_id
wikiadmin@10.64.0.92(enwiki)> explain SELECT  rev_id,rev_page,rev_timestamp,rev_minor_edit,rev_deleted,rev_len,rev_parent_id,rev_sha1,comment_rev_comment.comment_text AS `rev_comment_text`,comment_rev_comment.comment_data AS `rev_comment_data`,comment_rev_comment.comment_id AS `rev_comment_cid`,rev_user,rev_user_text,NULL AS `rev_actor`,page_namespace,page_title,page_id,page_latest,page_is_redirect,page_len,user_name,page_is_new,ipc_rev_timestamp,(SELECT  GROUP_CONCAT(ctd_name SEPARATOR ',')  FROM `change_tag` INNER JOIN `change_tag_def` ON ((ct_tag_id=ctd_id))   WHERE ct_rev_id=rev_id  ) AS `ts_tags`  FROM `revision` JOIN `revision_comment_temp` `temp_rev_comment` ON ((temp_rev_comment.revcomment_rev = rev_id)) JOIN `comment` `comment_rev_comment` ON ((comment_rev_comment.comment_id = temp_rev_comment.revcomment_comment_id)) INNER JOIN `page` ON ((page_id = rev_page)) LEFT JOIN `user` ON ((rev_user != 0) AND (user_id = rev_user)) LEFT JOIN `ip_changes` ON ((ipc_rev_id = rev_id))   WHERE (ipc_hex BETWEEN '7F000000' AND '7F0000FF') AND ((rev_deleted & 12) != 12)  ORDER BY ipc_rev_timestamp DESC, ipc_rev_id DESC LIMIT 51;
+------+--------------------+---------------------+--------+-----------------------------------------------------------------------+-----------------------+---------+-----------------------------------------------+------+------------------------------------------+
| id   | select_type        | table               | type   | possible_keys                                                         | key                   | key_len | ref                                           | rows | Extra                                    |
+------+--------------------+---------------------+--------+-----------------------------------------------------------------------+-----------------------+---------+-----------------------------------------------+------+------------------------------------------+
|    1 | PRIMARY            | ip_changes          | range  | PRIMARY,ipc_hex_time                                                  | ipc_hex_time          | 37      | NULL                                          |  144 | Using where; Using index; Using filesort |
|    1 | PRIMARY            | revision            | eq_ref | PRIMARY,page_timestamp,page_user_timestamp,rev_page_id                | PRIMARY               | 4       | enwiki.ip_changes.ipc_rev_id                  |    1 | Using where                              |
|    1 | PRIMARY            | page                | eq_ref | PRIMARY                                                               | PRIMARY               | 4       | enwiki.revision.rev_page                      |    1 |                                          |
|    1 | PRIMARY            | user                | eq_ref | PRIMARY                                                               | PRIMARY               | 4       | enwiki.revision.rev_user                      |    1 | Using where                              |
|    1 | PRIMARY            | temp_rev_comment    | ref    | PRIMARY,revcomment_rev                                                | PRIMARY               | 4       | enwiki.ip_changes.ipc_rev_id                  |    1 | Using index                              |
|    1 | PRIMARY            | comment_rev_comment | eq_ref | PRIMARY                                                               | PRIMARY               | 8       | enwiki.temp_rev_comment.revcomment_comment_id |    1 |                                          |
|    2 | DEPENDENT SUBQUERY | change_tag          | ref    | change_tag_rev_tag_id,change_tag_tag_id_id,change_tag_rev_tag_nonuniq | change_tag_rev_tag_id | 5       | enwiki.revision.rev_id                        |    1 | Using where; Using index                 |
|    2 | DEPENDENT SUBQUERY | change_tag_def      | eq_ref | PRIMARY                                                               | PRIMARY               | 4       | enwiki.change_tag.ct_tag_id                   |    1 |                                          |
+------+--------------------+---------------------+--------+-----------------------------------------------------------------------+-----------------------+---------+-----------------------------------------------+------+------------------------------------------+
8 rows in set (0.00 sec)

wikiadmin@10.64.0.92(enwiki)> pager > /dev/null; SELECT  rev_id,rev_page,rev_timestamp,rev_minor_edit,rev_deleted,rev_len,rev_parent_id,rev_sha1,comment_rev_comment.comment_text AS `rev_comment_text`,comment_rev_comment.comment_data AS `rev_comment_data`,comment_rev_comment.comment_id AS `rev_comment_cid`,rev_user,rev_user_text,NULL AS `rev_actor`,page_namespace,page_title,page_id,page_latest,page_is_redirect,page_len,user_name,page_is_new,ipc_rev_timestamp,(SELECT  GROUP_CONCAT(ctd_name SEPARATOR ',')  FROM `change_tag` INNER JOIN `change_tag_def` ON ((ct_tag_id=ctd_id))   WHERE ct_rev_id=rev_id  ) AS `ts_tags`  FROM `revision` JOIN `revision_comment_temp` `temp_rev_comment` ON ((temp_rev_comment.revcomment_rev = rev_id)) JOIN `comment` `comment_rev_comment` ON ((comment_rev_comment.comment_id = temp_rev_comment.revcomment_comment_id)) INNER JOIN `page` ON ((page_id = rev_page)) LEFT JOIN `user` ON ((rev_user != 0) AND (user_id = rev_user)) LEFT JOIN `ip_changes` ON ((ipc_rev_id = rev_id))   WHERE (ipc_hex BETWEEN '7F000000' AND '7F0000FF') AND ((rev_deleted & 12) != 12)  ORDER BY ipc_rev_timestamp DESC, ipc_rev_id DESC LIMIT 51; nopager; SHOW STATUS LIKE 'Hand%';
PAGER set to '> /dev/null'
51 rows in set (0.00 sec)

PAGER set to stdout
+----------------------------+-------+
| Variable_name              | Value |
+----------------------------+-------+
| Handler_commit             | 2     |
| Handler_delete             | 0     |
| Handler_discover           | 0     |
| Handler_external_lock      | 0     |
| Handler_icp_attempts       | 0     |
| Handler_icp_match          | 0     |
| Handler_mrr_init           | 0     |
| Handler_mrr_key_refills    | 0     |
| Handler_mrr_rowid_refills  | 0     |
| Handler_prepare            | 0     |
| Handler_read_first         | 0     |
| Handler_read_key           | 278   |
| Handler_read_last          | 0     |
| Handler_read_next          | 222   |
| Handler_read_prev          | 0     |
| Handler_read_retry         | 0     |
| Handler_read_rnd           | 0     |
| Handler_read_rnd_deleted   | 0     |
| Handler_read_rnd_next      | 0     |
| Handler_rollback           | 0     |
| Handler_savepoint          | 0     |
| Handler_savepoint_rollback | 0     |
| Handler_tmp_update         | 0     |
| Handler_tmp_write          | 0     |
| Handler_update             | 0     |
| Handler_write              | 0     |
+----------------------------+-------+
26 rows in set (0.15 sec)

I'm not very good in reading these EXPLAIN outputs. Can you help us understand what the consequence of what we see in there might be?

What I meant with "behaving identical" is that there is no way the results can differ, because the two fields are by definition identical.

My assumption was that the bulk of what the ORDER BY needs to do will still be done based on the rev_timestamp (or ipc_rev_timestamp) column. Both have an index! Only rows with identical timestamps need to be ordered by the ID column. I expected the query optimizer to figure this out. If not, is there a way to give it a hint to do exactly that?

I noticed both of the queries listed above contain a JOIN with ip_changes. I wonder how the query behaves without this JOIN? (It must use rev_id then.) Are you able to provide that third EXPLAIN output?

I'm not very good in reading these EXPLAIN outputs. Can you help us understand what the consequence of what we see in there might be?

In this case, the only difference in the EXPLAIN is the "using temporary" annotation. This means that MySQL MariaDB has to collect the result rows (i.e. all the selected columns) into a temporary table in order to sort them. When ordering by ipc_rev_id instead, it can manage to do the sorting without having to collect the whole row first.

You can get more details about the EXPLAIN output at https://dev.mysql.com/doc/refman/8.0/en/explain-output.html.

I also posted the output of SHOW STATUS LIKE 'Hand%' after executing both versions of the query, which gives us some insight as to how the specific query was actually processed. The numbers are complicated a bit by the joins of all the other tables, but we can see that the rev_id version did write 143 rows to the temporary table (Handler_tmp_write) and we also see higher numbers for some of the other properties as well. Since there are only 144 rows total that match this query, that's a little concerning and we might want to try it again with ranges that have more edits to see how it behaves.

For 85.0.0.0/16, which has 5520 edits on enwiki, the relevant bits are:

Variable_namerev_idipc_rev_id
Handler_read_key24363223
Handler_read_next112645570
Handler_read_rnd510
Handler_read_rnd_next55210
Handler_tmp_write55200

It looks like ordering by ipc_rev_id is able to go through just the ip_changes.ipc_hex_time index to identify the rows it will need to return, and then pretty efficiently pull out just the rows it needs from the other tables. While ordering by rev_id it's pulling the data from all the tables for all 5520 rows to put into that temporary table. With warm caches, the ipc_rev_id ordering returned in 0.02 seconds while the rev_id ordering took 0.30 seconds.

BTW, you don't necessarily have to change the ORDER BY clause to fix this. ORDER BY uses result field aliases from the SELECT in preference to fields from the underlying tables, so if you change the query to SELECT ipc_rev_id AS rev_id, ... ORDER BY ipc_rev_timestamp DESC, rev_id DESC LIMIT 51 it'll actually do the more efficient plan ordering by ipc_rev_id.

What I meant with "behaving identical" is that there is no way the results can differ, because the two fields are by definition identical.

Yes, they give identical results. The performance behavior differs though.

My assumption was that the bulk of what the ORDER BY needs to do will still be done based on the rev_timestamp (or ipc_rev_timestamp) column. Both have an index! Only rows with identical timestamps need to be ordered by the ID column. I expected the query optimizer to figure this out. If not, is there a way to give it a hint to do exactly that?

The index on (rev_timestamp) doesn't generally matter here, as to use that it would have to start going through all the edits to find the few made by the user of interest. It's much faster to use the index on (rev_user,rev_timestamp) or (rev_user_text,rev_timestamp) (or, soon, (rev_actor,rev_timestamp)) to look through only the edits by the user of interest, in order by timestamp.

In the ip_changes case the index on (ipc_rev_timestamp) similarly doesn't matter. It's still faster to use the (ipc_hex,ipc_rev_timestamp) index, even though it has to pull all the rows for all the IPs in the range and re-sort them by timestamp to find the ones.

BTW, with InnoDB tables the table's primary key is implicitly appended to every other index, so that index on ip_changes is really (ipc_hex,ipc_rev_timestamp,ipc_rev_id) and it can take advantage of that if your ORDER BY matches. But for good and bad reasons it doesn't match when you're ordering by rev_id rather than ipc_rev_id, resulting in the different query behavior.

I noticed both of the queries listed above contain a JOIN with ip_changes. I wonder how the query behaves without this JOIN? (It must use rev_id then.) Are you able to provide that third EXPLAIN output?

If you don't search on an IP range, then it doesn't use ip_changes at all and things look like this:

wikiadmin@10.64.0.92(enwiki)> explain SELECT  rev_id,rev_page,rev_timestamp,rev_minor_edit,rev_deleted,rev_len,rev_parent_id,rev_sha1,comment_rev_comment.comment_text AS `rev_comment_text`,comment_rev_comment.comment_data AS `rev_comment_data`,comment_rev_comment.comment_id AS `rev_comment_cid`,rev_user,rev_user_text,NULL AS `rev_actor`,page_namespace,page_title,page_id,page_latest,page_is_redirect,page_len,user_name,page_is_new,(SELECT  GROUP_CONCAT(ctd_name SEPARATOR ',')  FROM `change_tag` INNER JOIN `change_tag_def` ON ((ct_tag_id=ctd_id))   WHERE ct_rev_id=rev_id  ) AS `ts_tags`  FROM `revision` FORCE INDEX (usertext_timestamp) JOIN `revision_comment_temp` `temp_rev_comment` ON ((temp_rev_comment.revcomment_rev = rev_id)) JOIN `comment` `comment_rev_comment` ON ((comment_rev_comment.comment_id = temp_rev_comment.revcomment_comment_id)) INNER JOIN `page` ON ((page_id = rev_page)) LEFT JOIN `user` ON ((rev_user != 0) AND (user_id = rev_user))   WHERE ((rev_user_text = '127.0.0.1')) AND ((rev_deleted & 12) != 12)  ORDER BY rev_timestamp DESC, rev_id DESC LIMIT 51;
+------+--------------------+---------------------+--------+-----------------------------------------------------------------------+-----------------------+---------+-----------------------------------------------+------+--------------------------+
| id   | select_type        | table               | type   | possible_keys                                                         | key                   | key_len | ref                                           | rows | Extra                    |
+------+--------------------+---------------------+--------+-----------------------------------------------------------------------+-----------------------+---------+-----------------------------------------------+------+--------------------------+
|    1 | PRIMARY            | revision            | ref    | usertext_timestamp                                                    | usertext_timestamp    | 257     | const                                         |  144 | Using where              |
|    1 | PRIMARY            | user                | eq_ref | PRIMARY                                                               | PRIMARY               | 4       | enwiki.revision.rev_user                      |    1 | Using where              |
|    1 | PRIMARY            | page                | eq_ref | PRIMARY                                                               | PRIMARY               | 4       | enwiki.revision.rev_page                      |    1 |                          |
|    1 | PRIMARY            | temp_rev_comment    | ref    | PRIMARY,revcomment_rev                                                | PRIMARY               | 4       | enwiki.revision.rev_id                        |    1 | Using index              |
|    1 | PRIMARY            | comment_rev_comment | eq_ref | PRIMARY                                                               | PRIMARY               | 8       | enwiki.temp_rev_comment.revcomment_comment_id |    1 |                          |
|    2 | DEPENDENT SUBQUERY | change_tag          | ref    | change_tag_rev_tag_id,change_tag_tag_id_id,change_tag_rev_tag_nonuniq | change_tag_rev_tag_id | 5       | enwiki.revision.rev_id                        |    1 | Using where; Using index |
|    2 | DEPENDENT SUBQUERY | change_tag_def      | eq_ref | PRIMARY                                                               | PRIMARY               | 4       | enwiki.change_tag.ct_tag_id                   |    1 |                          |
+------+--------------------+---------------------+--------+-----------------------------------------------------------------------+-----------------------+---------+-----------------------------------------------+------+--------------------------+
7 rows in set (0.00 sec)

wikiadmin@10.64.0.92(enwiki)> pager > /dev/null; SELECT  rev_id,rev_page,rev_timestamp,rev_minor_edit,rev_deleted,rev_len,rev_parent_id,rev_sha1,comment_rev_comment.comment_text AS `rev_comment_text`,comment_rev_comment.comment_data AS `rev_comment_data`,comment_rev_comment.comment_id AS `rev_comment_cid`,rev_user,rev_user_text,NULL AS `rev_actor`,page_namespace,page_title,page_id,page_latest,page_is_redirect,page_len,user_name,page_is_new,(SELECT  GROUP_CONCAT(ctd_name SEPARATOR ',')  FROM `change_tag` INNER JOIN `change_tag_def` ON ((ct_tag_id=ctd_id))   WHERE ct_rev_id=rev_id  ) AS `ts_tags`  FROM `revision` FORCE INDEX (usertext_timestamp) JOIN `revision_comment_temp` `temp_rev_comment` ON ((temp_rev_comment.revcomment_rev = rev_id)) JOIN `comment` `comment_rev_comment` ON ((comment_rev_comment.comment_id = temp_rev_comment.revcomment_comment_id)) INNER JOIN `page` ON ((page_id = rev_page)) LEFT JOIN `user` ON ((rev_user != 0) AND (user_id = rev_user))   WHERE ((rev_user_text = '127.0.0.1')) AND ((rev_deleted & 12) != 12)  ORDER BY rev_timestamp DESC, rev_id DESC LIMIT 51; nopager; SHOW STATUS LIKE 'Hand%';
PAGER set to '> /dev/null'
51 rows in set (0.00 sec)

PAGER set to stdout
+----------------------------+-------+
| Variable_name              | Value |
+----------------------------+-------+
| Handler_commit             | 2     |
| Handler_delete             | 0     |
| Handler_discover           | 0     |
| Handler_external_lock      | 0     |
| Handler_icp_attempts       | 0     |
| Handler_icp_match          | 0     |
| Handler_mrr_init           | 0     |
| Handler_mrr_key_refills    | 0     |
| Handler_mrr_rowid_refills  | 0     |
| Handler_prepare            | 0     |
| Handler_read_first         | 0     |
| Handler_read_key           | 227   |
| Handler_read_last          | 0     |
| Handler_read_next          | 78    |
| Handler_read_prev          | 50    |
| Handler_read_retry         | 0     |
| Handler_read_rnd           | 0     |
| Handler_read_rnd_deleted   | 0     |
| Handler_read_rnd_next      | 0     |
| Handler_rollback           | 0     |
| Handler_savepoint          | 0     |
| Handler_savepoint_rollback | 0     |
| Handler_tmp_update         | 0     |
| Handler_tmp_write          | 0     |
| Handler_update             | 0     |
| Handler_write              | 0     |
+----------------------------+-------+
26 rows in set (0.15 sec)

There's nothing much interesting to comment on there, beyond that it doesn't even have to filesort since the WHERE constants and ORDER BY exactly match the index on revision that it uses. Also direct comparison between this and the earlier queries is easy to make because all 144 edits in 127.0.0.0/24 do in fact belong to 127.0.0.1.

Change 480762 had a related patch set uploaded (by Thiemo Kreuz (WMDE); owner: Thiemo Kreuz (WMDE)):
[mediawiki/extensions/FileImporter@master] Fix unpredictable history order by enforcing >=1 second difference

https://gerrit.wikimedia.org/r/480762

Change 480762 merged by jenkins-bot:
[mediawiki/extensions/FileImporter@master] Fix unpredictable history order by enforcing >=1 second difference

https://gerrit.wikimedia.org/r/480762

Thanks a lot for the incredibly helpful insight!

[…] if you change the query to SELECT ipc_rev_id AS rev_id, ... ORDER BY ipc_rev_timestamp DESC, rev_id DESC LIMIT 51 it'll actually do the more efficient plan ordering by ipc_rev_id.

I changed https://gerrit.wikimedia.org/r/480081 to do exactly that. What do you think?

You're welcome! I think the patch looks good now, we can finish it up in code review.

Change 480081 merged by jenkins-bot:
[mediawiki/core@master] Fix order on Special:Contributions when timestamps are identical

https://gerrit.wikimedia.org/r/480081

@Lea_WMDE, this is hard to test because the bad order was essentially a random result.

Lea_WMDE closed this task as Resolved.Jan 16 2019, 8:52 AM
Lea_WMDE moved this task from Demo to Done on the WMDE-QWERTY-X-Mas-Sprint-2018-12-18 board.