Page MenuHomePhabricator

Revision Deletion page won't load for too many (10) revisions
Closed, ResolvedPublic

Description

Yesterday, I tried to hide the content of 10 revisions of https://en.wikipedia.org/wiki/User:Cyde/List_of_candidates_for_speedy_deletion/Subpage; I got a database error page three times. Everything worked fine when I did it in 2 loads of 5 each.

Details

Related Gerrit Patches:

Event Timeline

OdMishehu raised the priority of this task from to Needs Triage.
OdMishehu updated the task description. (Show Details)
OdMishehu added a subscriber: OdMishehu.
Restricted Application added a subscriber: Aklapper. · View Herald TranscriptJun 30 2015, 11:12 AM
Krenair added a subscriber: Krenair.
jcrespo added a subscriber: jcrespo.Jul 2 2015, 8:03 AM

hide the content of 10 revisions

Can someone translate the user's action to an URL GET/POST for testing/searching the logs?

Well, what happens is the user (logged in with admin/suppression rights) goes to a page's history and checks 10 boxes, and hits 'Change visibility of selected revisions'. It causes a request like this:
GET /w/index.php?title=Main_Page&action=revisiondelete&type=revision&ids%5B2344%5D=1&ids%5B2343%5D=1&ids%5B2342%5D=1&ids%5B2329%5D=1&ids%5B2328%5D=1&ids%5B2327%5D=1&ids%5B2326%5D=1&ids%5B2325%5D=1&ids%5B2324%5D=1&ids%5B2323%5D=1
And the result is a page with some details about these, more potentially relevant deletion logs, and various options for how to do the deletions. Let's say they suppress revision text and editor's username, but leave the edit summary visible, and give the reason 'example'. They hit 'Apply to selected revisions'. It results in a request like this:
POST /w/index.php?title=Special:RevisionDelete&action=submit
With request data looking something like this:
wpHidePrimary=1&wpHideComment=0&wpHideUser=1&wpHideRestricted=0&wpRevDeleteReasonList=other&wpReason=example&wpSubmit=Apply+to+selected+revisions&wpEditToken=1de26ce2d784f7bb9e7e85bcb82a325c5595adb7%2B%5C&target=Main+Page&type=revision&ids=2344%2C2343%2C2342%2C2329%2C2328%2C2327%2C2326%2C2325%2C2324%2C2323
Or, more nicely formatted:

wpHidePrimary:1
wpHideComment:0
wpHideUser:1
wpHideRestricted:0
wpRevDeleteReasonList:other
wpReason:example
wpSubmit:Apply to selected revisions
wpEditToken:1de26ce2d784f7bb9e7e85bcb82a325c5595adb7+\
target:Main Page
type:revision
ids:2344,2343,2342,2329,2328,2327,2326,2325,2324,2323

Obviously, that token will be completely random and this is just an example from my local dev wiki. Does that answer your question, Jaime? Wouldn't an example DB query be more helpful to you?

x-posting @Legoktm's example query from T111478 (duplicate)

{"exception":"[Exception DBQueryError] (/srv/mediawiki/php-1.26wmf21/includes/db/Database.php:1131) A connection error occured. \nQuery: SELECT  rev_id,rev_page,rev_text_id,rev_timestamp,rev_comment,rev_user_text,rev_user,rev_minor_edit,rev_deleted,rev_len,rev_parent_id,rev_sha1,rev_content_format,rev_content_model,user_name  FROM `revision` INNER JOIN `page` ON ((page_id = rev_page)) LEFT JOIN `user` ON ((rev_user != 0) AND (user_id = rev_user))  WHERE rev_page = '16283969' AND rev_id IN ('679340073','679337192','679336675','679335920','679334110','679334014','679333987','679330491','679330449','679317447')   ORDER BY rev_id DESC \nFunction: RevDelRevisionList::doQuery\nError: 2013 Lost connection to MySQL server during query (10.64.48.28)\n\n[stacktrace]\n#0 /srv/mediawiki/php-1.26wmf21/includes/db/Database.php(1069): DatabaseBase->reportQueryError(string, integer, string, string, boolean)\n#1 /srv/mediawiki/php-1.26wmf21/includes/db/Database.php(1612): DatabaseBase->query(string, string)\n#2 /srv/mediawiki/php-1.26wmf21/includes/revisiondelete/RevDelRevisionList.php(74): DatabaseBase->select(array, array, array, string, array, array)\n#3 /srv/mediawiki/php-1.26wmf21/includes/RevisionList.php(84): RevDelRevisionList->doQuery(DatabaseMysqli)\n#4 /srv/mediawiki/php-1.26wmf21/includes/specials/SpecialRevisiondelete.php(172): RevisionListBase->reset()\n#5 /srv/mediawiki/php-1.26wmf21/includes/specialpage/SpecialPage.php(384): SpecialRevisionDelete->execute(string)\n#6 /srv/mediawiki/php-1.26wmf21/includes/actions/SpecialPageAction.php(77): SpecialPage->run(string)\n#7 /srv/mediawiki/php-1.26wmf21/includes/MediaWiki.php(458): SpecialPageAction->show()\n#8 /srv/mediawiki/php-1.26wmf21/includes/MediaWiki.php(255): MediaWiki->performAction(Article, Title)\n#9 /srv/mediawiki/php-1.26wmf21/includes/MediaWiki.php(682): MediaWiki->performRequest()\n#10 /srv/mediawiki/php-1.26wmf21/includes/MediaWiki.php(476): MediaWiki->main()\n#11 /srv/mediawiki/php-1.26wmf21/index.php(41): MediaWiki->run()\n#12 /srv/mediawiki/w/index.php(3): include(string)\n#13 {main}\n"}
mysql:wikiadmin@db1072 [enwiki]> explain SELECT  rev_id,rev_page,rev_text_id,rev_timestamp,rev_comment,rev_user_text,rev_user,rev_minor_edit,rev_deleted,rev_len,rev_parent_id,rev_sha1,rev_content_format,rev_content_model,user_name  FROM `revision` INNER JOIN `page` ON ((page_id = rev_page)) LEFT JOIN `user` ON ((rev_user != 0) AND (user_id = rev_user))  WHERE rev_page = '16283969' AND rev_id IN ('679340073','679337192','679336675','679335920','679334110','679334014','679333987','679330491','679330449','679317447')   ORDER BY rev_id DESC;
+------+-------------+----------+--------+------------------------------------+-------------+---------+--------------------------+------+-------------+
| id   | select_type | table    | type   | possible_keys                      | key         | key_len | ref                      | rows | Extra       |
+------+-------------+----------+--------+------------------------------------+-------------+---------+--------------------------+------+-------------+
|    1 | SIMPLE      | page     | const  | PRIMARY                            | PRIMARY     | 4       | const                    |    1 | Using index |
|    1 | SIMPLE      | revision | ref    | PRIMARY,rev_page_id,page_timestamp | rev_page_id | 4       | const                    |   10 | Using where |
|    1 | SIMPLE      | user     | eq_ref | PRIMARY                            | PRIMARY     | 4       | enwiki.revision.rev_user |    1 | Using where |
+------+-------------+----------+--------+------------------------------------+-------------+---------+--------------------------+------+-------------+
3 rows in set (0.00 sec)
Risker added a subscriber: Risker.Nov 18 2015, 12:51 AM
Restricted Application added a subscriber: StudiesWorld. · View Herald TranscriptNov 18 2015, 12:51 AM
Legoktm triaged this task as High priority.Nov 18 2015, 12:52 AM

People are reporting this is happening more often.

Legoktm added a subscriber: aaron.Nov 18 2015, 12:53 AM
Keegan added a subscriber: Keegan.Nov 18 2015, 7:35 PM
Teles added a subscriber: Teles.Nov 19 2015, 1:08 AM
Restricted Application added a subscriber: JEumerus. · View Herald TranscriptFeb 24 2016, 5:24 PM

Change 282737 had a related patch set uploaded (by Catrope):
Use 'contributions' database for RevisionList queries

https://gerrit.wikimedia.org/r/282737

I observed the following query in production:

mysql:wikiadmin@db1055 [enwiki]> explain SELECT  rev_id,rev_page,rev_text_id,rev_timestamp,rev_comment,rev_user_text,rev_user,rev_minor_edit,rev_deleted,rev_len,rev_parent_id,rev_sha1,rev_content_format,rev_content_model,user_name,(SELECT  GROUP_CONCAT(ct_tag SEPARATOR ',')   FROM `change_tag`   WHERE ct_rev_id=rev_id  ) AS ts_tags  FROM `revision` LEFT  JOIN `user` ON ((rev_user != 0) AND (user_id = rev_user))  WHERE rev_page = '5137507' AND rev_id IN ('714751457','714751447','714751294','714751218','714751046','714750917','714750880','714750816','714750534','714750511')   ORDER BY rev_id  DESC;
+------+--------------------+------------+--------+--------------------------------------------+-----------+---------+--------------------------+------+--------------------------+
| id   | select_type        | table      | type   | possible_keys                              | key       | key_len | ref                      | rows | Extra                    |
+------+--------------------+------------+--------+--------------------------------------------+-----------+---------+--------------------------+------+--------------------------+
|    1 | PRIMARY            | revision   | range  | PRIMARY,page_timestamp,page_user_timestamp | PRIMARY   | 4       | NULL                     |   40 | Using where              |
|    1 | PRIMARY            | user       | eq_ref | PRIMARY                                    | PRIMARY   | 4       | enwiki.revision.rev_user |    1 | Using where              |
|    2 | DEPENDENT SUBQUERY | change_tag | ref    | ct_rev_id,change_tag_rev_tag               | ct_rev_id | 5       | enwiki.revision.rev_id   |    1 | Using where; Using index |
+------+--------------------+------------+--------+--------------------------------------------+-----------+---------+--------------------------+------+--------------------------+
3 rows in set (0.00 sec)

mysql:wikiadmin@db1057 [enwiki]> explain SELECT  rev_id,rev_page,rev_text_id,rev_timestamp,rev_comment,rev_user_text,rev_user,rev_minor_edit,rev_deleted,rev_len,rev_parent_id,rev_sha1,rev_content_format,rev_content_model,user_name,(SELECT  GROUP_CONCAT(ct_tag SEPARATOR ',')   FROM `change_tag`   WHERE ct_rev_id=rev_id  ) AS ts_tags  FROM `revision` LEFT  JOIN `user` ON ((rev_user != 0) AND (user_id = rev_user))  WHERE rev_page = '5137507' AND rev_id IN ('714751457','714751447','714751294','714751218','714751046','714750917','714750880','714750816','714750534','714750511')   ORDER BY rev_id  DESC;
+------+--------------------+------------+--------+------------------------------------+-------------+---------+--------------------------+------+--------------------------+
| id   | select_type        | table      | type   | possible_keys                      | key         | key_len | ref                      | rows | Extra                    |
+------+--------------------+------------+--------+------------------------------------+-------------+---------+--------------------------+------+--------------------------+
|    1 | PRIMARY            | revision   | ref    | PRIMARY,rev_page_id,page_timestamp | rev_page_id | 4       | const                    |   10 | Using where              |
|    1 | PRIMARY            | user       | eq_ref | PRIMARY                            | PRIMARY     | 4       | enwiki.revision.rev_user |    1 | Using where              |
|    2 | DEPENDENT SUBQUERY | change_tag | ref    | ct_rev_id,change_tag_rev_tag       | ct_rev_id   | 5       | enwiki.revision.rev_id   |    1 | Using where; Using index |
+------+--------------------+------------+--------+------------------------------------+-------------+---------+--------------------------+------+--------------------------+
3 rows in set (0.00 sec)

Without EXPLAIN, the query completes immediately on db1055 but not on db1057. I think this may be due to the partitions on the revision table that are present on db1055 but not on db1057:

mysql:wikiadmin@db1055 [enwiki]> show create table revision \G
*************************** 1. row ***************************
       Table: revision
Create Table: CREATE TABLE `revision` (
  `rev_id` int(8) unsigned NOT NULL AUTO_INCREMENT,
  `rev_page` int(8) unsigned NOT NULL DEFAULT '0',
  `rev_text_id` int(8) unsigned NOT NULL DEFAULT '0',
  `rev_comment` varbinary(255) DEFAULT NULL,
  `rev_user` int(5) unsigned NOT NULL DEFAULT '0',
  `rev_user_text` varbinary(255) NOT NULL DEFAULT '',
  `rev_timestamp` varbinary(14) NOT NULL DEFAULT '',
  `rev_minor_edit` tinyint(1) unsigned NOT NULL DEFAULT '0',
  `rev_deleted` tinyint(1) unsigned NOT NULL DEFAULT '0',
  `rev_len` int(8) unsigned DEFAULT NULL,
  `rev_parent_id` int(8) unsigned DEFAULT NULL,
  `rev_sha1` varbinary(32) NOT NULL DEFAULT '',
  `rev_content_model` varbinary(32) DEFAULT NULL,
  `rev_content_format` varbinary(64) DEFAULT NULL,
  PRIMARY KEY (`rev_id`,`rev_user`),
  KEY `rev_timestamp` (`rev_timestamp`,`rev_id`),
  KEY `page_timestamp` (`rev_page`,`rev_timestamp`,`rev_id`),
  KEY `user_timestamp` (`rev_user`,`rev_timestamp`,`rev_id`),
  KEY `usertext_timestamp` (`rev_user_text`,`rev_timestamp`,`rev_id`),
  KEY `page_user_timestamp` (`rev_page`,`rev_user`,`rev_timestamp`,`rev_id`)
) ENGINE=InnoDB AUTO_INCREMENT=714777082 DEFAULT CHARSET=binary
/*!50100 PARTITION BY RANGE (rev_user)
(PARTITION p1 VALUES LESS THAN (1) ENGINE = InnoDB,
 PARTITION p100000 VALUES LESS THAN (100000) ENGINE = InnoDB,
 PARTITION p200000 VALUES LESS THAN (200000) ENGINE = InnoDB,
 PARTITION p300000 VALUES LESS THAN (300000) ENGINE = InnoDB,
 PARTITION p400000 VALUES LESS THAN (400000) ENGINE = InnoDB,
 PARTITION p500000 VALUES LESS THAN (500000) ENGINE = InnoDB,
 PARTITION p700000 VALUES LESS THAN (750000) ENGINE = InnoDB,
 PARTITION p1000000 VALUES LESS THAN (1000000) ENGINE = InnoDB,
 PARTITION p1200000 VALUES LESS THAN (1500000) ENGINE = InnoDB,
 PARTITION p2000000 VALUES LESS THAN (2000000) ENGINE = InnoDB,
 PARTITION p3000000 VALUES LESS THAN (3000000) ENGINE = InnoDB,
 PARTITION p4000000 VALUES LESS THAN (4000000) ENGINE = InnoDB,
 PARTITION p5000000 VALUES LESS THAN (5000000) ENGINE = InnoDB,
 PARTITION p6000000 VALUES LESS THAN (6000000) ENGINE = InnoDB,
 PARTITION p7000000 VALUES LESS THAN (7000000) ENGINE = InnoDB,
 PARTITION p8000000 VALUES LESS THAN (8000000) ENGINE = InnoDB,
 PARTITION p9000000 VALUES LESS THAN (9000000) ENGINE = InnoDB,
 PARTITION p10000000 VALUES LESS THAN (10000000) ENGINE = InnoDB,
 PARTITION p12000000 VALUES LESS THAN (12000000) ENGINE = InnoDB,
 PARTITION p14000000 VALUES LESS THAN (14000000) ENGINE = InnoDB,
 PARTITION p16000000 VALUES LESS THAN (16000000) ENGINE = InnoDB,
 PARTITION p18000000 VALUES LESS THAN (18000000) ENGINE = InnoDB,
 PARTITION pMAXVALUE VALUES LESS THAN MAXVALUE ENGINE = InnoDB) */
1 row in set (0.00 sec)

mysql:wikiadmin@db1057 [enwiki]> show create table revision \G
*************************** 1. row ***************************
       Table: revision
Create Table: CREATE TABLE `revision` (
  `rev_id` int(8) unsigned NOT NULL AUTO_INCREMENT,
  `rev_page` int(8) unsigned NOT NULL DEFAULT '0',
  `rev_text_id` int(8) unsigned NOT NULL DEFAULT '0',
  `rev_comment` varbinary(255) DEFAULT NULL,
  `rev_user` int(5) unsigned NOT NULL DEFAULT '0',
  `rev_user_text` varbinary(255) NOT NULL DEFAULT '',
  `rev_timestamp` varbinary(14) NOT NULL DEFAULT '',
  `rev_minor_edit` tinyint(1) unsigned NOT NULL DEFAULT '0',
  `rev_deleted` tinyint(1) unsigned NOT NULL DEFAULT '0',
  `rev_len` int(8) unsigned DEFAULT NULL,
  `rev_parent_id` int(8) unsigned DEFAULT NULL,
  `rev_sha1` varbinary(32) NOT NULL DEFAULT '',
  `rev_content_model` varbinary(32) DEFAULT NULL,
  `rev_content_format` varbinary(64) DEFAULT NULL,
  PRIMARY KEY (`rev_id`),
  KEY `rev_page_id` (`rev_page`,`rev_id`),
  KEY `rev_timestamp` (`rev_timestamp`,`rev_id`),
  KEY `page_timestamp` (`rev_page`,`rev_timestamp`,`rev_id`),
  KEY `user_timestamp` (`rev_user`,`rev_timestamp`,`rev_id`),
  KEY `usertext_timestamp` (`rev_user_text`,`rev_timestamp`,`rev_id`)
) ENGINE=InnoDB AUTO_INCREMENT=714777100 DEFAULT CHARSET=binary
1 row in set (0.00 sec)

The differences in indexes are weird too. The rev_page_id index exists on db1057 but not db1055, which could be a problem especially if FORCE INDEX is used. It concerns me that the primary key differs: I appreciate that we might want a (rev_id, rev_user) index on db1055, but the way it's done here means that uniqueness of rev_id is required on db1057 but not db1055. rev_id is an AUTO_INCREMENT field so it might still be fine (and presumably we'll never rotate a server with these extra indexes into the master position?), but it still scares me a bit.

The differences in indexes are weird too. The rev_page_id index exists on db1057 but not db1055,

It turns out this is the problem. The query looks like WHERE rev_page_id = X AND rev_id IN (A, B, C, ...) ORDER BY rev_id. db1057 has an index on (rev_page, rev_id), so it uses that, which appears to make sense but is actually horrendously slow for reasons I don't understand. db1055 does not have this index, so it falls back on using the PRIMARY index on (rev_id, rev_user) to look up the rows by rev_id (it doesn't realize this is a unique constraint(!)) and then manually examines the rev_page of each row and discards the ones with the wrong rev_page value (which in a sane universe is hopefully none of them). It makes sense that db1055's approach is fast, but it makes no sense to me that db1057's approach is slow.

If I add FORCE INDEX(PRIMARY) for the revision table, the query is also fast on db1057 (PRIMARY is a different index there than on db1055(!), but in this case that doesn't matter).

I didn't run the query to completion (it was taking too long), but after letting it run for a bit on db1055 I get this:

mysql:wikiadmin@db1057 [enwiki]> SHOW SESSION STATUS like 'Hand%';
+----------------------------+--------+
| Variable_name              | Value  |
+----------------------------+--------+
| Handler_commit             | 0      |
| Handler_delete             | 0      |
| Handler_discover           | 0      |
| Handler_external_lock      | 0      |
| Handler_icp_attempts       | 0      |
| Handler_icp_match          | 0      |
| Handler_mrr_init           | 0      |
| Handler_mrr_key_refills    | 0      |
| Handler_mrr_rowid_refills  | 0      |
| Handler_prepare            | 0      |
| Handler_read_first         | 0      |
| Handler_read_key           | 17     |
| Handler_read_last          | 0      |
| Handler_read_next          | 0      |
| Handler_read_prev          | 180328 |
| Handler_read_rnd           | 0      |
| Handler_read_rnd_deleted   | 0      |
| Handler_read_rnd_next      | 168    |
| Handler_rollback           | 1      |
| Handler_savepoint          | 0      |
| Handler_savepoint_rollback | 0      |
| Handler_tmp_update         | 0      |
| Handler_tmp_write          | 166    |
| Handler_update             | 0      |
| Handler_write              | 0      |
+----------------------------+--------+

Combined with the EXPLAIN saying the key length is only 4

+------+--------------------+------------+--------+------------------------------------+-------------+---------+--------------------------+------+--------------------------+
| id   | select_type        | table      | type   | possible_keys                      | key         | key_len | ref                      | rows | Extra                    |
+------+--------------------+------------+--------+------------------------------------+-------------+---------+--------------------------+------+--------------------------+
|    1 | PRIMARY            | revision   | ref    | PRIMARY,rev_page_id,page_timestamp | rev_page_id | 4       | const                    |   10 | Using where              |

I think that it's for some reason only using the rev_page prefix of the index then scanning all 902730 matching rows for matching rev_id instead of using the whole index.

It also seems to not do it when the ORDER BY is changed to ASC or is omitted entirely. Or if we ORDER BY rev_page DESC, rev_id DESC, but then it uses a silly plan (Using index condition; Using where; Using filesort) which is a known issue with MySQL.

If I add FORCE INDEX(PRIMARY) for the revision table, the query is also fast on db1057 (PRIMARY is a different index there than on db1055(!), but in this case that doesn't matter).

Per T132416: Rampant differences in indexes on enwiki.revision across the DB cluster, this won't work as a general strategy because PRIMARY can mean three different things depending on which DB you hit. On normal eqiad slaves, it's (rev_id), which is fine for our query. On contributions slaves, it's (rev_id, rev_user) which is still fine for this query. However, on non-contributions slaves in codfw, it's (rev_page, rev_id) which is the index we're trying to avoid in the first place.

It also seems to not do it when the ORDER BY is changed to ASC or is omitted entirely. Or if we ORDER BY rev_page DESC, rev_id DESC, but then it uses a silly plan (Using index condition; Using where; Using filesort) which is a known issue with MySQL.

Maybe it's the same bug conceptually? In both cases, we're ordering by a field that's specified as a constant or a set of constants.

The "set of constants" case has historically been fine, it's only when you're limiting a column to a single constant that you have to omit it from the ORDER BY to prevent MySQL from getting confused.

If I add FORCE INDEX(PRIMARY) for the revision table, the query is also fast on db1057 (PRIMARY is a different index there than on db1055(!), but in this case that doesn't matter).

Per T132416: Rampant differences in indexes on enwiki.revision across the DB cluster, this won't work as a general strategy because PRIMARY can mean three different things depending on which DB you hit. On normal eqiad slaves, it's (rev_id), which is fine for our query. On contributions slaves, it's (rev_id, rev_user) which is still fine for this query. However, on non-contributions slaves in codfw, it's (rev_page, rev_id) which is the index we're trying to avoid in the first place.

It turns out that this does actually work as a general strategy. I tested the query on one server in each of the three groups (i.e. with each of the three different PKs), and it's fast on all of them if the index is forced. It seems that, for some reason, the optimizer bug isn't triggered if (rev_page, rev_id) is the primary key, only if it's a non-primary (and maybe also non-unique?) index.

I've repurposed my patch to add FORCE INDEX instead of sending these queries to the contributions slaves: https://gerrit.wikimedia.org/r/#/c/282737

Catrope added a comment.EditedSep 10 2016, 2:39 AM

I tested the query on one server in each of the three groups (i.e. with each of the three different PKs), and it's fast on all of them if the index is forced.

I also wrote a script to try it on all enwiki DB servers. Without the forced index, it times out (takes >2s) on db2034, db1057, db1065, db1066 and db1072. With the forced index it doesn't time out on any of them.

Change 282737 merged by jenkins-bot:
Force primary index for RevDelRevisionList query

https://gerrit.wikimedia.org/r/282737

aaron closed this task as Resolved.Sep 11 2016, 12:52 AM
aaron claimed this task.