Page MenuHomePhabricator

Odd plan for simple query on revisions + page
Closed, ResolvedPublic

Description

Consider this simple query:

SELECT rev_id, rev_timestamp, page_id, page_title, page_namespace FROM `revision` INNER JOIN `page` ON ((rev_page = page_id))  ORDER BY rev_timestamp  DESC,rev_id  DESC LIMIT 11;

I'd expect this to have a plan like this:

+------+-------------+----------+--------+----------------------------+---------------+---------+--------------------------+------+-------+
| id   | select_type | table    | type   | possible_keys              | key           | key_len | ref                      | rows | Extra |
+------+-------------+----------+--------+----------------------------+---------------+---------+--------------------------+------+-------+
|    1 | SIMPLE      | revision | index  | rev_page_id,page_timestamp | rev_timestamp | 20      | NULL                     |   11 |       |
|    1 | SIMPLE      | page     | eq_ref | PRIMARY                    | PRIMARY       | 4       | enwiki.revision.rev_page |    1 |       |
+------+-------------+----------+--------+----------------------------+---------------+---------+--------------------------+------+-------+

and on my local machine (10.0.21-MariaDB-3) it does. But on all the WMF db slaves that I've tried, for some reason it insists on using this ridiculous query instead:

+------+-------------+----------+-------+----------------------------+----------------+---------+---------------------+----------+----------------------------------------------+
| id   | select_type | table    | type  | possible_keys              | key            | key_len | ref                 | rows     | Extra                                        |
+------+-------------+----------+-------+----------------------------+----------------+---------+---------------------+----------+----------------------------------------------+
|    1 | SIMPLE      | page     | index | PRIMARY                    | name_title     | 261     | NULL                | 33836157 | Using index; Using temporary; Using filesort |
|    1 | SIMPLE      | revision | ref   | rev_page_id,page_timestamp | page_timestamp | 4       | enwiki.page.page_id |        8 | Using index                                  |
+------+-------------+----------+-------+----------------------------+----------------+---------+---------------------+----------+----------------------------------------------+

I can STRAIGHT_JOIN it to get it to use the sane plan, but IIRC we'd rather avoid STRAIGHT_JOIN these days.

Event Timeline

Anomie raised the priority of this task from to Needs Triage.
Anomie updated the task description. (Show Details)
Anomie added a project: DBA.
Anomie added a subscriber: Anomie.

I can tell you why, I do not know yet how to fix it:

This is the exact CREATE syntax of revision:

CREATE TABLE `revision` (
  `rev_id` int(8) unsigned NOT NULL AUTO_INCREMENT,
  `rev_page` int(8) unsigned NOT NULL DEFAULT '0',
  `rev_text_id` int(8) unsigned NOT NULL DEFAULT '0',
  `rev_comment` varbinary(255) DEFAULT NULL,
  `rev_user` int(5) unsigned NOT NULL DEFAULT '0',
  `rev_user_text` varbinary(255) NOT NULL DEFAULT '',
  `rev_timestamp` varbinary(14) NOT NULL DEFAULT '',
  `rev_minor_edit` tinyint(1) unsigned NOT NULL DEFAULT '0',
  `rev_deleted` tinyint(1) unsigned NOT NULL DEFAULT '0',
  `rev_len` int(8) unsigned DEFAULT NULL,
  `rev_parent_id` int(8) unsigned DEFAULT NULL,
  `rev_sha1` varbinary(32) NOT NULL DEFAULT '',
  `rev_content_model` varbinary(32) DEFAULT NULL,
  `rev_content_format` varbinary(64) DEFAULT NULL,
  PRIMARY KEY (`rev_id`),
  KEY `rev_page_id` (`rev_page`,`rev_id`),
  KEY `rev_timestamp` (`rev_timestamp`,`rev_id`),
  KEY `page_timestamp` (`rev_page`,`rev_timestamp`,`rev_id`),
  KEY `user_timestamp` (`rev_user`,`rev_timestamp`,`rev_id`),
  KEY `usertext_timestamp` (`rev_user_text`,`rev_timestamp`,`rev_id`)
) ENGINE=InnoDB AUTO_INCREMENT=683122218 DEFAULT CHARSET=binary

Regular explain:

mysql> EXPLAIN SELECT rev_id, rev_timestamp, page_id, page_title, page_namespace FROM `revision` INNER JOIN `page` ON ((rev_page = page_id))  ORDER BY rev_timestamp  DESC,rev_id  DESC LIMIT 11\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: page
         type: index
possible_keys: PRIMARY
          key: name_title
      key_len: 261
          ref: NULL
         rows: 33836157
        Extra: Using index; Using temporary; Using filesort
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: revision
         type: ref
possible_keys: rev_page_id,page_timestamp
          key: page_timestamp
      key_len: 4
          ref: enwiki.page.page_id
         rows: 8
        Extra: Using index
2 rows in set (0.00 sec)

Forcing the other plan:

mysql> EXPLAIN SELECT STRAIGHT_JOIN rev_id, rev_timestamp, page_id, page_title, page_namespace FROM `revision` INNER JOIN `page` ON ((rev_page = page_id))  ORDER BY rev_timestamp  DESC,rev_id  DESC LIMIT 11\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: revision
         type: index
possible_keys: rev_page_id,page_timestamp
          key: rev_timestamp
      key_len: 20
          ref: NULL
         rows: 11
        Extra: 
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: page
         type: eq_ref
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 4
          ref: enwiki.revision.rev_page
         rows: 1
        Extra: 
2 rows in set (0.00 sec)

The actual query parsed has no issue:

select `enwiki`.`revision`.`rev_id` AS `rev_id`,`enwiki`.`revision`.`rev_timestamp` AS `rev_timestamp`,`enwiki`.`page`.`page_id` 
AS `page_id`,`enwiki`.`page`.`page_title` AS `page_title`,`enwiki`.`page`.`page_namespace` AS `page_namespace` 
from `enwiki`.`revision` join `enwiki`.`page` where (`enwiki`.`revision`.`rev_page` = `enwiki`.`page`.`page_id`) 
order by `enwiki`.`revision`.`rev_timestamp` desc,`enwiki`.`revision`.`rev_id` desc limit 11

Even ignoring the indexes does not work (it is not related to covering index):

mysql> EXPLAIN EXTENDED SELECT rev_id, rev_timestamp, page_id, page_title, page_namespace FROM `revision` INNER JOIN `page` IGNORE INDEX(name_title, PRIMARY) ON ((rev_page = page_id))  ORDER BY rev_timestamp  DESC,rev_id  DESC LIMIT 11\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: page
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 33836157
     filtered: 100.00
        Extra: Using temporary; Using filesort
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: revision
         type: ref
possible_keys: rev_page_id,page_timestamp
          key: page_timestamp
      key_len: 4
          ref: enwiki.page.page_id
         rows: 8
     filtered: 100.00
        Extra: Using index
2 rows in set, 1 warning (0.00 sec)

I think this is a bug on the optimizer. I thought at first that it may be caused by some indexes being just (rev_timestamp), which requires the extended_keys optimization. But here, rev_timestamp is extended with the primary key explicitly, and when suggested, it calculates correctly the cost and not usage of filesort.

There is absolutely no reason to use the wrong order, and the query plan has all the information. Can you reproduce on the exact version we have in production (10.0.16?). Maybe it was fixed on a later version and we need to upgrade. But an 11-row plan getting rejected instead of a index/full scan, and the optimizer knows about it?

jcrespo triaged this task as High priority.

Hmm. If I load my local wiki's dataset into mediawiki-vagrant and 10.0.16-MariaDB-1~trusty from http://archive.mariadb.org, it uses the sane plan. But before that when I had just created a few pages in the mediawiki-vagrant wiki it used the bad plan even with 10.0.21-MariaDB-1~trusty. So I have no idea.

I am currently regenerating the table statistics for page and revison on a depooled slave, and see if that helps, somehow.

After a long ANALYZE TABLE on both page and revision, generating engine-independent statistics I get the same plan:

MariaDB PRODUCTION s1 localhost enwiki > EXPLAIN SELECT rev_id, rev_timestamp, page_id, page_title, page_namespace FROM `revision` INNER JOIN `page` ON ((rev_page = page_id))  ORDER BY rev_timestamp  DESC,rev_id  DESC LIMIT 11\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: page
         type: index
possible_keys: PRIMARY
          key: name_title
      key_len: 261
          ref: NULL
         rows: 37379569
        Extra: Using index; Using temporary; Using filesort
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: revision
         type: ref
possible_keys: page_timestamp,page_user_timestamp
          key: page_timestamp
      key_len: 4
          ref: enwiki.page.page_id
         rows: 16
        Extra: Using index
2 rows in set (0.03 sec)

I would consider this an upstream bug, and will report it.

If this is causing a huge problem, I would recommend adding (temporarily) the STRAIGHT_JOIN, but I agree that is not a good solution. Will report back with the answer I get.

I don't know of any code hitting this currently in production, but new code in https://gerrit.wikimedia.org/r/#/c/241559/ will be using queries almost exactly like this.

As there seems to be no actionable here (the upstream fix doesn't seem it will be available any time soon, and the actual problem was workaround-ed), I will go ahead and close the issue. We can do a review of query flags again on database upgrade.