Page MenuHomePhabricator

Odd index choices for queries on imagelinks table
Open, MediumPublic

Description

Consider the following query against enwiki:

SELECT * FROM imagelinks WHERE il_to = 'Wiktionary-logo-en.svg'  ORDER BY il_from LIMIT 6;

The obvious choice of index would be il_to, it can do a 'ref' to get the correct rows in the correct order.

But for some reason it doesn't do that. Instead it seems to prefer an index scan using the il_from index:

> explain SELECT * FROM imagelinks WHERE il_to = 'Wiktionary-logo-en.svg'  ORDER BY il_from LIMIT 6;
stdClass Object
(
    [id] => 1
    [select_type] => SIMPLE
    [table] => imagelinks
    [type] => index
    [possible_keys] => il_to,il_backlinks_namespace
    [key] => il_from
    [key_len] => 261
    [ref] => 
    [rows] => 4465
    [Extra] => Using where
)

And if we crank up the limit, it still doesn't make a good choice:

> explain SELECT * FROM imagelinks WHERE il_to = 'Wiktionary-logo-en.svg'  ORDER BY il_from LIMIT 5001;
stdClass Object
(
    [id] => 1
    [select_type] => SIMPLE
    [table] => imagelinks
    [type] => ref
    [possible_keys] => il_to,il_backlinks_namespace
    [key] => il_backlinks_namespace
    [key_len] => 257
    [ref] => const
    [rows] => 97460
    [Extra] => Using where; Using index; Using filesort
)

Apparently something is screwy with how it guesses how many rows will be examined, because when I force the il_to index it's thinking it'll need a lot:

> explain SELECT * FROM imagelinks USE INDEX (il_to) WHERE il_to = 'Wiktionary-logo-en.svg'  ORDER BY il_from LIMIT 6;
stdClass Object
(
    [id] => 1
    [select_type] => SIMPLE
    [table] => imagelinks
    [type] => ref
    [possible_keys] => il_to
    [key] => il_to
    [key_len] => 257
    [ref] => const
    [rows] => 114932
    [Extra] => Using index condition; Using where
)

While the difference in actual execution time doesn't seem very different for any of these simple queries, even the slightly-more-complex query coming out of ApiQueryBacklinks is noticeably slower with the wrong index choice:

SELECT /* ApiQueryBacklinks::run::firstQuery Anomie */  /*! STRAIGHT_JOIN */ page_namespace,page_title,page_id,page_is_redirect  FROM `imagelinks`,`page`   WHERE (il_from=page_id) AND il_to = 'Wiktionary-logo-en.svg' AND page_is_redirect = '1'  ORDER BY il_from LIMIT 6;

14 seconds when il_to is forced versus over 2 minutes (I gave up waiting) when it's allowed to use il_from.

Event Timeline

Anomie raised the priority of this task from to Needs Triage.
Anomie updated the task description. (Show Details)
Anomie changed Security from none to None.
Anomie added subscribers: Anomie, Springle.

Using a MariaDB 10 enwiki slave EXPLAIN reports a different plan to SHOW EXPLAIN:

mysql> explain SELECT /* ApiQueryBacklinks::run::firstQuery Anomie */  /*! STRAIGHT_JOIN */ page_namespace,page_title,page_id,page_is_redirect  FROM `imagelinks`,`page`   WHERE (il_from=page_id) AND il_to = 'Wiktionary-logo-en.svg' AND page_is_redirect = '1'  ORDER BY il_from LIMIT 6\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: imagelinks
         type: index
possible_keys: il_from,il_to,il_backlinks_namespace
          key: il_from
      key_len: 261
          ref: NULL
         rows: 4074
        Extra: Using where; Using index
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: page
         type: eq_ref
possible_keys: PRIMARY,page_redirect_namespace_len
          key: PRIMARY
      key_len: 4
          ref: enwiki.imagelinks.il_from
         rows: 1
        Extra: Using where

<execute query in another thread>

mysql> SHOW EXPLAIN FOR 6\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: imagelinks
         type: ref
possible_keys: il_from,il_to,il_backlinks_namespace
          key: il_backlinks_namespace                          <== meh
      key_len: 257
          ref: const
         rows: 99518
        Extra: Using where; Using index; Using filesort        <== meh
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: page
         type: eq_ref
possible_keys: PRIMARY,page_redirect_namespace_len
          key: PRIMARY
      key_len: 4
          ref: enwiki.imagelinks.il_from
         rows: 1
        Extra: Using where

Don't know if this is happening on 5.5 because SHOW EXPLAIN is new.

The *_backlinks_namespace indexes are causing issues in some other places too, like pagelinks. Possible that this should be il_backlinks_namespace (il_from_namespace, il_to, il_from) instead.