Page MenuHomePhabricator

MessageCache::loadFromDB makes too many slow queries with wrong index
Open, Needs TriagePublic

Description

MediaWiki 1.28.2 (664b1b4) and 1.29.1, HHVM 3.21.0 (srv), MariaDB 10.2.8-MariaDB-10.2.8+maria~zesty-log, ICU 57.1, LuaSandbox 2.0.13-alpha, Lua 5.1.5.

My SQL server spends much of its time on queries like this (and they are usually slow):

SELECT /* MessageCache::loadFromDB(ru)-small  */  page_title,old_text,old_flags  FROM `page`,`revision`,`text`    WHERE page_is_redirect = '0' AND page_namespace = '8' AND (page_title NOT LIKE '%/%' ) AND (page_latest=rev_id) AND (rev_text_id=old_id) AND (page_len <= 131072 /* This is $wgMaxMsgCacheEntrySize */)

The language is sometimes (ru) and sometimes (en).

It seems to me that the message cache keeps being bypassed.

The relevant settings are:

$wgMaxMsgCacheEntrySize = 131072;
$wgLocalisationCacheConf = array (
	'class'			=> 'LocalisationCache',
	'store'			=> 'detect',
	'storeClass'	=> false,
	'manualRecache'	=> true,
);
$wgMessageCacheType = CACHE_ACCEL;	// -- APC is on and has enough space
$wgUseLocalMessageCache = true;
$wgCacheDirectory  = '/var/cache/mediawiki/(project name)'; // -- just in case

Am I missing anything or there may be a bug?

Event Timeline

Restricted Application added a subscriber: Aklapper. · View Herald TranscriptAug 30 2017, 12:57 PM
alex-mashin renamed this task from Too many queries for messages to Too many queries for messages not using index.Sep 11 2017, 12:49 PM
alex-mashin added a project: Wikimedia-Rdbms.
alex-mashin added a comment.EditedSep 11 2017, 1:11 PM

Even worse: these queries use a wrong index on page, at least under MariaDB 10.2.8!

SELECT SQL_NO_CACHE /* MessageCache::loadFromDB(en)-small  */  page_title,old_id,old_text,old_flags  FROM `page`,`revision`,`text`    WHERE page_is_redirect = '0' AND page_namespace = '8' AND (page_title LIKE '%/en' ESCAPE '`' ) AND (page_latest=rev_id) AND (rev_text_id=old_id) AND (page_len <= 131072);
...
1 row in set (2.41 sec)

But if I force the index page_redirect_namespace_len:

SELECT SQL_NO_CACHE /* MessageCache::loadFromDB(en)-small  */  page_title,old_id,old_text,old_flags  FROM `page` USE INDEX (page_redirect_namespace_len),`revision`,`text`    WHERE page_is_redirect = '0' AND page_namespace = '8' AND (page_title LIKE '%/en' ESCAPE '`' ) AND (page_latest=rev_id) AND (rev_text_id=old_id) AND (page_len <= 131072);
...
1 row in set (0.02 sec)

One hundred times faster.
Query plan with the wrong index:

EXPLAIN SELECT /* MessageCache::loadFromDB(en)-small  */  page_title,old_id,old_text,old_flags  FROM `page`,`revision`,`text`    WHERE page_is_redirect = '0' AND page_namespace = '8' AND (page_title LIKE '%/en' ESCAPE '`' ) AND (page_latest=rev_id) AND (rev_text_id=old_id) AND (page_len <= 131072);
+------+-------------+----------+--------+-------------------------------------------------+----------+---------+-----------------------------------+------+-------------+
| id   | select_type | table    | type   | possible_keys                                   | key      | key_len | ref                               | rows | Extra       |
+------+-------------+----------+--------+-------------------------------------------------+----------+---------+-----------------------------------+------+-------------+
|    1 | SIMPLE      | page     | range  | name_title,page_len,page_redirect_namespace_len | page_len | 4       | NULL                              |   81 | Using where |
|    1 | SIMPLE      | revision | eq_ref | rev_id                                          | rev_id   | 4       | (my wiki).page.page_latest     |    1 |             |
|    1 | SIMPLE      | text     | eq_ref | PRIMARY                                         | PRIMARY  | 4       | (my wiki).revision.rev_text_id |    1 |             |
+------+-------------+----------+--------+-------------------------------------------------+----------+---------+-----------------------------------+------+-------------+
3 rows in set (0.01 sec)

With the right index:

EXPLAIN SELECT /* MessageCache::loadFromDB(en)-small  */  page_title,old_id,old_text,old_flags  FROM `page` USE INDEX (page_redirect_namespace_len),`revision`,`text`    WHERE page_is_redirect = '0' AND page_namespace = '8' AND (page_title LIKE '%/en' ESCAPE '`' ) AND (page_latest=rev_id) AND (rev_text_id=old_id) AND (page_len <= 131072);
+------+-------------+----------+--------+-----------------------------+-----------------------------+---------+-----------------------------------+------+-------------+
| id   | select_type | table    | type   | possible_keys               | key                         | key_len | ref                               | rows | Extra       |
+------+-------------+----------+--------+-----------------------------+-----------------------------+---------+-----------------------------------+------+-------------+
|    1 | SIMPLE      | page     | range  | page_redirect_namespace_len | page_redirect_namespace_len | 9       | NULL                              |  493 | Using where |
|    1 | SIMPLE      | revision | eq_ref | rev_id                      | rev_id                      | 4       | (my wiki).page.page_latest     |    1 |             |
|    1 | SIMPLE      | text     | eq_ref | PRIMARY                     | PRIMARY                     | 4       | (my wiki).revision.rev_text_id |    1 |             |
+------+-------------+----------+--------+-----------------------------+-----------------------------+---------+-----------------------------------+------+-------------+
3 rows in set (0.00 sec)

Note the different indices on page.
If I add the following in the line 521 of includes/cache/MessageCache.php:

['USE INDEX' => ['page' => 'page_redirect_namespace_len']]

, my nearly-dead wiki immediately comes back to life..

alex-mashin renamed this task from Too many queries for messages not using index to Too many slow queries for messages with wrong index.Sep 11 2017, 1:11 PM
alex-mashin updated the task description. (Show Details)
alex-mashin added a comment.EditedSep 11 2017, 2:02 PM

UPD: I did OPTIMIZE TABLE page; and CHECK TABLE page; (it's a TokuDB table) and the query without the index hint started to use the right one. Don't know, if it was a broken index.

Still, the queries may not be necessarily slow, but are still too numerous.

Krinkle renamed this task from Too many slow queries for messages with wrong index to MessageCache::loadFromDB makes too many slow queries with wrong index.Jul 28 2018, 7:55 PM
Krinkle moved this task from Backlog to MessageCache on the MediaWiki-Cache board.
Krinkle moved this task from Untriaged to Usage problem on the Wikimedia-Rdbms board.
aaron added a subscriber: aaron.Oct 27 2018, 9:07 PM

This will be better with a3d6c1411dad3e057b if there are many message pages that exists for extension use.