Page MenuHomePhabricator

Querying the logging table on labs is slow
Open, LowPublic

Description

Try to run this query on enwp on labs:

select log_timestamp, log_action from logging where log_namespace=0 and log_title='Great_Nuclear_Doge_Event' and log_type='delete' and log_timestamp > '20160121065733';

This should be using the indexes, as indicated by the below, but it takes much, much longer to run on labs, any reason why?
Using the logging_userindex table did not help.

mysql:wikiadmin@db1057 [enwiki]> select log_timestamp, log_action from logging where log_namespace=0 and log_title='Great_Nuclear_Doge_Event' and log_type='delete' and log_timestamp > '20160121065733';
+----------------+------------+
| log_timestamp  | log_action |
+----------------+------------+
| 20160121070000 | delete     |
+----------------+------------+
1 row in set (0.08 sec)

mysql:wikiadmin@db1057 [enwiki]> explain select log_timestamp, log_action from logging where log_namespace=0 and log_title='Great_Nuclear_Doge_Event' and log_type='delete' and log_timestamp > '20160121065733';
+------+-------------+---------+-------+--------------------------------------------------------------+---------------------+---------+------+------+------------------------------------+
| id   | select_type | table   | type  | possible_keys                                                | key                 | key_len | ref  | rows | Extra                              |
+------+-------------+---------+-------+--------------------------------------------------------------+---------------------+---------+------+------+------------------------------------+
|    1 | SIMPLE      | logging | range | type_time,page_time,times,log_title_time,log_title_type_time | log_title_type_time | 68      | NULL |    1 | Using index condition; Using where |
+------+-------------+---------+-------+--------------------------------------------------------------+---------------------+---------+------+------+------------------------------------+
1 row in set (0.01 sec)

Event Timeline

Sigma created this task.Mar 30 2016, 6:11 PM
Restricted Application added a subscriber: Aklapper. · View Herald TranscriptMar 30 2016, 6:11 PM
Sigma added a comment.EditedMar 30 2016, 6:14 PM

this is probably relevant:

mysql:wikiadmin@db1072 [enwiki]> show index from logging;
+---------+------------+---------------------+--------------+---------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table   | Non_unique | Key_name            | Seq_in_index | Column_name   | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+---------+------------+---------------------+--------------+---------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| logging |          0 | PRIMARY             |            1 | log_id        | A         |    58882601 |     NULL | NULL   |      | BTREE      |         |               |
| logging |          1 | type_time           |            1 | log_type      | A         |          33 |     NULL | NULL   |      | BTREE      |         |               |
| logging |          1 | type_time           |            2 | log_timestamp | A         |    55020184 |     NULL | NULL   |      | BTREE      |         |               |
| logging |          1 | user_time           |            1 | log_user      | A         |    21696673 |     NULL | NULL   |      | BTREE      |         |               |
| logging |          1 | user_time           |            2 | log_timestamp | A         |    56293117 |     NULL | NULL   |      | BTREE      |         |               |
| logging |          1 | page_time           |            1 | log_namespace | A         |          33 |     NULL | NULL   |      | BTREE      |         |               |
| logging |          1 | page_time           |            2 | log_title     | A         |    46005626 |     NULL | NULL   |      | BTREE      |         |               |
| logging |          1 | page_time           |            3 | log_timestamp | A         |    58531412 |     NULL | NULL   |      | BTREE      |         |               |
| logging |          1 | times               |            1 | log_timestamp | A         |    50883685 |     NULL | NULL   |      | BTREE      |         |               |
| logging |          1 | log_page_id_time    |            1 | log_page      | A         |    17029905 |     NULL | NULL   | YES  | BTREE      |         |               |
| logging |          1 | log_page_id_time    |            2 | log_timestamp | A         |    54465452 |     NULL | NULL   |      | BTREE      |         |               |
| logging |          1 | log_user_text_time  |            1 | log_user_text | A         |    58882601 |       16 | NULL   |      | BTREE      |         |               |
| logging |          1 | log_user_text_time  |            2 | log_timestamp | A         |    58882601 |     NULL | NULL   |      | BTREE      |         |               |
| logging |          1 | log_title_time      |            1 | log_title     | A         |    58882601 |       16 | NULL   |      | BTREE      |         |               |
| logging |          1 | log_title_time      |            2 | log_timestamp | A         |    58882601 |     NULL | NULL   |      | BTREE      |         |               |
| logging |          1 | log_user_type_time  |            1 | log_user      | A         |    21696673 |     NULL | NULL   |      | BTREE      |         |               |
| logging |          1 | log_user_type_time  |            2 | log_type      | A         |    22461415 |     NULL | NULL   |      | BTREE      |         |               |
| logging |          1 | log_user_type_time  |            3 | log_timestamp | A         |    56579803 |     NULL | NULL   |      | BTREE      |         |               |
| logging |          1 | log_title_type_time |            1 | log_title     | A         |    58882601 |       16 | NULL   |      | BTREE      |         |               |
| logging |          1 | log_title_type_time |            2 | log_type      | A         |    58882601 |     NULL | NULL   |      | BTREE      |         |               |
| logging |          1 | log_title_type_time |            3 | log_timestamp | A         |    58882601 |     NULL | NULL   |      | BTREE      |         |               |
+---------+------------+---------------------+--------------+---------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
21 rows in set (0.00 sec)
chasemp triaged this task as Medium priority.Mar 30 2016, 6:14 PM
chasemp added projects: Cloud-Services, DBA.
Sigma updated the task description. (Show Details)Mar 30 2016, 6:15 PM
Sigma updated the task description. (Show Details)
Reedy added a subscriber: Reedy.Mar 30 2016, 6:16 PM

EXPLAIN and see what indexes that actual query is using?

Enwiki tables are being loaded currently to a) correct incorrect data b) populate the new server. During the imports (one table at a time), blocks and missing data can happen.

Sigma added a comment.Mar 30 2016, 6:17 PM

Details of the query in question:

MariaDB [enwiki_p]> select log_timestamp, log_action from logging where log_namespace=0 and log_title='Great_Nuclear_Doge_Event' and log_type='delete' and log_timestamp > '20160121065733';
+----------------+------------+
| log_timestamp  | log_action |
+----------------+------------+
| 20160121070000 | delete     |
+----------------+------------+
1 row in set (16.10 sec)
Sigma added a subscriber: Legoktm.Mar 30 2016, 6:20 PM

@Reedy I have included the explains that @Legoktm ran for me in the description.

Reedy renamed this task from The logging table on labs is broken to Querying the logging table on labs is slow.EditedMar 30 2016, 6:36 PM
Reedy lowered the priority of this task from Medium to Low.

Enwiki tables are being loaded currently to a) correct incorrect data b) populate the new server. During the imports (one table at a time), blocks and missing data can happen.

Based on the fact it's using indexes, it's just slow queries, potentially missing data, with ongoing maintenance, as above, I've retitled and lowered the priority

@jcrespo Do we have a bug we can block this against?

The maintenance is to correct several issues: T126946 T115517 T129432. The reimports are done in small chunks so it only affects a subset of the queries each time.

Sigma added a comment.Mar 31 2016, 3:56 PM

Is there an ETA for their completion? I notice that some of those listed tasks are months old.

Log table for enwiki has very recently finished importing. Do you still have issues?

Sigma added a comment.Mar 31 2016, 4:10 PM

MariaDB [enwiki_p]> select log_timestamp, log_action from logging where log_namespace=0 and log_title='Great_Nuclear_Doge_Event' and log_type='delete' and log_timestamp > '20160121065733';
+----------------+------------+

log_timestamplog_action

+----------------+------------+

20160121070000delete

+----------------+------------+
1 row in set (12.60 sec)

MariaDB [enwiki_p]> select log_timestamp, log_action from logging_userindex where log_namespace=0 and log_title='Great_Nuclear_Doge_Event' and log_type='delete' and log_timestamp > '20160121065733';
+----------------+------------+

log_timestamplog_action

+----------------+------------+

20160121070000delete

+----------------+------------+
1 row in set (12.02 sec)

jcrespo raised the priority of this task from Low to Medium.Mar 31 2016, 4:15 PM

So this is not directly caused by the import (it could still be a case of analytics being wrong due to the reimport). Let me play around a bit with the index stats.

jcrespo moved this task from Triage to In progress on the DBA board.Mar 31 2016, 4:16 PM
jcrespo lowered the priority of this task from Medium to Low.Mar 31 2016, 4:43 PM

This is not an issue with analytics either, it is due to the filtering. Querying the same table in production and labs has the same performance, it is when using the view that it is slow.

It would be nice if someone with proper time could give a patch for a different query or for a different view creation (without compromising private data) that would be faster to query (it is a very query-dependent issue rather than an infrastructure issue).

MariaDB PRODUCTION s1 localhost enwiki > select log_timestamp, log_action from logging where log_namespace=0 and log_title='Great_Nuclear_Doge_Event' and log_type='delete' and log_timestamp > '20160121065733';
+----------------+------------+
| log_timestamp  | log_action |
+----------------+------------+
| 20160121070000 | delete     |
+----------------+------------+
1 row in set (0.09 sec)

MariaDB LABS localhost enwiki > select log_timestamp, log_action from logging where log_namespace=0 and log_title='Great_Nuclear_Doge_Event' and log_type='delete' and log_timestamp > '20160121065733';
+----------------+------------+
| log_timestamp  | log_action |
+----------------+------------+
| 20160121070000 | delete     |
+----------------+------------+
1 row in set (0.13 sec)

MariaDB LABS localhost enwiki > use enwiki_p
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A


MariaDB LABS localhost enwiki_p > select log_timestamp, log_action from logging where log_namespace=0 and log_title='Great_Nuclear_Doge_Event' and log_type='delete' and log_timestamp > '20160121065733';
+----------------+------------+
| log_timestamp  | log_action |
+----------------+------------+
| 20160121070000 | delete     |
+----------------+------------+
1 row in set (8.78 sec)
Sigma added a comment.Apr 1 2016, 3:36 AM

So the view query isn't indexed or what?

I don't actually know how Labs works so that probably doesn't make sense, but I'm interested to know why everything is slow.

MySQL views cannot have indexes (they are not materialized), the underlying table have. However, the views also perform sanitization of private fields, which requires extra filtering and may impact on the actual query sent. Usually that is not a problem, except in certain cases. You can see the view definition with SHOW VIEW and see if there is a workaround. Sadly views make query optimization more difficult, but not impossible.

Sigma added a comment.Apr 11 2016, 7:15 PM

Hi @jcrespo

Thank you for your response.

I've looked over the view definition. I think this slowness could be circumvented by creating a new index log_type_deleted_title_time on (log_type, log_deleted, log_title, log_timestamp). The view is using log_deleted and log_type, so an index that starts with those two columns seems prudent. Perhaps it could be in a different order, if such would be more performant.

Let me know if this is possible.

Thanks,
Sigma

Volans added a subscriber: Volans.Apr 11 2016, 7:40 PM

@Sigma: FYI jcrespo is on vacation, he will be back at the end of the week.

Sigma added a comment.Apr 16 2016, 4:39 AM

Sounds reasonable. Perhaps he'd be able to take a look today.

I will try it on a single table before applying to to al wikis in a few days. However, next week is reserved for important maintenance, so please I would like to request patience from you.

Composite indices on (namespace, title) are quite common in MediaWiki's database schema. SELECT queries run on Wikimedia Labs from archive, archive_userindex, recentchanges, recentchanges_userindex, templatelinks, and page that specify an exact namespace and title return almost instantly.

Something is funky with the views for logging and logging_userindex specifically, it seems.

Any updates?

I will try it on a single table before applying to to al wikis in a few days. However, next week is reserved for important maintenance, so please I would like to request patience from you.

Did you end up trying this, @jcrespo?

jcrespo added a comment.EditedNov 15 2016, 2:53 PM

No, this is low priority and there is worse issues on labs going on right now that I am fixing, so I consider this a feature request rather than an ongoing issue (some queries are slow). I also classified it on "Help Welcome", meaning that people can provide tests and patches on their own to help speed up the process. I strongly believe it is possible to get an acceptable performance just by changing the query parameters.

Then there is the practical issues- creating an index with the current infrastructure on labs can take a full day, on which the server is lagged and causes annoyances on users; plus there is not a method in place to maintain schema drifts between production and labs other than removing private information.

Something is funky with the views for logging and logging_userindex specifically, it seems.

I rediscovered this issue over the weekend.

MariaDB [enwiki_p]>     SELECT
    ->       log_timestamp,
    ->       log_comment
    ->     FROM logging_userindex
    ->     WHERE log_namespace = 14
    ->     AND log_title = 'Handheld_PC'
    ->     LIMIT 1;
Empty set (29 min 59.64 sec)
MariaDB [enwiki_p]>     SELECT
    ->       log_timestamp,
    ->       log_comment
    ->     FROM logging          
    ->     WHERE log_namespace = 14
    ->     AND log_title = 'Handheld_PC'
    ->     LIMIT 1;
Empty set (28 min 45.00 sec)

I was trying to get a database report updated.

The task description makes me think that the type_time index is being preferred over the page_time index for some reason.

From https://github.com/wikimedia/mediawiki/blob/638becad/maintenance/tables.sql#L1295:

CREATE INDEX /*i*/type_time ON /*_*/logging (log_type, log_timestamp);
CREATE INDEX /*i*/page_time ON /*_*/logging (log_namespace, log_title, log_timestamp);

I don't believe views can force an index. Bleh.

MZMcBride added a subscriber: scfc.Dec 4 2016, 11:12 PM

I tried the trick mentioned by @scfc in T50875#2845764.

MariaDB [enwiki_p]> select log_timestamp, log_comment from logging_userindex where log_namespace = 14 and log_title = 'Handheld_PC' limit 1;
MariaDB [enwiki_p]> show processlist\G
*************************** 1. row ***************************
      Id: 35595091
    User: u2671
    Host: 10.68.23.58:53428
      db: enwiki_p
 Command: Query
    Time: 6
   State: Sending data
    Info: select log_timestamp, log_comment from logging_userindex where log_namespace = 14 and log_title = 'H
Progress: 0.000
*************************** 2. row ***************************
      Id: 35595635
    User: u2671
    Host: 10.68.23.58:53480
      db: enwiki_p
 Command: Query
    Time: 0
   State: init
    Info: show processlist
Progress: 0.000
2 rows in set (0.00 sec)
MariaDB [enwiki_p]> show explain for 35595091;
+------+-------------+---------+-------+---------------+-----------+---------+------+----------+---------------------------------+
| id   | select_type | table   | type  | possible_keys | key       | key_len | ref  | rows     | Extra                           |
+------+-------------+---------+-------+---------------+-----------+---------+------+----------+---------------------------------+
|    1 | SIMPLE      | logging | range | type_time     | type_time | 34      | NULL | 40059284 | Using where; Rowid-ordered scan |
+------+-------------+---------+-------+---------------+-----------+---------+------+----------+---------------------------------+
1 row in set, 1 warning (0.00 sec)

This is probably irrelevant, I'm just including it for completeness:

MariaDB [enwiki_p]> show warnings;
+-------+------+-------------------------------------------------------------------------------------------------------------------------+
| Level | Code | Message                                                                                                                 |
+-------+------+-------------------------------------------------------------------------------------------------------------------------+
| Note  | 1003 | select log_timestamp, log_comment from logging_userindex where log_namespace = 14 and log_title = 'Handheld_PC' limit 1 |
+-------+------+-------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

The "show explain" output seems to confirm that the query planner is selecting the wrong index (type_time). Is there a way to change the behavior of the query planner to use the correct index (in this case, page_time) when running select statements on the logging_userindex table/view? I don't think re-imports or new servers are going to fix this issue.

Is there a way to change the behavior of the query planner to use the correct index (in this case, page_time) when running select statements on the logging_userindex table/view?

There would be one, which is forcing the index on the view creation, but that could have unintended consequences. Needs some proof of concept/check that fixing that does not create a regression on other users. But it completely possible.