Page MenuHomePhabricator

DatabaseMysqlBase::fieldExists check can trigger full table scan
Closed, InvalidPublic

Description

The DatabaseMysqlBase::fieldExists method that checks if a given table has a given field currently issues a SELECT * FROM table LIMIT 1 query (via DatabaseMysqlBase::fieldInfo).

This is disadvantageous for multiple reasons:

  • if the table is empty, this check will not work
  • otherwise, it will trigger a full table scan.

Event Timeline

A potential alternative would be to use a DESCRIBE query here, kind of like this.

DatabaseBase went away a while ago...

Yeah, that's my bad - but the query is still there via DatabaseMysqlBase.

TK-999 renamed this task from DatabaseBase::fieldExists check can trigger full table scan to DatabaseMysqlBase::fieldExists check can trigger full table scan.Nov 9 2017, 7:22 PM
TK-999 updated the task description. (Show Details)

Change 390296 had a related patch set uploaded (by TK-999; owner: TK-999):
[mediawiki/core@master] T180157: Use a DESCRIBE query to determine if a field exists in a MySQL DB table

https://gerrit.wikimedia.org/r/390296

otherwise, it will trigger a full table scan.

This is not true. Your first argument is a good enough reason though

On a MySQL 5.7 install:

>explain select * from revision limit 1;
+----+-------------+----------+------------+------+---------------+------+---------+------+--------+----------+-------+
| id | select_type | table    | partitions | type | possible_keys | key  | key_len | ref  | rows   | filtered | Extra |
+----+-------------+----------+------------+------+---------------+------+---------+------+--------+----------+-------+
|  1 | SIMPLE      | revision | NULL       | ALL  | NULL          | NULL | NULL    | NULL | 902737 |   100.00 | NULL  |
+----+-------------+----------+------------+------+---------------+------+---------+------+--------+----------+-------+
1 row in set, 1 warning (0.00 sec)

Ordering by primary key would reduce the amount of rows scanned but that is not a generic option

On a MySQL 5.7 install:

>explain select * from revision limit 1;
+----+-------------+----------+------------+------+---------------+------+---------+------+--------+----------+-------+
| id | select_type | table    | partitions | type | possible_keys | key  | key_len | ref  | rows   | filtered | Extra |
+----+-------------+----------+------------+------+---------------+------+---------+------+--------+----------+-------+
|  1 | SIMPLE      | revision | NULL       | ALL  | NULL          | NULL | NULL    | NULL | 902737 |   100.00 | NULL  |
+----+-------------+----------+------------+------+---------------+------+---------+------+--------+----------+-------+
1 row in set, 1 warning (0.00 sec)

Ordering by primary key would reduce the amount of rows scanned but that is not a generic option

EXPLAIN lies all the time. I think that just means, mysql will do a full table scan up until it finds a row, and then immediately stop (so not really a full table scan).

An easy way to test this is to just do the query. If mysql responds in less than a second, that's a pretty good sign its not doing a full table scan over a million rows.

To be more scientific about it, do

show status like "Handler_read%";select * from revision limit 1;show status like "Handler_read%";

And take the difference from the before and after values of the handler status, which will show how many rows mysql actually read.

Yeah, the actual query indeed stops at the first row. Thanks for clarifying :)

Change 390296 abandoned by TK-999:
Use a DESCRIBE query to check field existence in MySQL DB table

https://gerrit.wikimedia.org/r/390296

Fixed link in task description to be a permalink.

Krinkle removed a project: Patch-For-Review.

Change 390296 restored by TK-999:
Use a DESCRIBE query to check field existence in MySQL DB table

https://gerrit.wikimedia.org/r/390296

As I noted on Gerrit, in 2010 MaxSem replaced DESCRIBE with SELECT for simplicity. SELECT does work on an empty table. Arguably LIMIT 0 would be better than LIMIT 1. But it works either way.

Change 390296 abandoned by TK-999:

[mediawiki/core@master] Use a DESCRIBE query to check field existence in MySQL DB table

Reason:

Per earlier discussion, this patch is not really useful, so abandoning it.

https://gerrit.wikimedia.org/r/390296