In T121306#1947350, @Josve05a wrote:https://commons.wikimedia.org/wiki/Special:NewFiles?hidepatrolled=1&limit=50&offset=
A database query error has occurred. This may indicate a bug in the software. Function: IndexPager::buildQueryInfo (NewFilesPager) Error: 2013 Lost connection to MySQL server during query (10.64.32.26)
Description
Description
Details
Details
Subject | Repo | Branch | Lines +/- | |
---|---|---|---|---|
Special:NewFiles: Make 'hidepatrolled' query less slow | mediawiki/core | master | +8 -1 |
Status | Subtype | Assigned | Task | ||
---|---|---|---|---|---|
Open | None | T134802 Improve the curator workflow for reviewing new files | |||
Open | None | T120453 Copyright violation detection tool for Commons | |||
Resolved | None | T121870 Improve Special:NewFiles | |||
Resolved | matthiasmullie | T121865 Allow the same filtering on Special:NewFiles as on Special:NewPages (by tags, username) | |||
Resolved | matmarex | T124205 Error: 2013 Lost connection to MySQL server during query on NewFilesPager | |||
Open | None | T141765 Error: 2013 Lost connection to MySQL server during query on NewFilesPager (again) | |||
Duplicate | Feature | None | T31793 Check uploaded images with Google image search to find copyright violations | ||
Duplicate | None | T123517 Automatically check Commons uploads for possible copyright violations | |||
Duplicate | None | T230561 Create a ML model to score new files in commons for copyvio issues | |||
Resolved | Samwilson | T145165 Investigation: Copyvio tools for Commons | |||
Open | None | T132650 Copyright detection (acoustic fingerprint matching) for audio files |
Event Timeline
Comment Actions
This is new from rMWc53f647658ad: Allow patrol of uploads and only occurs when "Hide patrolled uploads" is checked. I've just noticed that the page can be pretty slow and sometimes times out, as it happened here.
The query is:
SELECT * FROM `image` LEFT JOIN `user_groups` ON (ug_group = 'bot' AND (ug_user = img_user)) INNER JOIN `recentchanges` ON ((rc_title = img_name) AND (rc_user = img_user) AND (rc_timestamp = img_timestamp)) WHERE (ug_group IS NULL) AND rc_type = '3' AND rc_log_type = 'upload' AND rc_patrolled = '0' ORDER BY img_timestamp DESC LIMIT 51
Explain:
+------+-------------+---------------+--------+-----------------------+---------+---------+-----------------------------------------+---------+----------------------------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+---------------+--------+-----------------------+---------+---------+-----------------------------------------+---------+----------------------------------------------+ | 1 | SIMPLE | recentchanges | ALL | rc_timestamp | NULL | NULL | NULL | 3980662 | Using where; Using temporary; Using filesort | | 1 | SIMPLE | image | eq_ref | PRIMARY,img_timestamp | PRIMARY | 257 | commonswiki.recentchanges.rc_title | 1 | Using where | | 1 | SIMPLE | user_groups | eq_ref | PRIMARY,ug_group | PRIMARY | 261 | commonswiki.recentchanges.rc_user,const | 1 | Using where; Using index; Not exists | +------+-------------+---------------+--------+-----------------------+---------+---------+-----------------------------------------+---------+----------------------------------------------+
Comment Actions
I fiddled with it a bit, and apparently just changing the ORDER BY field to rc_timestamp makes it magically fast. The values of that and img_timestamp are the same.
SELECT * FROM `image` LEFT JOIN `user_groups` ON (ug_group = 'bot' AND (ug_user = img_user)) INNER JOIN `recentchanges` ON ((rc_title = img_name) AND (rc_user = img_user) AND (rc_timestamp = img_timestamp)) WHERE (ug_group IS NULL) AND rc_type = '3' AND rc_log_type = 'upload' AND rc_patrolled = '0' ORDER BY rc_timestamp DESC LIMIT 51
+------+-------------+---------------+--------+-----------------------+--------------+---------+-----------------------------------------+------+--------------------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+---------------+--------+-----------------------+--------------+---------+-----------------------------------------+------+--------------------------------------+ | 1 | SIMPLE | recentchanges | index | rc_timestamp | rc_timestamp | 16 | NULL | 51 | Using where | | 1 | SIMPLE | image | eq_ref | PRIMARY,img_timestamp | PRIMARY | 257 | commonswiki.recentchanges.rc_title | 1 | Using where | | 1 | SIMPLE | user_groups | eq_ref | PRIMARY,ug_group | PRIMARY | 261 | commonswiki.recentchanges.rc_user,const | 1 | Using where; Using index; Not exists | +------+-------------+---------------+--------+-----------------------+--------------+---------+-----------------------------------------+------+--------------------------------------+
Also, since we're using rc_title in the query, we should probably specify rc_namespace too (which is going to be 6, NS_FILE).
I'm pretty sure that query shouldn't even need to query the image table, just recentchanges…
Comment Actions
Change 269356 had a related patch set uploaded (by Bartosz Dziewoński):
Special:NewFiles: Make 'hidepatrolled' query less slow
Comment Actions
Thanks for this. This changes are tricky because this kind of special pages have lots of filters and orders.
Joining to image makes sure that there is an actual image uploaded. I wonder if there is would be cases where something it is in the uploaded log but not in the image (e.g. image has been deleted). I agree that if that wasn't a problem, dropping image in this specific case and using the namespace would be wise. But we should do an extensive testing of both queries, on serveral wikis and several servers.
Comment Actions
Sadly, I have found several instances where the new query is actually slower than the original one:
Original:
51 rows in set (0.36 sec)
New:
51 rows in set (2 min 14.38 sec)
MariaDB DBSTORE localhost enwiki > EXPLAIN SELECT * FROM `image` LEFT JOIN `user_groups` ON (ug_group = 'bot' AND (ug_user = img_user)) INNER JOIN `recentchanges` ON ((rc_title = img_name) AND (rc_user = img_user) AND (rc_timestamp = img_timestamp)) WHERE (ug_group IS NULL) AND rc_type = '3' AND rc_log_type = 'upload' AND rc_patrolled = '0' ORDER BY rc_timestamp DESC LIMIT 51; +------+-------------+---------------+--------+-----------------------+--------------+---------+-----------------------------+--------+-------------- | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra +------+-------------+---------------+--------+-----------------------+--------------+---------+-----------------------------+--------+-------------- | 1 | SIMPLE | image | ALL | PRIMARY,img_timestamp | NULL | NULL | NULL | 911549 | Using tempora | 1 | SIMPLE | user_groups | eq_ref | PRIMARY,ug_group | PRIMARY | 261 | enwiki.image.img_user,const | 1 | Using where; | 1 | SIMPLE | recentchanges | ref | rc_timestamp | rc_timestamp | 16 | enwiki.image.img_timestamp | 2 | Using where +------+-------------+---------------+--------+-----------------------+--------------+---------+-----------------------------+--------+-------------- 3 rows in set (0.14 sec) MariaDB DBSTORE localhost enwiki > EXPLAIN SELECT * FROM `image` LEFT JOIN `user_groups` ON (ug_group = 'bot' AND (ug_user = img_user)) INNER JOIN `recentchanges` ON ((rc_title = img_name) AND (rc_user = img_user) AND (rc_timestamp = img_timestamp)) WHERE (ug_group IS NULL) AND rc_type = '3' AND rc_log_type = 'upload' AND rc_patrolled = '0' ORDER BY img_timestamp DESC LIMIT 51; +------+-------------+---------------+--------+-----------------------+---------------+---------+-----------------------------+------+--------------- | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra +------+-------------+---------------+--------+-----------------------+---------------+---------+-----------------------------+------+--------------- | 1 | SIMPLE | image | index | PRIMARY,img_timestamp | img_timestamp | 18 | NULL | 25 | | 1 | SIMPLE | user_groups | eq_ref | PRIMARY,ug_group | PRIMARY | 261 | enwiki.image.img_user,const | 1 | Using where; U | 1 | SIMPLE | recentchanges | ref | rc_timestamp | rc_timestamp | 16 | enwiki.image.img_timestamp | 2 | Using where +------+-------------+---------------+--------+-----------------------+---------------+---------+-----------------------------+------+--------------- 3 rows in set (0.00 sec)
Comment Actions
The query worked fine and was fast up until now. What could explain that this no longer works ?
Maybe we should try that.
Comment Actions
The query worked fine and was fast up until now. What could explain that this no longer works ?
The number 1 reason usually is "the data changed". The optimizer bases its queries on the index and internal heuristical statistics. If the data grows considerably, those numbers can change. That is why sometimes a query works well on the English Wikipedia but not on smaller wikis, and viceversa, even on the same code. See, for example, that in a couple of tests on production, I am still getting the old (good) behavior.
I am also regularly upgrading/restarting the databases, which can affect slightly the optimizer decisions and the generation of those stats. Usually very small changes, but that can be very visible in a high-thoughput environment.
Comment Actions
My EXPLAINs were done on commonswiki… are you saying the tables in the two databases (commonswiki and enwiki) have different indexes? Or is MySQL just deciding to execute them differently because the data is different?
Comment Actions
These are the differences, some already tracked and pending to be fixed, some maintenance that is low priority:
$ mysql -h s4-master commonswiki -e "SHOW CREATE TABLE user_groups\G SHOW CREATE TABLE recentchanges\G SHOW CREATE TABLE image\G" > ~/commons.sql $ mysql -h s1-master enwiki -e "SHOW CREATE TABLE user_groups\G SHOW CREATE TABLE recentchanges\G SHOW CREATE TABLE image\G" > ~/enwiki.sql $ diff ~/commons.sql ~/enwiki.sql 38c38 < `rc_params` blob, --- > `rc_params` blob NOT NULL, 46a47 > KEY `tmp_1` (`rc_this_oldid`), 49c50 < ) ENGINE=InnoDB AUTO_INCREMENT=195466880 DEFAULT CHARSET=binary --- > ) ENGINE=InnoDB AUTO_INCREMENT=795613037 DEFAULT CHARSET=binary 61c62 < `img_minor_mime` varbinary(32) NOT NULL DEFAULT 'unknown', --- > `img_minor_mime` varbinary(100) NOT NULL DEFAULT 'unknown',
So the answer is "is MySQL just deciding to execute them differently because the data is different?". In particular, the size of recentchanges vs. image.
Commons needing special treatment for images should not be surprising. Now the question is how to implement that in the cleanest way possible (data-dependent code). Until now there is some things already implemented in terms of server-side code rewrite (which I do not like, because it is "hidden") and slaves with special partitioning.
There is one last thing I would like to try before that, which is regenerating the table statistics.
Comment Actions
The query works fine at entwiki, I don't think it ever failed like at commons. A big difference in data is that at commons the ratio of autopatrolled uploads is much higher. At enwiki, most recent uploads are unpatrolled but at commons it's the opposite.
I've tried it again a few times at commons and it worked but was very slow.
Maybe a very large number of files were recently uploaded by one or several autopatrolled users, which badly impacted the query for unpatrolled files.
Comment Actions
After a lot of reading docs and StackOverflow…
Both queries can be made fast is MySQL is forced to first query the table from which the ORDER BY column is. (Note how both my commonswiki examples query recentchanges first and image later (slow with img_timestamp, fast with rc_timestamp), and both of your enwiki examples query image first and recentchanges later (fast with img_timestamp, slow with rc_timestamp).
So here's the original query with a STRAIGHT_JOIN, thanks to StackOverflow: http://stackoverflow.com/questions/3456134/is-there-a-way-to-force-mysql-execution-order
mysql:research@analytics-store.eqiad.wmnet [enwiki]> explain SELECT * FROM `image` LEFT JOIN `user_groups` ON (ug_group = 'bot' AND (ug_user = img_user)) STRAIGHT_JOIN `recentchanges` ON ((rc_title = img_name) AND (rc_user = img_user) AND (rc_timestamp = img_timestamp)) WHERE (ug_group IS NULL) AND rc_type = '3' AND rc_log_type = 'upload' AND rc_patroll ed = '0' ORDER BY img_timestamp DESC LIMIT 51; +------+-------------+---------------+--------+-----------------------+---------------+---------+-----------------------------+------+--------------------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+---------------+--------+-----------------------+---------------+---------+-----------------------------+------+--------------------------------------+ | 1 | SIMPLE | image | index | PRIMARY,img_timestamp | img_timestamp | 18 | NULL | 51 | | | 1 | SIMPLE | user_groups | eq_ref | PRIMARY,ug_group | PRIMARY | 261 | enwiki.image.img_user,const | 1 | Using where; Using index; Not exists | | 1 | SIMPLE | recentchanges | ref | rc_timestamp | rc_timestamp | 16 | enwiki.image.img_timestamp | 1 | Using where | +------+-------------+---------------+--------+-----------------------+---------------+---------+-----------------------------+------+--------------------------------------+ 3 rows in set (0.00 sec) mysql:research@analytics-store.eqiad.wmnet [commonswiki]> explain SELECT * FROM `image` LEFT JOIN `user_groups` ON (ug_ group = 'bot' AND (ug_user = img_user)) STRAIGHT_JOIN `recentchanges` ON ((rc_title = img_name) AND (rc_user = img_user ) AND (rc_timestamp = img_timestamp)) WHERE (ug_group IS NULL) AND rc_type = '3' AND rc_log_type = 'upload' AND rc_pa trolled = '0' ORDER BY img_timestamp DESC LIMIT 51; +------+-------------+---------------+--------+-----------------------+---------------+---------+----------------------------------+------+--------------------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+---------------+--------+-----------------------+---------------+---------+----------------------------------+------+--------------------------------------+ | 1 | SIMPLE | image | index | PRIMARY,img_timestamp | img_timestamp | 18 | NULL | 51 | | | 1 | SIMPLE | user_groups | eq_ref | PRIMARY,ug_group | PRIMARY | 261 | commonswiki.image.img_user,const | 1 | Using where; Using index; Not exists | | 1 | SIMPLE | recentchanges | ref | rc_timestamp | rc_timestamp | 16 | commonswiki.image.img_timestamp | 1 | Using where | +------+-------------+---------------+--------+-----------------------+---------------+---------+----------------------------------+------+--------------------------------------+ 3 rows in set (0.00 sec) mysql:research@analytics-store.eqiad.wmnet [enwiki]> SELECT * FROM `image` LEFT JOIN `user_groups` ON (ug_group = 'bot' AND (ug_user = img_user)) STRAIGHT_JOIN `recentchanges` ON ((rc_title = img_name) AND (rc_user = img_user) AND (rc_tim estamp = img_timestamp)) WHERE (ug_group IS NULL) AND rc_type = '3' AND rc_log_type = 'upload' AND rc_patrolled = '0' ORDER BY img_timestamp DESC LIMIT 51; 51 rows in set (2.48 sec) mysql:research@analytics-store.eqiad.wmnet [commonswiki]> SELECT * FROM `image` LEFT JOIN `user_groups` ON (ug_group = 'bot' AND (ug_user = img_user)) STRAIGHT_JOIN `recentchanges` ON ((rc_title = img_name) AND (rc_user = img_user) AND (r c_timestamp = img_timestamp)) WHERE (ug_group IS NULL) AND rc_type = '3' AND rc_log_type = 'upload' AND rc_patrolled = '0' ORDER BY img_timestamp DESC LIMIT 51; 51 rows in set (1.08 sec)
Comment Actions
STRAIGHT_JOIN like FORCE_INDEX should be avoided in general. While I approved doing that on those cases, it was because there was no other way. Please let me attempt what I mentioned before. See my comment on T113901:
"to see if we can avoid the STRAIGHT_JOIN (because it may bite us in the future, see T101502)."
If it doesn't work, it is probably the same bug as T113901 and we should just apply it, but we should track all of this with #optimizer-workaround or something similar here, and heavily comment the code referring to this bug.
Comment Actions
Regenerating stats does not work, this is probably https://mariadb.atlassian.net/browse/MDEV-8880 Go ahead with the STRAIGHT_JOIN.
Comment Actions
Right, I realize it kind of sucks and I have very limited knowledge of MySQL internals. I'm just trying to solve the immediate problem of code that I merged and that I'm supposed to support generating queries slow enough to routinely time out, and hoping someone like you will step in to turn that into something maintainable later ;) Can you merge the patch?
Comment Actions
Also, I think part of the problem might be using * in the select field. In the worst case, that can select up to 800 mb of data (img_metadata sucks)
Comment Actions
I left a comment on T11501 before reading this - I think maybe we should try to use the new_name_timestamp index - since we know that rc_new = 0 for log actions always, rc_namespace = 6 always, and then we're sorting by rc_timestamp
Comment Actions
Change 269356 merged by jenkins-bot:
Special:NewFiles: Make 'hidepatrolled' query less slow
Comment Actions
So alternatively, I think doing a query like:
SELECT /* IndexPager::buildQueryInfo (NewFilesPager) Bawolff */ img_name,img_user,img_timestamp FROM `recentchanges` FORCE INDEX( new_name_timestamp) INNER JOIN `image` ON ((rc_title = img_name) ) WHERE rc_bot = 0 AND rc_type = '3' AND rc_log_type = 'upload' AND rc_patrolled = '0' AND rc_namespace = '6' and rc_new = 0 ORDER BY rc_timestamp DESC LIMIT 51;
might be more optimal. But not 100% sure. (I mean, if the recently merged patch doesn't work out)