Page MenuHomePhabricator

SpecialMostRevisions::reallyDoQuery takes lots of hours to run on wikidata
Closed, ResolvedPublic

Description

The following query runs on the vslow host of wikidatawiki (which is good):

root@db1087.eqiad.wmnet[wikidatawiki]> explain SELECT /* SpecialMostRevisions::reallyDoQuery www-data@mwmain... */ page_namespace AS `namespace`,page_title AS `title`,COUNT(*) AS `value` FROM `revision`,`page` WHERE page_namespace IN ('640','146','0') AND (page_id = rev_page) AND (page_is_redirect = 0) GROUP BY page_namespace,page_title ORDER BY value DESC LIMIT 5000;
+------+-------------+----------+------+------------------------------------------------+-----------------------------+---------+---------------------------+----------+---------------------------------------------------------------------+
| id   | select_type | table    | type | possible_keys                                  | key                         | key_len | ref                       | rows     | Extra                                                               |
+------+-------------+----------+------+------------------------------------------------+-----------------------------+---------+---------------------------+----------+---------------------------------------------------------------------+
|    1 | SIMPLE      | page     | ref  | PRIMARY,name_title,page_redirect_namespace_len | page_redirect_namespace_len | 1       | const                     | 35629757 | Using index condition; Using where; Using temporary; Using filesort |
|    1 | SIMPLE      | revision | ref  | page_timestamp,page_user_timestamp,rev_page_id | rev_page_id                 | 4       | wikidatawiki.page.page_id |        8 | Using index                                                         |
+------+-------------+----------+------+------------------------------------------------+-----------------------------+---------+---------------------------+----------+---------------------------------------------------------------------+
2 rows in set (0.00 sec)

It has been running for 8h. It causes some spikes of lag: https://grafana.wikimedia.org/render/d-solo/000000273/mysql?panelId=6&orgId=1&var-dc=eqiad%20prometheus%2Fops&var-server=db1087&var-port=9104&from=1574654386713&to=1574675986713&width=1000&height=500&tz=Europe%2FMadrid

Which causes MW to complain (https://logstash.wikimedia.org/goto/64894c5953d862563733a4708e8f3da1):

 @timestamp	   	2019-11-25T10:00:36
t @version	   	1
t _id	   	AW6iAX06KWrIH1QR588P
t _index	   	logstash-mediawiki-2019.11.25
# _score	   	 - 
t _type	   	mediawiki
t channel	   	DBQuery
t db_name	   	 - 
t db_server	   	10.64.32.113
t db_user	   	wikiuser
t facility	   	user
t host	   	mw1338
t http_method	   	POST
t ip	   	10.64.32.50
t level	   	ERROR
t logsource	   	mw1338
t message	   	Timed out waiting for replication to reach 171970645-171970645-288070551,171978778-171978778-3298185533,171978924-171978924-759099546,180359242-180359242-170963125
t method	   	Wikimedia\Rdbms\DatabaseMysqlBase::masterPosWait
t mwversion	   	1.35.0-wmf.5
t normalized_message	   	Timed out waiting for replication to reach {raw_pos}
t phpversion	   	7.2.24-1+0~20191026.31+debian9~1.gbpbbacde+wmf1
t program	   	mediawiki
t raw_pos	   	[object] (Wikimedia\Rdbms\MySQLMasterPos: 171970645-171970645-288070551,171978778-171978778-3298185533,171978924-171978924-759099546,180359242-180359242-170963125)
t referrer	   	 - 
t reqId	   	6648b5800cd87944acdd8f7b
t server	   	jobrunner.discovery.wmnet
t severity	   	err
t shard	   	s8
t sql	   	SELECT MASTER_GTID_WAIT('171978924-171978924-759099546', 1)
t tags	   	input-kafka-rsyslog-udp-localhost, rsyslog-udp-localhost, kafka, es
t timeout	   	1
t timestamp	   	2019-11-25T10:00:36+00:00
t type	   	mediawiki
t url	   	/rpc/RunSingleJob.php
t wait_pos	   	171978924-171978924-759099546
t wiki	   	wikidatawiki

Details

Related Gerrit Patches:
operations/puppet : productionmediawiki: Disable mostrevisions for wikidata

Event Timeline

Restricted Application added a subscriber: Aklapper. · View Herald TranscriptNov 25 2019, 10:03 AM
Anomie added a subscriber: Anomie.Nov 25 2019, 2:29 PM

This seems to be much the same thing as T235265: Slow query on SpecialMostLinked creating lag on wikidata slaves, BTW, just with a different special page.

We can stop it for now. @Lydia_Pintscher ?

Yes.

Change 553339 had a related patch set uploaded (by Ladsgroup; owner: Ladsgroup):
[operations/puppet@production] mediawiki: Disable mostrevisions for wikidata

https://gerrit.wikimedia.org/r/553339

Restricted Application added a project: User-Ladsgroup. · View Herald TranscriptNov 27 2019, 1:32 PM
Maintenance_bot moved this task from Incoming to In progress on the User-Ladsgroup board.

Change 553339 merged by Marostegui:
[operations/puppet@production] mediawiki: Disable mostrevisions for wikidata

https://gerrit.wikimedia.org/r/553339

Addshore closed this task as Resolved.Dec 3 2019, 1:58 PM
Addshore added a subscriber: Addshore.

We should probably turn this back at some point.
Probably once T208425 is done.

We should probably turn this back at some point.

Alternatively, does any more favorable day exist when to run this query?

I had to kill this query, as it was hitting db1087 hard again and make it lag

Marostegui reopened this task as Open.Mar 11 2020, 8:44 AM

Looks like the cron is still there:

# Puppet Name: cron-updatequerypages-mostrevisions-s8@18
0 1 11,25 * * /usr/local/bin/mwscriptwikiset updateSpecialPages.php s8.dblist --override --only=Mostrevisions > /var/log/mediawiki/updateSpecialPages/s8@18-MostRevisions.log 2>&1

Mentioned in SAL (#wikimedia-operations) [2020-03-11T10:59:46Z] <marostegui> Remove Mostrevisions from mwmaint1002 T239072

Marostegui closed this task as Resolved.Mar 11 2020, 11:01 AM

As spoken on IRC: I have removed it manually from mwmaint1002 and run puppet to check it doesn't get added back.