Page MenuHomePhabricator

Special:Log on commons -- entire web request took longer than 60 seconds and timed out
Closed, ResolvedPublicPRODUCTION ERROR

Description

https://commons.wikimedia.org/wiki/Special:Log errors with "PHP fatal error: entire web request took longer than 60 seconds and timed out"

Event Timeline

CDanis triaged this task as High priority.Apr 19 2019, 2:48 PM

Looks like database being slow? Pretty sure this is a MW API call backing the pageload of Special:Log on commonswiki.

#0 /srv/mediawiki/php-1.34.0-wmf.1/includes/libs/rdbms/database/DatabaseMysqli.php(46): mysqli::query()
#1 /srv/mediawiki/php-1.34.0-wmf.1/includes/libs/rdbms/database/Database.php(1322): Wikimedia\Rdbms\DatabaseMysqli->doQuery(string)
#2 /srv/mediawiki/php-1.34.0-wmf.1/includes/libs/rdbms/database/Database.php(1238): Wikimedia\Rdbms\Database->attemptQuery(string, string, boolean, string)
#3 /srv/mediawiki/php-1.34.0-wmf.1/includes/libs/rdbms/database/Database.php(1784): Wikimedia\Rdbms\Database->query(string, string)
#4 /srv/mediawiki/php-1.34.0-wmf.1/includes/api/ApiQueryBase.php(400): Wikimedia\Rdbms\Database->select(array, array, array, string, array, array)
#5 /srv/mediawiki/php-1.34.0-wmf.1/includes/api/ApiQueryLogEvents.php(242): ApiQueryBase->select(string)
#6 /srv/mediawiki/php-1.34.0-wmf.1/includes/api/ApiQuery.php(249): ApiQueryLogEvents->execute()
#7 /srv/mediawiki/php-1.34.0-wmf.1/includes/api/ApiMain.php(1593): ApiQuery->execute()
#8 /srv/mediawiki/php-1.34.0-wmf.1/includes/api/ApiMain.php(531): ApiMain->executeAction()
#9 /srv/mediawiki/php-1.34.0-wmf.1/includes/api/ApiMain.php(502): ApiMain->executeActionWithErrorHandling()
#10 /srv/mediawiki/php-1.34.0-wmf.1/api.php(87): ApiMain->execute()
#11 /srv/mediawiki/w/api.php(3): include(string)
#12 {main}

https://logstash.wikimedia.org/app/kibana#/doc/logstash-*/logstash-2019.04.19/mediawiki?id=AWo2EgrP3aPtkd7PgRkl&_g=()

Framawiki raised the priority of this task from High to Unbreak Now!.Apr 20 2019, 9:41 AM

This is the query I believe - looks like the optimizer is being dumb again:

SELECT /* IndexPager::buildQueryInfo (LogPager)  */  log_id,log_type,log_action,log_timestamp,log_namespace,log_title,log_params,log_deleted,user_id,user_name,user_editcount,comment_log_comment.comment_text AS `log_comment_text`,comment_log_comment.comment_data AS `log_comment_data`,comment_log_comment.comment_id AS `log_comment_cid`,actor_log_user.actor_user AS `log_user`,actor_log_user.actor_name AS `log_user_text`,log_actor,(SELECT  GROUP_CONCAT(ctd_name SEPARATOR ',')  FROM `change_tag` JOIN `change_tag_def` ON ((ct_tag_id=ctd_id))   WHERE ct_log_id=log_id  ) AS `ts_tags`  FROM `logging` JOIN `comment` `comment_log_comment` ON ((comment_log_comment.comment_id = log_comment_id)) JOIN `actor` `actor_log_user` ON ((actor_log_user.actor_id = log_actor)) LEFT JOIN `user` ON ((user_id=actor_log_user.actor_user))   WHERE (log_type NOT IN ('spamblacklist','titleblacklist','abusefilterprivatedetails','oath','suppress')) AND (log_type != 'thanks') AND (log_type != 'patrol') AND (log_type != 'tag')  ORDER BY log_timestamp DESC LIMIT 51


+------+--------------------+---------------------+--------+----------------------------------------------------------+-----------------------+---------+---------------------------------------+---------+---------------------------------+
| id   | select_type        | table               | type   | possible_keys                                            | key                   | key_len | ref                                   | rows    | Extra                           |
+------+--------------------+---------------------+--------+----------------------------------------------------------+-----------------------+---------+---------------------------------------+---------+---------------------------------+
|    1 | PRIMARY            | actor_log_user      | ALL    | PRIMARY                                                  | NULL                  | NULL    | NULL                                  | 7989573 | Using temporary; Using filesort |
|    1 | PRIMARY            | user                | eq_ref | PRIMARY                                                  | PRIMARY               | 4       | commonswiki.actor_log_user.actor_user |       1 | Using where                     |
|    1 | PRIMARY            | logging             | ref    | type_time,actor_time,log_actor_type_time,log_type_action | actor_time            | 8       | commonswiki.actor_log_user.actor_id   |       2 | Using where                     |
|    1 | PRIMARY            | comment_log_comment | eq_ref | PRIMARY                                                  | PRIMARY               | 8       | commonswiki.logging.log_comment_id    |       1 |                                 |
|    2 | DEPENDENT SUBQUERY | change_tag          | ref    | change_tag_log_tag_id,change_tag_tag_id_id               | change_tag_log_tag_id | 5       | commonswiki.logging.log_id            |       1 | Using index                     |
|    2 | DEPENDENT SUBQUERY | change_tag_def      | eq_ref | PRIMARY                                                  | PRIMARY               | 4       | commonswiki.change_tag.ct_tag_id      |       1 |                                 |
+------+--------------------+---------------------+--------+----------------------------------------------------------+-----------------------+---------+---------------------------------------+---------+---------------------------------+
6 rows in set, 1 warning (0.00 sec)

This query is being killed by the query killer as it takes quite long (I killed the query after having it running for more than 10 minutes without finishing):

1074873991	root	10.64.32.25:34586	commonswiki	Query	793	Sending data	SELECT   log_id,log_type,log_action,log_timestamp,log_namespace,log_title,log_params,log_deleted,use	0.000

@Framawiki It sounds like the same issue but I'm not totally sure.

Should T221380 be merged or made a subtask? It seems like the same issue, but I wouldn't be able to know.

Let's keep them separated for now, my bet is they are the same underlying issue, but the effects (Special:Log vs Special:Contributions), are different and they may need different resolutions (e.g. different query hints or indexes).

Change 504391 had a related patch set uploaded (by Anomie; owner: Anomie):
[mediawiki/core@master] API: Add STRAIGHT_JOIN to ApiQueryLogEvents and LogPager to avoid planner oddness

https://gerrit.wikimedia.org/r/504391

This is now waiting on reviewers for the patchset at T221458#5130519
https://gerrit.wikimedia.org/r/c/mediawiki/core/+/504391
@aaron @Reedy @Ladsgroup @Krinkle @tstarling maybe can help taking a look and see if it can be merged?

Change 504391 merged by jenkins-bot:
[mediawiki/core@master] Add STRAIGHT_JOIN to ApiQueryLogEvents and LogPager to avoid planner oddness

https://gerrit.wikimedia.org/r/504391

EvanProdromou subscribed.

So, it looks like this ticket is done, and we're just waiting for it to go out with the train?

WDoranWMF claimed this task.
mmodell changed the subtype of this task from "Task" to "Production Error".Aug 28 2019, 11:07 PM