In roughly the same vein as T204606, T204871, and T206878, some actions on Wikidata, including granting user rights (such as would be possible using this link), viewing logs, and protecting pages (such as would be possible using this link), cannot be done at present because, in each case, an error page is returned which states that the "entire web request took longer than 60 seconds and timed out".
Description
Details
Status | Subtype | Assigned | Task | ||
---|---|---|---|---|---|
Declined | PRODUCTION ERROR | None | T204871 Investigate the spikes of "web request took longer than 60 seconds and timed out" during deployments on HHVM | ||
Resolved | PRODUCTION ERROR | Addshore | T207313 Some administrative and log actions on Wikidata take longer than 60 seconds and time out |
Event Timeline
(Triaging as "Not new" from a production branch perspective (the timeout configuration is separate from that). I'll let the Wikidata team triage and decide how to handle the issue itself.)
I also noticed this when trying to load deleted items (possibly because it shows log information on the page?). It seems to have suddenly started happening in the last day or two. I wasn't having any problems before that.
Not sure if this is related, but this page reports a nearly 7-hour replag (as of this comment) for Wikidata's shard (which strangely is not what Special:DispatchStats is reporting).
All of the links provided seem to result in:
PHP Fatal Error from line 46 of /srv/mediawiki/php-1.32.0-wmf.26/includes/libs/rdbms/database/DatabaseMysqli.php: entire web request took longer than 60 seconds and timed out
They also all seem to have some relation to the log table and the IndexPager and LogPager classes in core.
For example this request https://www.wikidata.org/wiki/Special:Log/addshore result in this sctack:
#0 /srv/mediawiki/php-1.32.0-wmf.26/includes/libs/rdbms/database/DatabaseMysqli.php(46): mysqli::query() #1 /srv/mediawiki/php-1.32.0-wmf.26/includes/libs/rdbms/database/Database.php(1260): Wikimedia\Rdbms\DatabaseMysqli->doQuery(string) #2 /srv/mediawiki/php-1.32.0-wmf.26/includes/libs/rdbms/database/Database.php(1183): Wikimedia\Rdbms\Database->doProfiledQuery(string, string, boolean, string) #3 /srv/mediawiki/php-1.32.0-wmf.26/includes/libs/rdbms/database/Database.php(1693): Wikimedia\Rdbms\Database->query(string, string) #4 /srv/mediawiki/php-1.32.0-wmf.26/includes/pager/IndexPager.php(369): Wikimedia\Rdbms\Database->select(array, array, array, string, array, array) #5 /srv/mediawiki/php-1.32.0-wmf.26/includes/pager/IndexPager.php(226): IndexPager->reallyDoQuery(string, integer, boolean) #6 /srv/mediawiki/php-1.32.0-wmf.26/includes/logging/LogPager.php(442): IndexPager->doQuery() #7 /srv/mediawiki/php-1.32.0-wmf.26/includes/pager/IndexPager.php(423): LogPager->doQuery() #8 /srv/mediawiki/php-1.32.0-wmf.26/includes/specials/SpecialLog.php(248): IndexPager->getBody() #9 /srv/mediawiki/php-1.32.0-wmf.26/includes/specials/SpecialLog.php(137): SpecialLog->show(FormOptions, array) #10 /srv/mediawiki/php-1.32.0-wmf.26/includes/specialpage/SpecialPage.php(569): SpecialLog->execute(string) #11 /srv/mediawiki/php-1.32.0-wmf.26/includes/specialpage/SpecialPageFactory.php(568): SpecialPage->run(string) #12 /srv/mediawiki/php-1.32.0-wmf.26/includes/MediaWiki.php(288): MediaWiki\Special\SpecialPageFactory->executePath(Title, RequestContext) #13 /srv/mediawiki/php-1.32.0-wmf.26/includes/MediaWiki.php(860): MediaWiki->performRequest() #14 /srv/mediawiki/php-1.32.0-wmf.26/includes/MediaWiki.php(517): MediaWiki->main() #15 /srv/mediawiki/php-1.32.0-wmf.26/index.php(42): MediaWiki->run() #16 /srv/mediawiki/w/index.php(3): include(string) #17 {main}
It seems running the request with profiling turned on never reports anything to https://performance.wikimedia.org/xhgui/, perhaps because the request stops? :(
I don't think this actually relates to T204871, other than the fact that it is about requests timing out.
Change 468547 had a related patch set uploaded (by Ladsgroup; owner: Ladsgroup):
[operations/mediawiki-config@master] Revert back wikidata for change_tag backend
The very likely cause is T194164: Start reading from change_tag_def in production. We are going to roll this back for wikidata which means tags made between Wed, Oct 17, 1:40 PM until the time of deploy won't be usable after the rollback, but it's temporarily measure until we fix the underlying issue and turn reading from the new column back on, no data is lost, it's just stored in a new format. If such cases is happening for other wikis, feel free to revert them too.
Change 468547 merged by jenkins-bot:
[operations/mediawiki-config@master] Revert back wikidata for change_tag backend
Mentioned in SAL (#wikimedia-operations) [2018-10-19T11:33:09Z] <addshore@deploy1001> Synchronized wmf-config/InitialiseSettings.php: T207313 UBN - Revert back wikidata for change_tag backend (duration: 00m 59s)
All of the example pages linked in the description now load again.
Changing prio to High to avoid this being listed in the lists of UBNs
Moving to announce now as T207313#4680020 may need announcing?
Change 468980 had a related patch set uploaded (by Ladsgroup; owner: Ladsgroup):
[mediawiki/core@master] Fix bad join on ChangeTag subquery
Looking at tendril and running the changed sql queries on prod, my patch should fix it.
The old query was like this:
SELECT /* IndexPager::buildQueryInfo (LogPager) */ log_id, log_type, log_action, log_timestamp, log_namespace, log_title, log_params, log_deleted, user_id, user_name, user_editcount, COALESCE( comment_log_comment.comment_text, log_comment ) AS `log_comment_text`, comment_log_comment.comment_data AS `log_comment_data`, comment_log_comment.comment_id AS `log_comment_cid`, log_user, log_user_text, NULL AS `log_actor`, (SELECT GROUP_CONCAT(ct_tag SEPARATOR ', ') FROM `change_tag` WHERE (ct_log_id=log_id) ) AS `ts_tags` FROM `logging` LEFT JOIN `comment` `comment_log_comment` ON ((comment_log_comment.comment_id = log_comment_id)) LEFT JOIN `user` ON ((user_id=log_user)) WHERE (log_action != 'revision') AND (log_action != 'flow-delete-post') AND (log_action != 'flow-delete-topic') AND (log_action != 'flow-restore-post') AND (log_action != 'flow-restore-topic') AND (log_type NOT IN ('spamblacklist', 'titleblacklist', 'abusefilterprivatedetails', 'suppress')) AND log_type IN ('delete', 'move', 'protect') AND log_namespace = '0' AND log_title = 'Q56399360' AND ((log_deleted & 1) = 0) ORDER BY log_timestamp DESC LIMIT 11;
Which is now changed to:
SELECT /* IndexPager::buildQueryInfo (LogPager) */ log_id, log_type, log_action, log_timestamp, log_namespace, log_title, log_params, log_deleted, user_id, user_name, user_editcount, COALESCE( comment_log_comment.comment_text, log_comment ) AS `log_comment_text`, comment_log_comment.comment_data AS `log_comment_data`, comment_log_comment.comment_id AS `log_comment_cid`, log_user, log_user_text, NULL AS `log_actor`, (SELECT GROUP_CONCAT(ctd_name SEPARATOR ', ') FROM `change_tag`, `change_tag_def` WHERE (ct_log_id=log_id) AND (ct_tag_id=ctd_id) ) AS `ts_tags` FROM `logging` LEFT JOIN `comment` `comment_log_comment` ON ((comment_log_comment.comment_id = log_comment_id)) LEFT JOIN `user` ON ((user_id=log_user)) WHERE (log_action != 'revision') AND (log_action != 'flow-delete-post') AND (log_action != 'flow-delete-topic') AND (log_action != 'flow-restore-post') AND (log_action != 'flow-restore-topic') AND (log_type NOT IN ('spamblacklist', 'titleblacklist', 'abusefilterprivatedetails', 'suppress')) AND log_type IN ('delete', 'move', 'protect') AND log_namespace = '0' AND log_title = 'Q56399360' AND ((log_deleted & 1) = 0) ORDER BY log_timestamp DESC LIMIT 11;
but MySQL is pretty dumb in optimizing subqeuries (SELECT GROUP_CONCAT(ctd_name SEPARATOR ', ') FROM change_tag, change_tag_def WHERE (ct_log_id=log_id) AND (ct_tag_id=ctd_id) ) So I changed it to SELECT GROUP_CONCAT(ctd_name SEPARATOR ',') FROM change_tag INNER JOIN change_tag_def ON ((ct_tag_id=ctd_id)) WHERE ct_rc_id=log_id which is pretty performant (Ran it on prod, works fine)
Change 468980 merged by jenkins-bot:
[mediawiki/core@master] Fix bad join on ChangeTag subquery
Change 469114 had a related patch set uploaded (by Ladsgroup; owner: Ladsgroup):
[mediawiki/core@wmf/1.32.0-wmf.26] Fix bad join on ChangeTag subquery
Change 469114 merged by jenkins-bot:
[mediawiki/core@wmf/1.32.0-wmf.26] Fix bad join on ChangeTag subquery
Mentioned in SAL (#wikimedia-operations) [2018-10-22T23:59:03Z] <ladsgroup@deploy1001> Synchronized php-1.32.0-wmf.26/includes/changetags/ChangeTags.php: SWAT: [[gerrit:469114|Fix bad join on ChangeTag subquery (T207313)]] (duration: 00m 47s)