
Wikidata: Special:Contributions times out for many high-activity users
Closed, ResolvedPublic

Description

I am trying to view KrBot's contributions on Wikidata. I can get as far back as the last 5,000, which is the maximum number of contributions that can be shown on one page. However, loading the 5,001st contribution results in a timeout. Surprisingly, loading only the 2nd contribution also results in a timeout, even though this is not an issue if the most recent edit is at the top of the list. Using an offset date in 2009 (before Wikidata's creation) also results in a timeout for KrBot.

This is an issue for users with as few as 160,000 edits to Wikidata. For comparison, on the English Wikipedia, this is not an issue even for users such as Koavf (approximately 2,020,930 edits).

Event Timeline

Jc86035 created this task.Apr 18 2019, 3:20 PM
Restricted Application added a subscriber: Aklapper. · View Herald TranscriptApr 18 2019, 3:20 PM
Jc86035 triaged this task as Unbreak Now! priority.EditedApr 18 2019, 3:53 PM

I'm upgrading the priority, since this seems to be fairly fundamental functionality and shouldn't be resulting in a timeout.

However, KrBot has made approximately 28,263,132 edits, which is too many edits for https://xtools.wmflabs.org/sc to count. Timeouts for limit=1 requests – with the offset value being today – also happen for several other Wikidata users, including Jura1 (approximately 10,466,913 edits), DeltaBot (964,929 edits), Moebeus (605,322), Valentina.Anitnelav (459,982) and Pigsonthewing (164,164). This does not happen for Andrew Gray (137,862), me (100,541) or Lea Lacroix (WMDE) (4,145). I would expect, at least, that queries for non-bot accounts don't result in timeouts on Special:Contributions.

Restricted Application added subscribers: Liuxinyu970226, TerraCodes. · View Herald TranscriptApr 18 2019, 3:53 PM
Jc86035 updated the task description. (Show Details)Apr 18 2019, 3:54 PM
Jc86035 updated the task description. (Show Details)
Jc86035 updated the task description. (Show Details)Apr 18 2019, 3:57 PM
Jc86035 renamed this task from Special:Contributions "entire web request took longer than 60 seconds and timed out" to Wikidata: Special:Contributions "entire web request took longer than 60 seconds and timed out".Apr 18 2019, 4:06 PM
Jc86035 updated the task description. (Show Details)
Jc86035 renamed this task from Wikidata: Special:Contributions "entire web request took longer than 60 seconds and timed out" to Wikidata: Special:Contributions timeouts for many high-activity users.Apr 18 2019, 4:08 PM
Jc86035 renamed this task from Wikidata: Special:Contributions timeouts for many high-activity users to Wikidata: Special:Contributions times out for many high-activity users.Apr 18 2019, 4:43 PM
Jc86035 updated the task description. (Show Details)Apr 18 2019, 5:06 PM
Jc86035 updated the task description. (Show Details)Apr 18 2019, 5:15 PM
Jc86035 added a comment.EditedApr 18 2019, 5:37 PM

It turns out that the server doesn't abandon the query if wget immediately retries the request, even though the server returns a 500 error.

The results below are times, in seconds, for one contribution at each offset. An asterisk indicates that wget retried once.

offset            Jc86035   Pigsonthewing   Valentina.Anitnelav
20090418000000    24        60              49
20120418000000    24        43              89*
20150418000000    21        58              88*
20170418000000    22        73*             56
20180418000000    21        74*             120*
20190418000000    23        60              59

This task is missing the exception hash, which would make the error easier to trace and fix.

This comment was removed by Jc86035.
Jc86035 added a comment.EditedApr 21 2019, 12:43 PM

There's no error code visible on the page or in the page source. The issue occurs because of the timeout, and the timeout is deliberately enforced by the server, so an exception hash wouldn't make sense here.

This task is missing the exception hash, which would make the error easier to trace and fix.

The error is caught by Varnish, not by MediaWiki itself. The only information present is "entire web request took longer than 60 seconds and timed out".

I was able to reproduce one:

[XLxsMQpAADsAAImQla8AAACB] 2019-04-21 13:13:33: Critical exception of type "WMFTimeoutException"
message
[XLxsMQpAADsAAImQla8AAACB] /w/index.php?title=Special:Contributions&offset=20190418104744&limit=1&target=KrBot   WMFTimeoutException from line 39 of /srv/mediawiki/wmf-config/set-time-limit.php: the execution time limit of 60 seconds was exceeded
trace
#0 /srv/mediawiki/php-1.34.0-wmf.1/includes/exception/MWExceptionHandler.php(196): {closure}(integer)
#1 [internal function]: MWExceptionHandler::handleError(integer, string, string, integer, array)
#2 /srv/mediawiki/php-1.34.0-wmf.1/includes/libs/rdbms/database/DatabaseMysqli.php(46): mysqli->query(string)
#3 /srv/mediawiki/php-1.34.0-wmf.1/includes/libs/rdbms/database/Database.php(1322): Wikimedia\Rdbms\DatabaseMysqli->doQuery(string)
#4 /srv/mediawiki/php-1.34.0-wmf.1/includes/libs/rdbms/database/Database.php(1224): Wikimedia\Rdbms\Database->attemptQuery(string, string, boolean, string)
#5 /srv/mediawiki/php-1.34.0-wmf.1/includes/libs/rdbms/database/Database.php(1784): Wikimedia\Rdbms\Database->query(string, string)
#6 /srv/mediawiki/php-1.34.0-wmf.1/includes/specials/pagers/ContribsPager.php(205): Wikimedia\Rdbms\Database->select(array, array, array, string, array, array)
#7 /srv/mediawiki/php-1.34.0-wmf.1/includes/pager/IndexPager.php(249): ContribsPager->reallyDoQuery(string, integer, boolean)
#8 /srv/mediawiki/php-1.34.0-wmf.1/includes/pager/IndexPager.php(602): IndexPager->doQuery()
#9 /srv/mediawiki/php-1.34.0-wmf.1/includes/specials/SpecialContributions.php(236): IndexPager->getNumRows()
#10 /srv/mediawiki/php-1.34.0-wmf.1/includes/specialpage/SpecialPage.php(569): SpecialContributions->execute(NULL)
#11 /srv/mediawiki/php-1.34.0-wmf.1/includes/specialpage/SpecialPageFactory.php(578): SpecialPage->run(NULL)
#12 /srv/mediawiki/php-1.34.0-wmf.1/includes/MediaWiki.php(288): MediaWiki\Special\SpecialPageFactory->executePath(Title, RequestContext)
#13 /srv/mediawiki/php-1.34.0-wmf.1/includes/MediaWiki.php(865): MediaWiki->performRequest()
#14 /srv/mediawiki/php-1.34.0-wmf.1/includes/MediaWiki.php(515): MediaWiki->main()
#15 /srv/mediawiki/php-1.34.0-wmf.1/index.php(42): MediaWiki->run()
#16 /srv/mediawiki/w/index.php(3): require(string)
#17 {main}

And this doesn't tell much. OTOH, I could find the culprit query on tendril, at least the one for KrBot's contributions with offset = 20190418104744 and limit 1. Given that I couldn't find any private data in there, here is the paste:

SELECT /* IndexPager::buildQueryInfo (contributions page unfiltered) */
  rev_page, rev_minor_edit, rev_deleted, rev_len, rev_parent_id, rev_sha1,
  comment_rev_comment.comment_text AS `rev_comment_text`,
  comment_rev_comment.comment_data AS `rev_comment_data`,
  comment_rev_comment.comment_id AS `rev_comment_cid`,
  actor_rev_user.actor_user AS `rev_user`,
  actor_rev_user.actor_name AS `rev_user_text`,
  temp_rev_user.revactor_actor AS `rev_actor`,
  page_namespace, page_title, page_id, page_latest, page_is_redirect, page_len,
  user_name, page_is_new,
  revactor_timestamp AS `rev_timestamp`, revactor_rev AS `rev_id`,
  (SELECT GROUP_CONCAT(ctd_name SEPARATOR ', ')
     FROM `change_tag` JOIN `change_tag_def` ON ((ct_tag_id=ctd_id))
     WHERE ct_rev_id=rev_id) AS `ts_tags`,
  ores_damaging_cls.oresc_probability AS `ores_damaging_score`,
  0.387 AS `ores_damaging_threshold`
FROM `revision`
JOIN `revision_comment_temp` `temp_rev_comment` ON ((temp_rev_comment.revcomment_rev = rev_id))
JOIN `comment` `comment_rev_comment` ON ((comment_rev_comment.comment_id = temp_rev_comment.revcomment_comment_id))
JOIN `revision_actor_temp` `temp_rev_user` FORCE INDEX (actor_timestamp) ON ((temp_rev_user.revactor_rev = rev_id))
JOIN `actor` `actor_rev_user` ON ((actor_rev_user.actor_id = temp_rev_user.revactor_actor))
JOIN `page` ON ((page_id = rev_page))
LEFT JOIN `user` ON ((actor_rev_user.actor_user != 0) AND (user_id = actor_rev_user.actor_user))
LEFT JOIN `ores_classification` `ores_damaging_cls` ON (ores_damaging_cls.oresc_model = '11' AND (ores_damaging_cls.oresc_rev=rev_id) AND ores_damaging_cls.oresc_class = '1')
WHERE ((temp_rev_user.revactor_actor = '122')) AND ((rev_deleted & 12) != 12) AND (rev_timestamp>='20190418104744')
ORDER BY rev_timestamp, rev_id
LIMIT 1

I'd like to provide an EXPLAIN for it, but the revision_comment_temp table is not available on Quarry, nor in the SQL optimizer.

This is the explain for wikidatawiki:

+------+--------------------+---------------------+--------+----------------------------------------------------------------------+-----------------------+---------+-----------------------------------------------------+----------+--------------------------+
| id   | select_type        | table               | type   | possible_keys                                                        | key                   | key_len | ref                                                 | rows     | Extra                    |
+------+--------------------+---------------------+--------+----------------------------------------------------------------------+-----------------------+---------+-----------------------------------------------------+----------+--------------------------+
|    1 | PRIMARY            | actor_rev_user      | const  | PRIMARY                                                              | PRIMARY               | 8       | const                                               |        1 |                          |
|    1 | PRIMARY            | user                | const  | PRIMARY                                                              | PRIMARY               | 4       | const                                               |        1 |                          |
|    1 | PRIMARY            | temp_rev_user       | ref    | actor_timestamp                                                      | actor_timestamp       | 8       | const                                               | 45815350 | Using where; Using index |
|    1 | PRIMARY            | revision            | eq_ref | PRIMARY,rev_timestamp,page_timestamp,page_user_timestamp,rev_page_id | PRIMARY               | 4       | wikidatawiki.temp_rev_user.revactor_rev             |        1 | Using where              |
|    1 | PRIMARY            | ores_damaging_cls   | eq_ref | oresc_rev_model_class                                                | oresc_rev_model_class | 7       | wikidatawiki.temp_rev_user.revactor_rev,const,const |        1 |                          |
|    1 | PRIMARY            | page                | eq_ref | PRIMARY                                                              | PRIMARY               | 4       | wikidatawiki.revision.rev_page                      |        1 |                          |
|    1 | PRIMARY            | temp_rev_comment    | ref    | PRIMARY,revcomment_rev                                               | PRIMARY               | 4       | wikidatawiki.temp_rev_user.revactor_rev             |        1 | Using index              |
|    1 | PRIMARY            | comment_rev_comment | eq_ref | PRIMARY                                                              | PRIMARY               | 8       | wikidatawiki.temp_rev_comment.revcomment_comment_id |        1 |                          |
|    2 | DEPENDENT SUBQUERY | change_tag          | ref    | change_tag_rev_tag_id,change_tag_tag_id_id                           | change_tag_rev_tag_id | 5       | wikidatawiki.revision.rev_id                        |        1 | Using index              |
|    2 | DEPENDENT SUBQUERY | change_tag_def      | eq_ref | PRIMARY                                                              | PRIMARY               | 4       | wikidatawiki.change_tag.ct_tag_id                   |        1 |                          |
+------+--------------------+---------------------+--------+----------------------------------------------------------------------+-----------------------+---------+-----------------------------------------------------+----------+--------------------------+
10 rows in set (0.01 sec)

I have been talking to @Anomie about this issue; we believe it is not the same thing as T221458 or T220999, but it is related to the actor change.
It looks like, for the WHERE condition rev_timestamp >= '20190418104744', the optimizer could use the denormalized revactor_timestamp field to filter the temp_rev_user rows.
@Anomie thinks that there are two potential fixes:

  • change the WHERE clause to reference the denormalized field directly, or
  • change the JOIN to add a redundant equality, revactor_timestamp = rev_timestamp, which appears to let the optimizer know it can propagate the WHERE condition to the denormalized field.
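A minimal runnable sketch of the second option (simplified schema, with SQLite standing in for MariaDB; table and column names follow the pasted query, sample data is invented). It shows that the redundant JOIN equality leaves the result set unchanged; its only purpose is to give the optimizer a handle on the denormalized, indexed timestamp column:

```python
import sqlite3

# Toy versions of the tables involved, including the composite
# actor_timestamp index that the fix is meant to exploit.
con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.executescript("""
CREATE TABLE revision (rev_id INTEGER PRIMARY KEY, rev_timestamp TEXT);
CREATE TABLE revision_actor_temp (
    revactor_rev INTEGER PRIMARY KEY,
    revactor_actor INTEGER,
    revactor_timestamp TEXT
);
CREATE INDEX actor_timestamp
    ON revision_actor_temp (revactor_actor, revactor_timestamp);
INSERT INTO revision VALUES
    (1, '20190417000000'), (2, '20190418104744'), (3, '20190419000000');
INSERT INTO revision_actor_temp VALUES
    (1, 122, '20190417000000'), (2, 122, '20190418104744'),
    (3, 122, '20190419000000');
""")

# Original shape: the range condition only mentions revision.rev_timestamp,
# so the index can only be used for the actor equality.
original = """
SELECT rev_id FROM revision
JOIN revision_actor_temp AS t ON t.revactor_rev = rev_id
WHERE t.revactor_actor = 122 AND rev_timestamp >= '20190418104744'
ORDER BY rev_timestamp, rev_id LIMIT 1
"""

# Proposed fix: a redundant equality in the JOIN ties rev_timestamp to the
# denormalized revactor_timestamp, without changing which rows match.
fixed = """
SELECT rev_id FROM revision
JOIN revision_actor_temp AS t ON t.revactor_rev = rev_id
    AND t.revactor_timestamp = rev_timestamp
WHERE t.revactor_actor = 122 AND rev_timestamp >= '20190418104744'
ORDER BY rev_timestamp, rev_id LIMIT 1
"""

print(cur.execute(original).fetchall())  # [(2,)]
print(cur.execute(fixed).fetchall())     # [(2,)]
```

Whether the redundant condition actually triggers a range scan on the index is a MariaDB optimizer question, which this sketch does not demonstrate; it only checks the semantic equivalence of the two query shapes.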

Eventually the revision_actor_temp table will be dropped once the migration is fully done, but that is not going to happen soon.

Change 505814 had a related patch set uploaded (by Anomie; owner: Anomie):
[mediawiki/core@master] ContribsPager: Add redundant JOIN cond to allow WHERE propagation

https://gerrit.wikimedia.org/r/505814

Tarrow added a subscriber: Tarrow.Apr 29 2019, 8:59 AM
WMDE-leszek lowered the priority of this task from Unbreak Now! to High.Apr 29 2019, 10:07 AM
WMDE-leszek added a subscriber: WMDE-leszek.

Pages linked in the task description still time out for me, so the issue does not seem to have gone away on its own.
I dare to lower the priority though, since this has been at Unbreak Now! for 10 days already. Please feel free to raise the priority back up if it is critical; if so, the issue should be addressed more actively than it has been so far, from a bystander's perspective.

We've briefly looked at the problem from the Wikidata perspective. Are we correct in thinking that the issue does not lie in Wikibase, and that it is something to be resolved in MediaWiki core?

Change 505814 merged by jenkins-bot:
[mediawiki/core@master] ContribsPager: Fix slow queries

https://gerrit.wikimedia.org/r/505814

Tgr added a subscriber: Tgr.May 1 2019, 4:47 AM

This is now fixed in master (thanks @Anomie!) and will be deployed Wednesday next week. I think the error is not so bad that it would require an emergency deploy?

Lucas_Werkmeister_WMDE closed this task as Resolved.May 10 2019, 5:29 PM
Lucas_Werkmeister_WMDE assigned this task to Anomie.

Seems to be resolved now.

mmodell changed the subtype of this task from "Task" to "Production Error".Wed, Aug 28, 11:07 PM