
Special:UploadStash timeout with large upload (slow query: SELECT FROM uploadstash)
Closed, ResolvedPublicBUG REPORT

Description

Special:UploadStash times out on Commons for me after a failed 170-file upload via the upload wizard (due to poor UX, see T335628). The total file size is about 2 GB of PDFs; the exact files were the first 170 PDFs listed here: https://www.govinfo.gov/collection/january-6th-committee-final-report?path=/gpo/January%206th%20Committee%20Final%20Report%20and%20Supporting%20Materials%20Collection/Supporting%20Materials%20-%20Transcribed%20Interviews%20and%20Depositions . I want to resume this upload via Special:UploadStash. Instead, I get:

[0c6c9288-3566-4f12-88ca-88dac20373a0] 2023-04-29 19:13:47: Fatal exception of type "Wikimedia\RequestTimeout\RequestTimeoutException"

Event Timeline

Umherirrender subscribed.

The special page does a SELECT us_key FROM uploadstash WHERE us_user = 1 without a limit and does not allow pagination. There is an index on us_user, so that is not the problem.

But for every entry in the stash, the metadata is looked up in the database with SELECT us_user,us_key,us_orig_path,us_path,us_props,us_size,us_sha1,us_mime,us_media_type,us_image_width,us_image_height,us_image_bits,us_source_type,us_timestamp,us_status FROM uploadstash WHERE us_key = 'xxxxxx.tmp' LIMIT 1. Possibly some interaction with the file backend happens as well, which can take a long time with many files.
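This is the classic N+1 query pattern: one query to fetch the keys, then one query per key for the metadata. A minimal sketch (using Python's sqlite3 with a simplified, hypothetical stand-in for the uploadstash schema, not MediaWiki code) contrasts the per-key lookups with a single batched query:

```python
import sqlite3

# Simplified stand-in table; only a few of the real uploadstash columns.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE uploadstash (us_user INTEGER, us_key TEXT, us_size INTEGER)"
)
conn.executemany(
    "INSERT INTO uploadstash VALUES (?, ?, ?)",
    [(1, f"{n}.tmp", n * 100) for n in range(170)],
)

# Step 1: fetch all keys for the user (as the special page does).
keys = [
    row[0]
    for row in conn.execute("SELECT us_key FROM uploadstash WHERE us_user = 1")
]

# N+1 pattern: one metadata query per stashed file -- 170 round trips here.
slow = [
    conn.execute(
        "SELECT us_size FROM uploadstash WHERE us_key = ? LIMIT 1", (k,)
    ).fetchone()
    for k in keys
]

# Batched pattern: a single query covering all keys.
placeholders = ",".join("?" * len(keys))
fast = conn.execute(
    f"SELECT us_key, us_size FROM uploadstash WHERE us_key IN ({placeholders})",
    keys,
).fetchall()
print(len(fast))  # 170
```

Both variants return the same rows; the batched form replaces 170 round trips with one, which matters most when the per-query overhead (or a file-backend call per entry) dominates.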

On Commons the stash is kept for 48 hours, so the timeout should go away once the files expire. But that does not help with resuming anything via the special page.

There is at least a list module in the API which makes it possible to look up the files (https://commons.wikimedia.org/w/api.php?modules=query%2Bmystashedfiles), but that requires some skill to request and use.
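As a sketch of what such a request looks like: the module is list=mystashedfiles under action=query, and it only works with a logged-in session, so the snippet below just constructs the URL (the msfprop/msflimit parameter values are taken from the module help page linked above; check there for the full list):

```python
from urllib.parse import urlencode

API = "https://commons.wikimedia.org/w/api.php"
params = {
    "action": "query",
    "list": "mystashedfiles",
    "msfprop": "size",   # per-file properties to return
    "msflimit": "50",    # page through long lists via the continue token
    "format": "json",
}
url = f"{API}?{urlencode(params)}"
print(url)
```

Sent with an authenticated session (e.g. via a cookie-carrying HTTP client after action=login), this returns the caller's own stashed files as JSON, which sidesteps the special page entirely.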

Krinkle renamed this task from Special:UploadStash timeout with large upload to Special:UploadStash timeout with large upload (slow query: SELECT FROM uploadstash).Jul 13 2023, 3:25 PM
Krinkle moved this task from Untriaged to Apr 2023 on the Wikimedia-production-error board.

Change 1003523 had a related patch set uploaded (by Umherirrender; author: Umherirrender):

[mediawiki/core@master] specials: Use a pager on Special:UploadStash

https://gerrit.wikimedia.org/r/1003523

Change 1003523 merged by jenkins-bot:

[mediawiki/core@master] specials: Use a pager on Special:UploadStash

https://gerrit.wikimedia.org/r/1003523

Umherirrender claimed this task.

The special page now shows a limited number of results and should no longer time out as described in this task.
If the page cannot be opened with the default limit, you can append ?limit=10 (down to limit=1) to the URL to get fewer results and paginate through them if needed.

The database index for the user does not include the timestamp, so sorting on the timestamp could still result in slow queries; this is tracked under T358521.
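A small illustration of why the missing timestamp column matters (using sqlite3's EXPLAIN QUERY PLAN rather than MariaDB, and a hypothetical composite index name; the principle carries over): without an index covering the sort column, the database must add a separate sort step.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE uploadstash (us_user INTEGER, us_key TEXT, us_timestamp TEXT)"
)
# Index on the user column only, as described above.
conn.execute("CREATE INDEX us_user_idx ON uploadstash (us_user)")

query = (
    "SELECT us_key FROM uploadstash WHERE us_user = 1 ORDER BY us_timestamp"
)

# With only the user index, the plan includes a temp B-tree for the ORDER BY.
plan_before = conn.execute("EXPLAIN QUERY PLAN " + query).fetchall()
print(plan_before)

# Hypothetical composite index covering both the filter and the sort column.
conn.execute(
    "CREATE INDEX us_user_timestamp_idx ON uploadstash (us_user, us_timestamp)"
)
plan_after = conn.execute("EXPLAIN QUERY PLAN " + query).fetchall()
print(plan_after)
```

With the composite index, the rows come out of the index already in timestamp order, so the extra sort (and its cost on large stashes) disappears from the plan.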