
File upload to Commons via UploadWizard fails with "Caught exception of type Wikimedia\Rdbms\DBQueryError"
Closed, Duplicate · Public · PRODUCTION ERROR

Description

When trying to upload the (CC BY 4.0) PDF from here to Commons, the progress bar gets to about 95%, and then I get errors of the type

[6813f71f-3560-48b8-b071-deda1b801f18] Caught exception of type Wikimedia\Rdbms\DBQueryError

that result in the upload being interrupted.

Screen Shot 2021-03-24 at 17.11.03.png (960×2 px, 168 KB)

A search for the error message on Phabricator yielded T229605, which is marked as a duplicate of T229589, which is marked as resolved, so I think a new ticket is needed.

Details

Stack Trace

from /srv/mediawiki/php-1.36.0-wmf.35/includes/libs/rdbms/database/Database.php(1700)
#0 /srv/mediawiki/php-1.36.0-wmf.35/includes/libs/rdbms/database/Database.php(1678): Wikimedia\Rdbms\Database->getQueryExceptionAndLog(string, integer, string, string)
#1 /srv/mediawiki/php-1.36.0-wmf.35/includes/libs/rdbms/database/Database.php(1244): Wikimedia\Rdbms\Database->reportQueryError(string, integer, string, string, boolean)
#2 /srv/mediawiki/php-1.36.0-wmf.35/includes/libs/rdbms/database/Database.php(2479): Wikimedia\Rdbms\Database->query(string, string, integer)
#3 /srv/mediawiki/php-1.36.0-wmf.35/includes/libs/rdbms/database/DBConnRef.php(68): Wikimedia\Rdbms\Database->update(string, array, string, string)
#4 /srv/mediawiki/php-1.36.0-wmf.35/includes/libs/rdbms/database/DBConnRef.php(375): Wikimedia\Rdbms\DBConnRef->__call(string, array)
#5 /srv/mediawiki/php-1.36.0-wmf.35/includes/upload/UploadFromChunks.php(289): Wikimedia\Rdbms\DBConnRef->update(string, array, array, string)
#6 /srv/mediawiki/php-1.36.0-wmf.35/includes/upload/UploadFromChunks.php(263): UploadFromChunks->updateChunkStatus()
#7 /srv/mediawiki/php-1.36.0-wmf.35/includes/api/ApiUpload.php(251): UploadFromChunks->addChunk(string, integer, integer)
#8 /srv/mediawiki/php-1.36.0-wmf.35/includes/api/ApiUpload.php(130): ApiUpload->getChunkResult(array)
#9 /srv/mediawiki/php-1.36.0-wmf.35/includes/api/ApiUpload.php(101): ApiUpload->getContextResult()
#10 /srv/mediawiki/php-1.36.0-wmf.35/includes/api/ApiMain.php(1646): ApiUpload->execute()
#11 /srv/mediawiki/php-1.36.0-wmf.35/includes/api/ApiMain.php(616): ApiMain->executeAction()
#12 /srv/mediawiki/php-1.36.0-wmf.35/includes/api/ApiMain.php(587): ApiMain->executeActionWithErrorHandling()
#13 /srv/mediawiki/php-1.36.0-wmf.35/api.php(90): ApiMain->execute()
#14 /srv/mediawiki/php-1.36.0-wmf.35/api.php(45): wfApiMain()
#15 /srv/mediawiki/w/api.php(3): require(string)
#16 {main}

Event Timeline

Peachey88 changed the subtype of this task from "Task" to "Production Error".Mar 24 2021, 11:37 PM

The update uses us_key, which is a unique key. It is not the primary key, but it matches only one row.
Maybe this needs a DBA to find the other transaction that holds the lock on that row.

A possible cause would be chunks being uploaded in parallel.
But the next chunk can only be uploaded after the current chunk gets a success response from the API.
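The serialized behaviour described above can be sketched as follows. This is an illustrative model, not the actual UploadWizard client code; the 4 MiB chunk size and the helper names are assumptions. It shows why, with sequential chunking, two chunks of the same upload should never race each other server-side, and why a ~250 MB file lands in the same ballpark as the 63-chunk uploads in the logs below.

```python
# Illustrative sketch of sequential chunked uploading (not the real client).
CHUNK_SIZE = 4 * 1024 * 1024  # 4 MiB: an assumed chunk size, not confirmed

def chunk_offsets(file_size, chunk_size=CHUNK_SIZE):
    """Yield (index, offset, length) for each chunk, in upload order.

    The client sends chunk i+1 only after the API has acknowledged chunk i,
    so offsets never overlap and chunks are strictly sequential.
    """
    total = -(-file_size // chunk_size)  # ceiling division
    for i in range(total):
        offset = i * chunk_size
        yield i + 1, offset, min(chunk_size, file_size - offset)

# A ~250 MB file with 4 MiB chunks needs 60 chunks, comparable to the
# "63/63" counts seen in the client logs in this task.
chunks = list(chunk_offsets(250 * 1000 * 1000))
print(len(chunks), chunks[-1])
```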

00591: 61/63> in progress Upload: 98%
00601: 61/63> Chunk uploaded
00601: 62/63> in progress Upload: 99%
00610: 62/63> Chunk uploaded
00610: 63/63> in progress Upload: 95%
00620: 63/63> upload is stuck
00620: 63/63> Connection seems to be okay. Waiting one more time...
00625: 63/63> upload is stuck
00625: 63/63> Connection seems to be okay. Waiting one more time...
00630: 63/63> upload is stuck
00630: 63/63> Connection seems to be okay. Waiting one more time...
00635: 63/63> upload is stuck
00635: 63/63> Connection seems to be okay. Waiting one more time...
00640: 63/63> upload is stuck
00640: 63/63> Connection seems to be okay. Waiting one more time...
00645: 63/63> upload is stuck
00645: 63/63> Connection seems to be okay. Waiting one more time...
00650: 63/63> upload is stuck
00650: 63/63> Connection seems to be okay. Waiting one more time...
00655: 63/63> upload is stuck
00655: 63/63> Connection seems to be okay. Waiting one more time...
00660: 63/63> upload is stuck
00660: 63/63> Connection seems to be okay. Waiting one more time...
00665: 63/63> upload is stuck
00665: 63/63> Connection seems to be okay. Waiting one more time...
00670: 63/63> upload is stuck
00670: 63/63> Server error 0 after uploading chunk:
Response:
00670: 63/63> Connection seems to be okay. Re-sending this request.
00670: 63/63> Connection seems to be okay. Re-sending this request. Upload: 100%
00680: 63/63> upload is stuck
00680: 63/63> Connection seems to be okay. Waiting one more time...
00685: 63/63> upload is stuck
00685: 63/63> Connection seems to be okay. Waiting one more time...
00689: FAILED: internal_api_error_DBQueryError: [11111111-11a1-1a1a-11a1-a1aa111aa1a1] Caught exception of type Wikimedia\Rdbms\DBQueryError

I censored the code as I don't know what it stands for. I tried to overwrite https://en.wikipedia.org/wiki/File:Jefferson_Davis_High_School_band_2020.ogv with a higher-quality version. I'll try again, but I have a feeling I'll just have to downgrade the quality.

Edit: tried again, same Wikimedia\Rdbms\DBQueryError. Different code (the thing I censored) but this code is probably irrelevant anyway. Trying again now without stash and async.

00598: 61/63> in progress Upload: 99%
00607: 61/63> Chunk uploaded
00607: 62/63> in progress Upload: 100%
00617: 62/63> Chunk uploaded
00617: 63/63> in progress Upload: 100%
00684: 63/63> Server error 504 after uploading chunk: Gateway Timeout
Response: upstream request timeout
00684: 63/63> upload in progress Upload: 100%
00704: FAILED: internal_api_error_DBQueryError: [(censored)] Caught exception of type Wikimedia\Rdbms\DBQueryError

No luck without stash and async. I'll just lower the bitrate. It's visibly worse. Not completely terrible, but visible.

First time I try to upload the <100MiB file I get:

Service Temporarily Unavailable

Our servers are currently under maintenance or experiencing a technical problem. Please try again in a few minutes.

Second time I get:

Could not acquire lock for "mwstore://local-multiwrite/local-public/7/78/Jefferson_Davis_High_School_band_2020.ogv".
Return to Main Page.

Maybe the limit is indeed 100 MB and not 100 MiB. Shouldn't I get a different error message, though? I'll try shaving a few MB off.
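The MB/MiB distinction matters here: a file that is under 100 MiB can still be over 100 MB, since the decimal and binary units differ by almost 5 MB at this size.

```python
MB = 10**6    # decimal megabyte
MiB = 2**20   # binary mebibyte

limit_mb = 100 * MB    # 100,000,000 bytes
limit_mib = 100 * MiB  # 104,857,600 bytes

# A file of, say, 102,000,000 bytes is under 100 MiB but over 100 MB.
print(limit_mib - limit_mb)  # 4857600 bytes of headroom between the two limits
```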

Edit: Still can't overwrite, even under 100 MB. It alternates between the "could not acquire lock" and "experiencing a technical problem" errors.

Request from (my ip) via cp3052 frontend, Varnish XID 690378672
Error: 503, Backend fetch failed at Thu, 22 Apr 2021 16:35:19 GMT

There are several uploads to Commons using video2commons like https://commons.wikimedia.org/wiki/File:Fire_transport_breaks_through_to_the_fire_20210422_211433.webm that are over 100MB. @Fae just uploaded https://commons.wikimedia.org/wiki/File:Iacobi_Berengari_Carpensis_..._De_fractura_cranii_liber_aureus._Hactenus_desideratus_-_(Jacopo_Berengario_da_Carpi)_(IA_hin-wel-all-00002134-001).pdf which is slightly over 100MB. (IA Query "collection:(additional_collections) date:[1000 TO 1869] " hin-wel-all-00002134-001 Category:Old books in Internet Archive additional collections (COM:IA books#query) (1629 #87533))

I don't know how the upload process for video2commons or IA Query works. Are they different from Special:Upload, UploadWizard and Pywikibot, which all seem to consistently fail to upload anything over 100 MB?
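For reference, the MediaWiki action=upload API accepts chunked uploads via the filesize, offset and filekey parameters: the first chunk returns a filekey, and each later chunk resends it with an updated offset. The helper below is a hedged sketch of how a client might assemble those parameters; the binary chunk itself goes in the multipart POST body, and the function name is illustrative, not part of any real client.

```python
# Sketch (assumption-labelled) of the parameters used by MediaWiki's
# chunked upload protocol (action=upload). The actual chunk bytes are
# sent as a multipart file field named "chunk", not shown here.
def chunk_params(filename, filesize, offset, token, filekey=None):
    params = {
        "action": "upload",
        "format": "json",
        "stash": 1,           # stash chunks until assembly
        "filename": filename,
        "filesize": filesize,  # total size of the final file
        "offset": offset,      # byte offset of this chunk
        "token": token,        # CSRF token
    }
    if filekey is not None:
        # Every chunk after the first carries the filekey returned
        # by the previous request.
        params["filekey"] = filekey
    return params

first = chunk_params("Camera_Notes.pdf", 250_000_000, 0, "+\\")
later = chunk_params("Camera_Notes.pdf", 250_000_000, 4_194_304, "+\\",
                     filekey="chunkedupload_example.pdf")
```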


That was April; now it's September, so I tried again to upload a better-quality version (about 250 MB) of File:Jefferson Davis High School band 2020-en.ogv, because who knows.

00417: 61/63> in progress Upload: 99%
00424: 61/63> Chunk uploaded
00424: 62/63> in progress Upload: 98%
00430: 62/63> Chunk uploaded
00430: 63/63> in progress Upload: 99%
00440: 63/63> upload is stuck
00441: 63/63> Connection seems to be okay. Waiting one more time...
00445: 63/63> upload is stuck
00445: 63/63> Connection seems to be okay. Waiting one more time...
00450: 63/63> upload is stuck
00450: 63/63> Connection seems to be okay. Waiting one more time...
00455: 63/63> upload is stuck
00455: 63/63> Connection seems to be okay. Waiting one more time...
00460: 63/63> upload is stuck
00460: 63/63> Connection seems to be okay. Waiting one more time...
00465: 63/63> upload is stuck
00465: 63/63> Connection seems to be okay. Waiting one more time...
00470: 63/63> upload is stuck
00470: 63/63> Connection seems to be okay. Waiting one more time...
00475: 63/63> upload is stuck
00475: 63/63> Connection seems to be okay. Waiting one more time...
00480: 63/63> upload is stuck
00480: 63/63> Connection seems to be okay. Waiting one more time...
00485: 63/63> upload is stuck
00485: 63/63> Connection seems to be okay. Waiting one more time...
00490: 63/63> upload is stuck
00490: 63/63> Server error 0 after uploading chunk: 
Response: 
00490: 63/63> Connection seems to be okay. Re-sending this request.
00490: 63/63> Connection seems to be okay. Re-sending this request. Upload: 100%
00500: 63/63> upload is stuck
00500: 63/63> Connection seems to be okay. Waiting one more time...
00505: 63/63> upload is stuck
00506: 63/63> Connection seems to be okay. Waiting one more time...
00509: FAILED: internal_api_error_DBQueryError: [83b27b81-8dca-4810-b6e7-679408c33788] Caught exception of type Wikimedia\Rdbms\DBQueryError

No change of course. There's a file in https://en.wikipedia.org/wiki/Special:UploadStash but the thumbnail is this:

Not Found

Fetching thumbnail failed: Array ( [0] => Array ( [0] => http-bad-status [1] => 500 [2] => Internal Server Error ) ) URL = http://thumbor.svc.codfw.wmnet:8800/wikipedia/en/thumb/temp/0/0c/20210905222019%21chunkedupload_96b7591c09f6.ogx/220px--20210905222019%21chunkedupload_96b7591c09f6.ogx.jpg

and the file is this:

Internal Server Error

Cannot serve a file larger than 1048576 bytes.

I followed the advice from T94562#1482833 but <s>it doesn't seem to work.</s> The response is either empty (I think) or this:

{"error":{"code":"internal_api_error_LocalFileLockError","info":"[bc1da904-d378-4df4-9e92-ecc2c3f4d564] Caught exception of type LocalFileLockError","errorclass":"LocalFileLockError"},"servedby":"mw2321"}

It did work. Not sure how long it took, but the file was published and intact (checksum verified).
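Verifying the checksum as done above is straightforward: MediaWiki records a SHA-1 hash for each uploaded file, so a streaming local SHA-1 can be compared against the published file. This sketch reads the file in blocks to keep memory flat even for multi-hundred-MB uploads; the function name is illustrative.

```python
import hashlib

def sha1_of(path, block_size=1 << 20):
    """Compute the SHA-1 of a file by streaming it in 1 MiB blocks,
    so even very large uploads don't need to fit in memory."""
    h = hashlib.sha1()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(block_size), b""):
            h.update(block)
    return h.hexdigest()
```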

I get this error while trying to upload this PDF https://archive.org/details/da-capo-press-camera-notes-v-3-4-1899-1901 to https://commons.wikimedia.org/wiki/File:Camera_Notes,_v._3-4,_1899-1901.pdf via the chunked upload protocol.

02220: 69/74> in progress Upload: 99%
02254: 69/74> Chunk uploaded
02254: 70/74> in progress Upload: 100%
02283: 70/74> Chunk uploaded
02283: 71/74> in progress Upload: 100%
02316: 71/74> Chunk uploaded
02316: 72/74> in progress Upload: 99%
02348: 72/74> Chunk uploaded
02348: 73/74> in progress Upload: 100%
02380: 73/74> Chunk uploaded
02380: 74/74> in progress Upload: 73%
02445: 74/74> Server error 504 after uploading chunk: 
Response: upstream request timeout
02445: 74/74> upload in progress Upload: 100%
02462: FAILED: internal_api_error_DBQueryError: [1565fb95-89bf-46da-80fe-644f7a4f1ef6] Caught exception of type Wikimedia\Rdbms\DBQueryError

Based on T292954 and related tasks I'd say it'd be worthwhile to retry the various failed uploads mentioned in this task. Some or all of them may now succeed, and if any still fail then at least one big potential cause has been eliminated (the locking stuff looks like it could be a separate issue from the one recently fixed in T292954).

00689: FAILED: internal_api_error_DBQueryError: [11111111-11a1-1a1a-11a1-a1aa111aa1a1] Caught exception of type Wikimedia\Rdbms\DBQueryError

I censored the code as I don't know what it stands for.

That's the request ID. Please do not censor it.
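The bracketed token in these error messages is the request ID, which lets operators find the matching server-side log entry, so it should be reported verbatim. A small sketch of pulling it out of an error string (the function name is illustrative):

```python
import re

def request_id(message):
    """Extract the bracketed request ID (a UUID) from a MediaWiki
    error message like '[1565fb95-...] Caught exception of type ...'.
    Returns None if no ID is present."""
    m = re.search(r"\[([0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12})\]", message)
    return m.group(1) if m else None

err = ("[1565fb95-89bf-46da-80fe-644f7a4f1ef6] Caught exception of type "
       "Wikimedia\\Rdbms\\DBQueryError")
print(request_id(err))
```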