Page MenuHomePhabricator

Lock wait timeout exceeded in UploadFromChunks::updateChunkStatus
Open, Needs TriagePublic

Description

After T200820#4826332 I tried again without "stash and async" (as recommended on https://commons.wikimedia.org/wiki/User_talk:Rillke/bigChunkedUpload.js#Troubleshooting ), and I got the following

03054: 287/287> Server error 503 after uploading chunk: 
Response: <!DOCTYPE html>
<html lang="en" dir="ltr">
<meta charset="utf-8">
<title>Wikimedia Error</title>
<style>
* { margin: 0; padding: 0; }
body { background: #fff; font: 15px/1.6 sans-serif; color: #333; }
.content { margin: 7% auto 0; padding: 2em 1em 1em; max-width: 640px; }
img { float: left; margin: 0 2em 2em 0; }
a img { border: 0; }
h1 { margin-top: 1em; font-size: 1.2em; }
p { margin: 0.7em 0 1em 0; }
a { color: #0645AD; text-decoration: none; }
a:hover { text-decoration: underline; }
</style>
03054: 287/287> upload in progress Upload: 100%
03078: FAILED: internal_api_error_DBQueryError: [XBZngQpAAEwAACGSPNYAAACO] Caught exception of type Wikimedia\Rdbms\DBQueryError

Is this the same issue?
File in stash is 1698l1a6s3kw.c4ox7h.1.pdf

Event Timeline

Yann created this task.Dec 17 2018, 7:38 AM
Restricted Application added a subscriber: Aklapper. · View Herald TranscriptDec 17 2018, 7:38 AM
Tgr added a subscriber: Tgr.Dec 17 2018, 8:48 AM
[XBZngQpAAEwAACGSPNYAAACO] /w/api.php   Wikimedia\Rdbms\DBQueryError from line 1506 of /srv/mediawiki/php-1.33.0-wmf.8/includes/libs/rdbms/database/Database.php: A database query error has occurred. Did you forget to run your application's database schema updater after upgrading? 
Query: UPDATE  `uploadstash` SET us_status = 'chunks',us_chunk_inx = '286',us_size = '1201531325' WHERE us_key = '1698l1a6s3kw.c4ox7h.1.pdf'
Function: UploadFromChunks::updateChunkStatus
Error: 1205 Lock wait timeout exceeded; try restarting transaction (10.64.48.23)
#0 /srv/mediawiki/php-1.33.0-wmf.8/includes/libs/rdbms/database/Database.php(1476): Wikimedia\Rdbms\Database->makeQueryException(string, integer, string, string)
#1 /srv/mediawiki/php-1.33.0-wmf.8/includes/libs/rdbms/database/Database.php(1236): Wikimedia\Rdbms\Database->reportQueryError(string, integer, string, string, boolean)
#2 /srv/mediawiki/php-1.33.0-wmf.8/includes/libs/rdbms/database/Database.php(2114): Wikimedia\Rdbms\Database->query(string, string)
#3 /srv/mediawiki/php-1.33.0-wmf.8/includes/upload/UploadFromChunks.php(300): Wikimedia\Rdbms\Database->update(string, array, array, string)
#4 /srv/mediawiki/php-1.33.0-wmf.8/includes/upload/UploadFromChunks.php(273): UploadFromChunks->updateChunkStatus()
#5 /srv/mediawiki/php-1.33.0-wmf.8/includes/api/ApiUpload.php(226): UploadFromChunks->addChunk(string, integer, integer)
#6 /srv/mediawiki/php-1.33.0-wmf.8/includes/api/ApiUpload.php(132): ApiUpload->getChunkResult(array)
#7 /srv/mediawiki/php-1.33.0-wmf.8/includes/api/ApiUpload.php(104): ApiUpload->getContextResult()
#8 /srv/mediawiki/php-1.33.0-wmf.8/includes/api/ApiMain.php(1595): ApiUpload->execute()
#9 /srv/mediawiki/php-1.33.0-wmf.8/includes/api/ApiMain.php(531): ApiMain->executeAction()
#10 /srv/mediawiki/php-1.33.0-wmf.8/includes/api/ApiMain.php(502): ApiMain->executeActionWithErrorHandling()
#11 /srv/mediawiki/php-1.33.0-wmf.8/api.php(87): ApiMain->execute()
#12 /srv/mediawiki/w/api.php(3): include(string)
#13 {main}

So there are two separate processes trying to update the same upload status, and one of the locks the row and the other times out.
Not sure how that can happen... maybe the tool does parallel upload of multiple chunks at the same time, but even so, the status update should only happen when the upload of that chunk has finished, so the row should not be locked for long, and a lock timeout takes multiple seconds to happen.

Yann added a comment.Dec 17 2018, 9:00 AM

And this only happens on the last chunk, when trying to build the whole file, which is the same as the other 2 bugs related. I don't think that a coincidence.

Restricted Application added a project: Multimedia. · View Herald TranscriptDec 17 2018, 10:35 AM
Tgr renamed this task from internal_api_error_DBQueryError to Lock wait timeout exceeded in UploadFromChunks::updateChunkStatus.Dec 17 2018, 10:50 AM
Yann added a comment.Jan 4 2019, 12:22 PM

I could upload the 3 files which caused this bug.

Yann added a comment.EditedJan 15 2019, 5:53 PM

This happened again after uploading a 1.9 GB file. :((

04283: 458/458> in progress Upload: 100%
04411: 458/458> Server error 503 after uploading chunk: 
Response: <!DOCTYPE html>
<html lang="en" dir="ltr">
<meta charset="utf-8">
<title>Wikimedia Error</title>
<style>
* { margin: 0; padding: 0; }
body { background: #fff; font: 15px/1.6 sans-serif; color: #333; }
.content { margin: 7% auto 0; padding: 2em 1em 1em; max-width: 640px; }
img { float: left; margin: 0 2em 2em 0; }
a img { border: 0; }
h1 { margin-top: 1em; font-size: 1.2em; }
p { margin: 0.7em 0 1em 0; }
a { color: #0645AD; text-decoration: none; }
a:hover { text-decoration: underline; }
</style>
04411: 458/458> upload in progress Upload: 99%
04431: FAILED: internal_api_error_DBQueryError: [XD4dAwpAIDYAAB0dQIIAAACY] Caught exception of type Wikimedia\Rdbms\DBQueryError

https://commons.wikimedia.org/wiki/File:L%27Illustration,_v.20,_Jul-Dec_1852.pdf
See stash
16bsomjwzu4s.u4y885.1.pdf
16bss6f5rz7g.617yma.1.pdf

Tgr added a comment.Jan 15 2019, 8:34 PM

Same error, lock wait timeout in UploadFromChunks::updateChunkStatus (at chunk 457). Maybe we should add some debug logging for when exactly each chunk starts / ends uploading, to see if the client does anything weird there.

This comment was removed by Wilfredor.
This comment was removed by Wilfredor.
Wilfredor added a comment.EditedJan 16 2019, 2:02 AM

Same problem, I cant upload my images because this problem, also with Uploadwizard with images TIFF with more 600Mb of size (TIFF with internal zip compression).

Failed to load resource: the server responded with a status of 404 () https://commons.wikimedia.org/wiki/Special:UploadStash/thumb/16btxukj8dc8.3faxuw.32521.tif/lossy-page1-120px-16btxukj8dc8.3faxuw.32521.tif.jpg?0.6459894966662545

The sample file that Im trying to upload is:
https://archive.org/download/SacreCoeurDeJesusByLouisJobin1878/Sacre%20Coeur%20de%20Jesus%20by%20Louis%20Jobin%201878.tif

Also the follow error while uploading via URL:

identify command failed: '/usr/bin/identify' -format "[BEGIN]page=%p\nalpha=%A\nalpha2=%r\nheight=%h\nwidth=%w\ndepth=%z[END]" '/tmp/URLb7638a2c66b1.urlupload_' 2>&1
Ankry added a subscriber: Ankry.EditedJan 16 2019, 7:00 AM

executing the above command locally I get:

$ '/usr/bin/identify' -format "[BEGIN]page=%p\nalpha=%A\nalpha2=%r\nheight=%h\nwidth=%w\ndepth=%z[END]" Sacre_Coeur_de_Jesus_by_Louis_Jobin_1878.tif
identify: Incompatible type for "RichTIFFIPTC"; tag ignored. `TIFFFetchNormalTag' @ warning/tiff.c/TIFFWarnings/912.
identify: Unknown field with tag 34864 (0x8830) encountered. `TIFFReadCustomDirectory' @ warning/tiff.c/TIFFWarnings/912.
identify: Incompatible type for "FileSource"; tag ignored. `TIFFFetchNormalTag' @ warning/tiff.c/TIFFWarnings/912.
identify: Incompatible type for "SceneType"; tag ignored. `TIFFFetchNormalTag' @ warning/tiff.c/TIFFWarnings/912.
identify: Wrong data type 3 for "GainControl"; tag ignored. `TIFFReadCustomDirectory' @ warning/tiff.c/TIFFWarnings/912.
identify: Unknown field with tag 42033 (0xa431) encountered. `TIFFReadCustomDirectory' @ warning/tiff.c/TIFFWarnings/912.
identify: Unknown field with tag 42034 (0xa432) encountered. `TIFFReadCustomDirectory' @ warning/tiff.c/TIFFWarnings/912.
identify: Unknown field with tag 42036 (0xa434) encountered. `TIFFReadCustomDirectory' @ warning/tiff.c/TIFFWarnings/912.
identify: Incompatible type for "RichTIFFIPTC"; tag ignored. `TIFFFetchNormalTag' @ warning/tiff.c/TIFFWarnings/912.
identify: DistributedPixelCache '127.0.0.1' @ error/distribute-cache.c/ConnectPixelCacheServer/244.
identify: cache resources exhausted `Sacre_Coeur_de_Jesus_by_Louis_Jobin_1878.tif' @ error/cache.c/OpenPixelCache/3945.
identify: memory allocation failed `Sacre_Coeur_de_Jesus_by_Louis_Jobin_1878.tif' @ error/psd.c/ReadPSDLayers/1737.
[BEGIN]page=0
alpha=False
alpha2=DirectClass sRGB
height=8912
width=6160
$ identify --version
Version: ImageMagick 6.9.7-4 Q16 x86_64 20170114 http://www.imagemagick.org
Copyright: © 1999-2017 ImageMagick Studio LLC
License: http://www.imagemagick.org/script/license.php
Features: Cipher DPC Modules OpenMP
Delegates (built-in): bzlib djvu fftw fontconfig freetype jbig jng jp2 jpeg lcms lqr ltdl lzma openexr pangocairo png tiff wmf x xml zlib

Out of memory?
Any hints what else can we do here?

Ankry added a comment.EditedJan 16 2019, 7:10 AM

The above was on a machine with 8GB RAM + 8GB swap. On a machine with 100GB RAM I get:

$ '/usr/bin/identify' -format "[BEGIN]page=%p\nalpha=%A\nalpha2=%r\nheight=%h\nwidth=%w\ndepth=%z[END]" Sacre_Coeur_de_Jesus_by_Louis_Jobin_1878.tif
identify: Sacre_Coeur_de_Jesus_by_Louis_Jobin_1878.tif: wrong data type 7 for "RichTIFFIPTC"; tag ignored. `TIFFReadDirectory' @ warning/tiff.c/TIFFWarnings/768.
identify: Sacre_Coeur_de_Jesus_by_Louis_Jobin_1878.tif: unknown field with tag 34864 (0x8830) encountered. `TIFFReadCustomDirectory' @ warning/tiff.c/TIFFWarnings/768.
identify: Sacre_Coeur_de_Jesus_by_Louis_Jobin_1878.tif: wrong data type 3 for "GainControl"; tag ignored. `TIFFReadCustomDirectory' @ warning/tiff.c/TIFFWarnings/768.
identify: Sacre_Coeur_de_Jesus_by_Louis_Jobin_1878.tif: unknown field with tag 42033 (0xa431) encountered. `TIFFReadCustomDirectory' @ warning/tiff.c/TIFFWarnings/768.
identify: Sacre_Coeur_de_Jesus_by_Louis_Jobin_1878.tif: unknown field with tag 42034 (0xa432) encountered. `TIFFReadCustomDirectory' @ warning/tiff.c/TIFFWarnings/768.
identify: Sacre_Coeur_de_Jesus_by_Louis_Jobin_1878.tif: unknown field with tag 42036 (0xa434) encountered. `TIFFReadCustomDirectory' @ warning/tiff.c/TIFFWarnings/768.
identify: Sacre_Coeur_de_Jesus_by_Louis_Jobin_1878.tif: wrong data type 7 for "RichTIFFIPTC"; tag ignored. `TIFFReadDirectory' @ warning/tiff.c/TIFFWarnings/768.
[BEGIN]page=0
alpha=False
alpha2=DirectClass sRGB
height=8912
width=6160
depth=16[END]
$ echo $?
0
$ identify --version
Version: ImageMagick 6.7.7-10 2014-03-08 Q16 http://www.imagemagick.org
Copyright: Copyright (C) 1999-2012 ImageMagick Studio LLC
Features: OpenMP

So this is likely OOM problem.

Tgr added a comment.Jan 16 2019, 8:26 PM

Let's keep this task to be about the lock wait timeout error, please.