
API uploads fatal with UploadChunkFileException: Error storing file in '/tmp' backend-fail-internal



Request URL:

UploadChunkFileException: Error storing file in '/tmp/phpYHAPWZ': backend-fail-internal; local-swift-codfw
#0 /srv/mediawiki/php-1.34.0-wmf.14/includes/upload/UploadFromChunks.php(275): UploadFromChunks->outputChunk(string)
#1 /srv/mediawiki/php-1.34.0-wmf.14/includes/api/ApiUpload.php(226): UploadFromChunks->addChunk(string, integer, integer)
#2 /srv/mediawiki/php-1.34.0-wmf.14/includes/api/ApiUpload.php(132): ApiUpload->getChunkResult(array)
#3 /srv/mediawiki/php-1.34.0-wmf.14/includes/api/ApiUpload.php(104): ApiUpload->getContextResult()
#4 /srv/mediawiki/php-1.34.0-wmf.14/includes/api/ApiMain.php(1583): ApiUpload->execute()
#5 /srv/mediawiki/php-1.34.0-wmf.14/includes/api/ApiMain.php(531): ApiMain->executeAction()
#6 /srv/mediawiki/php-1.34.0-wmf.14/includes/api/ApiMain.php(502): ApiMain->executeActionWithErrorHandling()
#7 /srv/mediawiki/php-1.34.0-wmf.14/api.php(86): ApiMain->execute()
#8 /srv/mediawiki/w/api.php(3): require(string)
#9 {main}


Unknown. Special:NewFiles still shows new files being uploaded, so at least it’s not preventing all uploads.


From logstash:

  • New in 1.34-wmf.14.
  • Affects (naturally).
  • Seen several dozen times already in the short time it's been out.

Event Timeline

LarsWirzenius triaged this task as Unbreak Now! priority.Jul 17 2019, 3:55 PM
Cparle added a subscriber: fgiunchedi.
Cparle added a subscriber: Cparle.

@fgiunchedi I tagged you cos @Gilles is away and I dunno who else to ask about swift ...

:D afaik we've been working almost exclusively on js/ui stuff lately, so I don't think it's us

Adding SRE per SRE-swift-storage / @fgiunchedi

(There's no tag for the Infrastructure Foundations subteam of SRE is there?)

fgiunchedi lowered the priority of this task from Unbreak Now! to Medium.Jul 18 2019, 8:29 AM

The errors from UploadChunkFileException:

Searching for local-swift-codfw on the same time period:

There's a bunch of errors in this form over four minutes:

2019-07-17T14:55:39	mw1230	ERROR	HTTP 401 (Unauthorized) in 'SwiftFileBackend::doStoreInternal' (given '{"async":false,"op":"store","src":"/tmp/phpYHAPWZ","dst":"mwstore://local-swift-codfw/local-temp/d/d3/16rd13foxoo4.7etxt1.2927633.jpg.1","headers":[],"overwrite":true}')

I believe these are due to MW's authentication token for Swift expiring. I'm not sure whether there's logic to retry and refresh the auth in cases like this, though. I doubt it is a newly introduced bug, so I'm boldly setting the priority to normal; not a train blocker IMHO.
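To illustrate the kind of "retry and refresh the auth" logic in question, here is a minimal Python sketch. It is not how SwiftFileBackend behaves (that is exactly what is unknown here); `TokenCache`, `Unauthorized`, and `store_with_reauth` are hypothetical names invented for this example, and the storage call is simulated.

```python
class Unauthorized(Exception):
    """Raised by the (hypothetical) storage call when the backend answers HTTP 401."""


class TokenCache:
    """Caches an auth token and fetches a new one after invalidation."""

    def __init__(self, fetch_token):
        self._fetch_token = fetch_token  # callable returning a fresh token
        self._token = None

    def get(self):
        if self._token is None:
            self._token = self._fetch_token()
        return self._token

    def invalidate(self):
        self._token = None


def store_with_reauth(store, tokens, max_attempts=2):
    """Call store(token); on a 401, drop the cached token and try again.

    Re-raises Unauthorized once max_attempts calls have all failed.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return store(tokens.get())
        except Unauthorized:
            tokens.invalidate()  # force a fresh token on the next attempt
            if attempt == max_attempts:
                raise


if __name__ == "__main__":
    issued = []

    def fetch_token():
        issued.append(f"token-{len(issued) + 1}")
        return issued[-1]

    def fake_store(token):
        # Simulate a backend that rejects the first (expired) token.
        if token == "token-1":
            raise Unauthorized("expired")
        return f"stored with {token}"

    print(store_with_reauth(fake_store, TokenCache(fetch_token)))
```

With a wrapper along these lines, a single expired token would cost one extra request instead of surfacing as a failed chunk upload; a 401 that persists after re-authentication would still propagate to the caller.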

@fgiunchedi If this is not blocking the train, please remove the train task from parent tasks.

mmodell changed the subtype of this task from "Task" to "Production Error".Aug 28 2019, 11:06 PM

This is a 1y+ production error still waiting to be investigated. There is some reason to suspect it might be infrastructure related, but before SRE can help here it will first need to be better understood and quantified what goes wrong in Swift (if indeed that's the case).
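As a starting point for quantifying this, a rough Python sketch that counts the 401 errors per minute and host from exported log lines. It assumes the tab-separated layout of the log line quoted above (timestamp, host, level, message); `count_401_per_minute` is a hypothetical helper, and the sample message payload is abbreviated.

```python
from collections import Counter


def count_401_per_minute(lines, backend="local-swift-codfw"):
    """Count HTTP 401 errors mentioning `backend`, keyed by (minute, host)."""
    counts = Counter()
    for line in lines:
        parts = line.rstrip("\n").split("\t", 3)
        if len(parts) != 4:
            continue  # skip lines that do not match the assumed 4-column layout
        timestamp, host, level, message = parts
        if level == "ERROR" and "HTTP 401" in message and backend in message:
            # Truncate the timestamp to minute resolution, e.g. "2019-07-17T14:55".
            counts[(timestamp[:16], host)] += 1
    return counts


if __name__ == "__main__":
    # One line modelled on the log entry quoted in this task (payload abbreviated).
    sample = "\t".join([
        "2019-07-17T14:55:39",
        "mw1230",
        "ERROR",
        "HTTP 401 (Unauthorized) in 'SwiftFileBackend::doStoreInternal' "
        "(given '{\"op\":\"store\",\"dst\":\"mwstore://local-swift-codfw/local-temp/...\"}')",
    ])
    print(count_401_per_minute([sample]))
```

Comparing these per-minute counts against the UploadChunkFileException counts for the same window would show whether the two line up or whether other failure modes are involved.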