Phabricator

video2commons stashfailed
Open, Needs TriagePublic

Description

When trying to upload a certain video through video2commons, the upload fails with:

An exception occurred: TaskError: pywikibot.Error: APIError: stashfailed: Internal error: Server failed to publish temporary file. [help:See https://commons.wikimedia.org/w/api.php for API usage. Subscribe to the mediawiki-api-announce mailing list at <https://lists.wikimedia.org/postorius/lists/mediawiki-api-announce.lists.wikimedia.org/> for notice of API deprecations and breaking changes.]

which is... pretty unspecific. I'm not sure if it's a problem on video2commons side or at Commons. Original file is 366M, manually converting it results in a 620M file. So it's probably not a filesize issue, but I have no further idea on how to debug this.

Event Timeline

This seems to have been given the internal video2commons id 18bc7a39f68a4169

It may indeed be a filesize issue.

Original file is 366 MB (383646205 bytes)
The initial conversion we did resulted in a 620 MB file (649792371 bytes)
However, a second execution, more closely following the video2commons command, resulted in a file of 1.1 GB (1087952597 bytes)
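For reference, a quick sanity check of the reported sizes against a 4 GiB cap (this matches the $wgMaxUploadSize value discussed later in this task):

```python
# Sanity check: the file sizes reported above versus a 4 GiB upload cap.
sizes = {
    "original mp4": 383646205,
    "first conversion (crf 30)": 649792371,
    "second conversion (crf 20)": 1087952597,
}
MAX_UPLOAD = 4 * 1024**3  # 4 GiB

for name, size in sizes.items():
    print(f"{name}: {size / 1024**2:.0f} MiB, under limit: {size < MAX_UPLOAD}")
```

All three files, including the 1.1 GB one, are well under that cap, so a hard size limit alone would not explain the failure.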

Command used on initial conversion (following Help:Converting_video):

ffmpeg -i "$input" -c:v libvpx-vp9 -b:v 0 -crf 30 -pass 1 -row-mt 1 -an -f webm -y /dev/null &&
ffmpeg -i "$input" -c:v libvpx-vp9 -b:v 0 -crf 30 -pass 2 -row-mt 1 -c:a libopus "$output"

Command used on second conversion:

ffmpeg -y -i DESTACADOV4.mp4 -max_muxing_queue_size 4096 -threads 16 -row-mt 1 -crf 20 -qmin 1 -qmax 51 -b:v 0 -vcodec libvpx-vp9 -tile-columns 4 -auto-alt-ref 1 -lag-in-frames 25 -speed 4 -f webm -ss 0 -an -pass 1 -passlogfile DESTACADOV4.mp4.vp9.webm.log /dev/null
ffmpeg -y -i DESTACADOV4.mp4 -max_muxing_queue_size 4096 -threads 16 -row-mt 1 -crf 20 -qmin 1 -qmax 51 -b:v 0 -vcodec libvpx-vp9 -tile-columns 4 -auto-alt-ref 1 -lag-in-frames 25 -speed 4 -f webm -ss 0 -pass 2 -passlogfile DESTACADOV4.mp4.vp9.webm.log -c:a libopus DESTACADOV4-2.webm

(based on the messages shown by video2commons plus a bit of guessing)

It is noticeable that the second run took about half the time of the initial one on the same hardware, and in both cases it was much, much quicker than on Toolforge.

So, if the uploadstash error is indeed because the generated file ended up being too big, failing to upload is expected, but the error message provided should ideally be much clearer.

$wgMaxUploadSize is set in InitialiseSettings.php to 4 GB. I was under the impression that it was 1 GB, but it has been 4 GB since 46e69532.

FWIW, uploading large files is known to fail 'sometimes': the larger the file, the more likely it is to fail. I'm pretty sure it's a bug in MediaWiki's chunked uploading code, since v2c is 'sometimes' able to upload 3+ GiB files just fine with the exact same code.
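To illustrate why failure probability grows with file size under chunked uploading, here is a minimal sketch (purely illustrative; `send` is a hypothetical transport callback, not the real MediaWiki chunked-upload API). Each chunk is one more round trip that can hit a transient server-side error, so larger files accumulate more chances to fail unless every chunk is retried:

```python
import hashlib

def upload_chunked(data: bytes, chunk_size: int, send, max_retries: int = 3):
    """Illustrative sketch of chunked uploading with per-chunk retries.

    `send(offset, chunk)` is a hypothetical transport function that may
    raise IOError on transient failures. More chunks means more
    opportunities for such a failure, matching the observation that
    larger files fail more often.
    """
    offset = 0
    while offset < len(data):
        chunk = data[offset:offset + chunk_size]
        for attempt in range(max_retries):
            try:
                send(offset, chunk)
                break
            except IOError:
                if attempt == max_retries - 1:
                    raise  # give up after max_retries attempts
        offset += len(chunk)
    # Return a checksum so the caller can verify the assembled file.
    return hashlib.sha1(data).hexdigest()
```

Without the retry loop, a single transient error anywhere in the sequence aborts the whole upload, which would make failures roughly proportional to the number of chunks.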

iirc, I mentioned this chunked uploading issue to @bd808 during the 2019 hackathon. I don't remember exactly what he said.

Thanks. That's weird. :/

If this is something "common", it might be helpful to provide a path to directly reupload on failure, skipping the slow transcoding process.

On a tangential note: did I guess the last command correctly? I have no idea what those other ffmpeg options are for, or when they are better than the defaults.

That error message is the api-error-publishfailed localization key. It is used in PublishStashedFileJob.php when any exception is caught while moving the stashed file into primary storage.

I looked around in logs for a bit to see if I could get lucky and find the underlying problem. I think that https://logstash.wikimedia.org/goto/cbd3c5bda9f696cee76105ef7d1f5243 might be this failure. The trigger there seems to be a "MySQL server has gone away (db2090)" error while MediaWiki\User\ActorStore::findActorIdInternal was looking up @Platonides' account.

We may submit it again if needed. I would expect findActorIdInternal to be a quick action. I guess the db connection timed out while it was doing all the slow file work, so the next db action (which happened to be that findActorIdInternal call) failed.
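The "server has gone away" failure mode described above can be sketched as follows. A toy model (hypothetical class names, not MediaWiki code): a connection sits idle past MySQL's wait_timeout while the slow file work runs, and the next query on it fails unless the code pings and reconnects first.

```python
import time

class StaleConnectionError(Exception):
    """Stand-in for MySQL's 'server has gone away' error."""

class DBConnection:
    """Toy connection that 'times out' if idle longer than wait_timeout."""
    def __init__(self, wait_timeout: float):
        self.wait_timeout = wait_timeout
        self.last_used = time.monotonic()

    def query(self, sql: str):
        if time.monotonic() - self.last_used > self.wait_timeout:
            raise StaleConnectionError("MySQL server has gone away")
        self.last_used = time.monotonic()
        return f"ok: {sql}"

def safe_query(conn: DBConnection, sql: str):
    """Ping-and-reconnect pattern: if the connection went stale during
    slow non-DB work (e.g. publishing a large stashed file), reconnect
    instead of letting the next lookup fail."""
    try:
        return conn.query(sql)
    except StaleConnectionError:
        conn.last_used = time.monotonic()  # stand-in for reconnect()
        return conn.query(sql)
```

Under this model, any job that does minutes of file I/O between two queries on the same connection will hit the stale-connection path, which fits the actor lookup failing right after the slow publish step.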

If this is something "common", it might be helpful to provide a path to directly reupload on failure, skipping the slow transcoding process.

I once tried to write 'resume' support, but due to the complexity of storing and referring to large files I never got it done.

did I guess correctly the last command?

Looks fine to me. The main factor behind the file size difference is the crf (constant rate factor) value used.
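As a rough illustration of the crf effect, using the sizes reported earlier in this task (lower crf means higher quality and a larger output; the first conversion used -crf 30, the second -crf 20):

```python
# Output sizes reported earlier in this task for the same source file.
size_crf30 = 649792371   # first conversion, -crf 30
size_crf20 = 1087952597  # second conversion, -crf 20

ratio = size_crf20 / size_crf30
print(f"The crf 20 output is {ratio:.2f}x the size of the crf 30 output")
```

So dropping crf from 30 to 20 alone accounts for the output growing by roughly two thirds here; the other flags in the second command mostly affect encoding speed, not size.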

Another user has gotten "FAILED: stashfailed: Internal error: Server failed to store temporary file." - see edit https://commons.wikimedia.org/w/index.php?title=Commons:Help_desk&diff=prev&oldid=830516577 .