
Persistent failure of TMH to transcode videos at specific resolutions
Closed, Resolved, Public

Description

Over the past few days, Timed Media Handler (TMH) has consistently failed to transcode a substantial number of videos (around a hundred, by my estimate) at certain resolutions, even after the transcode is reset. The original files are still 'visible' (it's not that bug). For any given file the same resolutions fail every time, but which resolutions fail differs from file to file. When reset, the transcode runs for about as long as would be expected and then fails.

The error message in Quarry is "* An unknown error occurred in storage backend "local-swift-eqiad". * An unknown error occurred in storage backend "local-swift-codfw"."

Examples are:
https://quarry.wmflabs.org/query/18951
https://quarry.wmflabs.org/query/18950
https://quarry.wmflabs.org/query/18947

Event Timeline

Looking at the logs, I find an HTTP 413 error from Swift.

FileOperation.log-20170529:2017-05-28 15:00:47 [WSqfmApAEDMAAdnL0DgAAABJ] mw1169 commonswiki 1.30.0-wmf.2 FileOperation ERROR: HTTP 413 (Request Entity Too Large) in 'SwiftFileBackend::doStoreInternal' (given '{"async":false,"op":"store","src":"/tmp/transcode_720p.webme9f710ae3af5.webm","dst":"mwstore://local-swift-codfw/local-transcoded/8/82/Wiki-award_2017_02.ogv/Wiki-award_2017_02.ogv.720p.webm","headers":[],"overwrite":true}')
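For anyone triaging similar entries, here is a minimal Python sketch that pulls the HTTP status, the backend method, and the JSON operation out of a line shaped like the one above. The regular expression and the function name `parse_file_operation_error` are assumptions based only on this single log line, not a documented MediaWiki log format.

```python
import json
import re

# Pattern inferred from the one log line quoted in this task (an assumption):
# HTTP status, reason phrase, backend method, and the JSON-encoded operation.
LOG_PATTERN = re.compile(
    r"HTTP (?P<status>\d{3}) \((?P<reason>[^)]*)\) in '(?P<method>[^']+)' "
    r"\(given '(?P<params>\{.*\})'\)"
)


def parse_file_operation_error(line):
    """Return the parsed fields as a dict, or None if the line does not match."""
    match = LOG_PATTERN.search(line)
    if match is None:
        return None
    return {
        "status": int(match.group("status")),
        "reason": match.group("reason"),
        "method": match.group("method"),
        "params": json.loads(match.group("params")),
    }


# The log line from this task, split across string literals for readability.
SAMPLE = (
    "2017-05-28 15:00:47 [WSqfmApAEDMAAdnL0DgAAABJ] mw1169 commonswiki "
    "1.30.0-wmf.2 FileOperation ERROR: HTTP 413 (Request Entity Too Large) in "
    "'SwiftFileBackend::doStoreInternal' (given '"
    '{"async":false,"op":"store",'
    '"src":"/tmp/transcode_720p.webme9f710ae3af5.webm",'
    '"dst":"mwstore://local-swift-codfw/local-transcoded/8/82/'
    'Wiki-award_2017_02.ogv/Wiki-award_2017_02.ogv.720p.webm",'
    '"headers":[],"overwrite":true}'
    "')"
)

if __name__ == "__main__":
    parsed = parse_file_operation_error(SAMPLE)
    print(parsed["status"], parsed["method"], parsed["params"]["dst"])
```

Filtering parsed entries on `status == 413` and grouping by `params["dst"]` would show which transcode targets are affected.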

@fgiunchedi, any thoughts?

Indeed, it looks like the upload exceeds the maximum Swift file size (5GB by default). Although wgMaxUploadSize is now 4GB, a transcoded version might exceed the 5GB Swift limit; I'm assuming that's what happened here.

A transcoded version has to use 'generic' transcoding settings, so it might easily be larger than an optimised original.
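The size reasoning above can be sketched as a pre-upload check in Python. The constants are assumptions taken from the comments (5GB and 4GB, read here as GiB), not values read from any live Swift or MediaWiki configuration, and the function names are hypothetical. The limit is a parameter because a proxy in front of Swift may enforce a different request body size.

```python
import os

# Assumed limits, copied from the discussion above and interpreted as GiB.
SWIFT_MAX_OBJECT_BYTES = 5 * 1024**3  # "5GB by default" per the comment
MAX_UPLOAD_BYTES = 4 * 1024**3        # wgMaxUploadSize of "4GB" per the comment


def exceeds_limit(size_bytes, limit_bytes=SWIFT_MAX_OBJECT_BYTES):
    """Return True if an object of size_bytes is larger than limit_bytes."""
    return size_bytes > limit_bytes


def check_transcode(path, limit_bytes=SWIFT_MAX_OBJECT_BYTES):
    """Return (size_in_bytes, too_large) for the file at path."""
    size = os.path.getsize(path)
    return size, exceeds_limit(size, limit_bytes)


if __name__ == "__main__":
    # An original at the upload cap still fits under the assumed Swift limit,
    # but a transcode that grows to 6 GiB would not.
    print(exceeds_limit(MAX_UPLOAD_BYTES))  # False
    print(exceeds_limit(6 * 1024**3))       # True
```

Such a check only flags candidates; the log entry remains the authoritative evidence of which limit was actually hit.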

Change 356155 had a related patch set uploaded (by Filippo Giunchedi; owner: Filippo Giunchedi):
[operations/puppet@production] tlsproxy: add support to change max_body_size

https://gerrit.wikimedia.org/r/356155

@Revent Indeed, that fails too. I've poked at it a little more and proposed a patch to fix this issue.

Change 356157 had a related patch set uploaded (by Filippo Giunchedi; owner: Filippo Giunchedi):
[operations/puppet@production] hieradata: set max_body_size for swift::proxy

https://gerrit.wikimedia.org/r/356157

Change 356155 merged by Filippo Giunchedi:
[operations/puppet@production] tlsproxy: add support to change max_body_size

https://gerrit.wikimedia.org/r/356155

Change 356157 merged by Filippo Giunchedi:
[operations/puppet@production] hieradata: set max_body_size for swift::proxy

https://gerrit.wikimedia.org/r/356157

fgiunchedi claimed this task.

@Revent I see, for example, https://commons.wikimedia.org/wiki/File:Janusz_Cedro_live_in_concert-_Amazing_Grace.ogv working now with the latest transcode at 1080p. I'm tentatively resolving, since all transcodes <4GB should just work. Please reopen if you see more issues, and thanks for reporting this bug!

@fgiunchedi Thanks for jumping onto this so quickly. I'll start working on resetting the others, and let you know if the problem still exists.

This error message has also appeared while doing server-side uploads: T166806.