
Intermittent transcode error "Invalid frame size: 0x0."
Closed, Resolved (Public)

Description

Sometimes transcodes fail with an error "Invalid frame size: 0x0" because it's literally trying to resize to 0x0 pixels. Probably a failure with metadata fetching on new uploads? Rerunning them works, but it shouldn't happen to begin with.

Found on https://commons.wikimedia.org/wiki/File:Preventing_Adverse_Childhood_Experiences_(ACEs)_Online_Training_Module_2_6of12.webm :

'/usr/bin/ffmpeg' -y -i '/tmp/localcopy_0a493a8ac642.webm' -threads 8 -row-mt 1 -crf '35' -qmin '11' -qmax '51' -vb '320000' -vcodec libvpx-vp9 -auto-alt-ref 1 -lag-in-frames 25 -g '240' -speed 4 -f webm -s 0x0 -an -pass '1' -passlogfile '/tmp/transcode_240p.vp9.webmd6ae3f98deaa.webm.log' /dev/null

Exitcode: 1
Memory: 4194304

ffmpeg version 3.2.12-1~deb9u1+wmf1 Copyright (c) 2000-2018 the FFmpeg developers
  built with gcc 6.3.0 (Debian 6.3.0-18+deb9u1) 20170516
  configuration: --prefix=/usr --extra-version='1~deb9u1+wmf1' --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --enable-gpl --disable-stripping --enable-avresample --enable-avisynth --enable-gnutls --enable-ladspa --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libebur128 --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libmp3lame --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librubberband --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libssh --enable-libtheora --enable-libtwolame --enable-libvorbis --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx265 --enable-libxvid --enable-libzmq --enable-libzvbi --enable-omx --enable-openal --enable-opengl --enable-sdl2 --enable-libdc1394 --enable-libiec61883 --enable-chromaprint --enable-frei0r --enable-libopencv --enable-libx264 --enable-shared
  libavutil      55. 34.101 / 55. 34.101
  libavcodec     57. 64.101 / 57. 64.101
  libavformat    57. 56.101 / 57. 56.101
  libavdevice    57.  1.100 / 57.  1.100
  libavfilter     6. 65.100 /  6. 65.100
  libavresample   3.  1.  0 /  3.  1.  0
  libswscale      4.  2.100 /  4.  2.100
  libswresample   2.  3.100 /  2.  3.100
  libpostproc    54.  1.100 / 54.  1.100
Input #0, matroska,webm, from '/tmp/localcopy_0a493a8ac642.webm':
  Metadata:
    COMPATIBLE_BRANDS: isomiso2avc1mp41
    MAJOR_BRAND     : isom
    MINOR_VERSION   : 512
    ENCODER         : Lavf57.83.100
  Duration: 00:00:37.55, start: -0.007000, bitrate: 330 kb/s
    Stream #0:0: Video: vp9 (Profile 0), yuv420p(tv, progressive), 1920x1080, SAR 1:1 DAR 16:9, 23.98 fps, 23.98 tbr, 1k tbn, 1k tbc (default)
    Metadata:
      HANDLER_NAME    : VideoHandler
      ENCODER         : Lavc57.107.100 libvpx-vp9
      DURATION        : 00:00:37.503000000
    Stream #0:1(eng): Audio: opus, 48000 Hz, stereo, fltp (default)
    Metadata:
      HANDLER_NAME    : SoundHandler
      ENCODER         : Lavc57.107.100 libopus
      DURATION        : 00:00:37.554000000
Invalid frame size: 0x0.

https://logstash.wikimedia.org/goto/3f03aaeea3c88bc1005515d98aaaf46c

Event Timeline

Note this appears to predate the current VP9 stuff, and appears on some VP8 transcodes too.

My suspicion is that this happens when the transcode job gets kicked off before the metadata is written to master AND synced with the slave databases.

My reasoning for that is that I see this a lot on the small dimension transcodes, which are the first ones that get kicked off. If I look at cases where this is occurring, it is where the time_addjob, the time_startwork and the timestamp of the revision all share the same timestamp. For instance: P10153
In the case of P10154 the timestamp of the transcode is even 6 seconds before the revision timestamp.
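The timestamp comparison described above can be sketched as a small check. This is only an illustration: the function name, the one-second slack, and the sample values are assumptions, and the timestamps are assumed to use the 14-digit MediaWiki format (YYYYMMDDHHMMSS). Only time_startwork is compared, since that is when the job would read the file metadata.

```python
from datetime import datetime, timedelta

TS_FORMAT = "%Y%m%d%H%M%S"  # assumed 14-digit MediaWiki-style timestamp


def parse_ts(ts: str) -> datetime:
    """Parse a 14-digit timestamp string into a datetime."""
    return datetime.strptime(ts, TS_FORMAT)


def looks_like_race(time_startwork: str, revision_ts: str,
                    slack: timedelta = timedelta(seconds=1)) -> bool:
    """Return True when the transcode started before, or within `slack` of,
    the revision timestamp, i.e. possibly before the metadata was readable."""
    return parse_ts(time_startwork) <= parse_ts(revision_ts) + slack


# Hypothetical values, not taken from P10153 or P10154.
print(looks_like_race("20200101120000", "20200101120000"))  # same second
print(looks_like_race("20200101115954", "20200101120000"))  # 6 seconds early
print(looks_like_race("20200101120100", "20200101120000"))  # a minute later
```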

When you look at where the -s option is generated, you'll find that it uses getMaxSizeTransform to access the metadata of the file and find its original frame size. If the file isn't fully processed yet, that metadata may be missing, which is probably where the 0x0 comes from.
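To make the failure mode concrete, here is a rough sketch of deriving a target frame size from the source dimensions, with a guard that refuses to continue when the source size is missing. It is not the actual getMaxSizeTransform logic; the function name, the rounding to even dimensions, and the 640x360 bound are assumptions for illustration.

```python
def scaled_size(src_w: int, src_h: int, max_w: int, max_h: int) -> tuple[int, int]:
    """Fit (src_w, src_h) inside (max_w, max_h), preserving aspect ratio.

    Raises ValueError instead of returning 0x0 when the source
    dimensions are not (yet) known.
    """
    if src_w <= 0 or src_h <= 0:
        raise ValueError(f"source dimensions not available: {src_w}x{src_h}")

    if src_w <= max_w and src_h <= max_h:
        w, h = src_w, src_h              # never upscale
    elif src_w * max_h > src_h * max_w:
        w = max_w                        # width is the limiting side
        h = src_h * max_w // src_w
    else:
        h = max_h                        # height is the limiting side
        w = src_w * max_h // src_h

    # Round down to even values (assumed encoder-friendly), minimum 2.
    return max(2, w - w % 2), max(2, h - h % 2)


print(scaled_size(1920, 1080, 640, 360))
try:
    scaled_size(0, 0, 640, 360)
except ValueError as err:
    print("refusing to transcode:", err)
```

Failing loudly here would let the job be retried later rather than handing "-s 0x0" to ffmpeg.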

Removing task assignee due to inactivity, as this open task has been assigned to the same person for more than two years (see the emails sent to the task assignee on Oct27 and Nov23). Please assign this task to yourself again if you still realistically [plan to] work on this task - it would be welcome.
(See https://www.mediawiki.org/wiki/Bug_management/Assignee_cleanup for tips how to best manage your individual work in Phabricator.)

I think adding a bit of delay by setting the job's releaseTimestamp param might help here...
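As a language-neutral illustration of what a release timestamp does, the sketch below holds each job back until its release time has passed. It is not the MediaWiki job queue; the class, method names, and the 10-second delay are assumptions.

```python
import heapq


class DelayedQueue:
    """Toy queue that only hands out jobs whose release time has passed."""

    def __init__(self):
        self._heap = []
        self._counter = 0  # tie-breaker so jobs are never compared directly

    def push(self, job, now: float, delay: float) -> None:
        """Queue `job` so that it becomes available at now + delay."""
        heapq.heappush(self._heap, (now + delay, self._counter, job))
        self._counter += 1

    def pop_ready(self, now: float):
        """Return the next job whose release time is <= now, else None."""
        if self._heap and self._heap[0][0] <= now:
            return heapq.heappop(self._heap)[2]
        return None


queue = DelayedQueue()
queue.push("transcode 240p", now=100.0, delay=10.0)
print(queue.pop_ready(now=105.0))  # still held back
print(queue.pop_ready(now=110.0))  # released
```

A delay only narrows the window; it does not guarantee that the metadata has been replicated by the time the job runs.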

The job needs the dimensions of the original file to calculate the aspect-ratio-corrected frame of the derivative. An alternative is perhaps to pass the original frame size as job parameters, so that we do not have to rely on the original file and DB access for that?

Properties of file used:

  1. exists
  2. sourcefilepath
  3. width/height/length
  4. mediahandler (and its "is interlaced" check)
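A minimal sketch of the "pass it as job parameters" idea, covering the properties listed above: the values are captured once, at the point where the job is queued, and validated so that a job with missing dimensions is never created. All names here (TranscodeJobParams, build_params, the dictionary keys) are hypothetical and not part of any existing API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TranscodeJobParams:
    """Everything the worker needs, so it does not re-read the file row."""
    source_path: str
    width: int
    height: int
    length: float      # duration in seconds
    interlaced: bool


def build_params(file_info: dict) -> TranscodeJobParams:
    """Build job parameters from already-loaded file info (hypothetical keys)."""
    if not file_info.get("exists"):
        raise ValueError("source file does not exist")
    width, height = file_info.get("width", 0), file_info.get("height", 0)
    if width <= 0 or height <= 0:
        raise ValueError(f"source dimensions not available: {width}x{height}")
    return TranscodeJobParams(
        source_path=file_info["sourcefilepath"],
        width=width,
        height=height,
        length=file_info["length"],
        interlaced=bool(file_info.get("interlaced", False)),
    )


params = build_params({
    "exists": True,
    "sourcefilepath": "/tmp/example.webm",  # placeholder path
    "width": 1920,
    "height": 1080,
    "length": 37.55,
})
print(params)
```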
TheDJ updated the task description.

This still happens, and it's mildly annoying. ;) We may need to force proper chronology on the database connections used to fetch file data? Anyway, back on the slate for cleanup work :D