Page MenuHomePhabricator

Video transcoding fails when firejail is enabled
Closed, ResolvedPublic5 Estimated Story PointsPRODUCTION ERROR

Description

Error

Error message:

full_message
"Executing: /bin/bash '/srv/mediawiki/php-1.42.0-wmf.17/vendor/wikimedia/shellbox/src/Command/limit.sh' ''\\''/usr/bin/firejail'\\'' '\\''--quiet'\\'' '\\''--profile=/srv/mediawiki/php-1.42.0-wmf.17/includes/shell/firejail.profile'\\'' '\\''--seccomp'\\'' '\\''--net=none'\\'' '\\''--env=TMH_OUTPUT_FILE=transcoded.webm'\\'' '\\''--env=TMH_FFMPEG_PASSES=2'\\'' '\\''--env=TMH_FFMPEG_PATH=/usr/bin/ffmpeg'\\'' '\\''--env=TMH_OPTS_VIDEO= -threads 8 -row-mt 1 -pix_fmt yuv420p -crf 31 -vcodec libvpx-vp9 -tile-columns 2 -quality good -f webm -g 599 -b:v 5952529 -minrate 2975515 -maxrate 8629743 -s 1920x1080'\\'' '\\''--env=TMH_REMUX=no'\\'' '\\''--env=TMH_OPT_VIDEOCODEC='\\''\\'\\'''\\''vp9'\\''\\'\\'''\\'''\\'' '\\''--env=TMH_OPT_SPEED=3'\\'' '\\''--env=TMH_OPTS_FFMPEG2=-max_muxing_queue_size 1024'\\'' '\\''--env=TMH_OPT_NOAUDIO=no'\\'' '\\''--env=TMH_OPTS_AUDIO= -ab 96000 -acodec libopus'\\'' '\\''--env=TMH_MOVFLAGS='\\'' -- '\\''/bin/bash'\\'' '\\''scripts/ffmpeg-encode.sh'\\''' 'SB_INCLUDE_STDERR=1;SB_CPU_LIMIT=50; SB_CGROUP='\\''/sys/fs/cgroup/memory/mediawiki/job'\\''; SB_MEM_LIMIT=4194304; SB_FILE_SIZE_LIMIT=536870912; SB_WALL_CLOCK_LIMIT=180; SB_USE_LOG_PIPE=yes' 2>&1"
Impact

Video transcoding is broken, we won't create successful transclusions of uploaded videos

Notes

This can probably be solved in two ways:

  • "Fixing" the video transclusion script to use less elaborate variables
  • Fix the escaping of --env arguments in Shellbox::Command::FirejailWrapper::wrap

Related Objects

Event Timeline

Joe triaged this task as Unbreak Now! priority.Tue, Feb 6, 3:27 PM
Joe created this task.
Joe created this object with edit policy "Custom Policy".
Joe updated the task description. (Show Details)

I think the '\\''--env=TMH_OPT_VIDEOCODEC='\\''\\'\\'''\\''vp9'\\''\\'\\'''\\'''\\'' '\\'' looks quite suspicious here.

It looks like we should just pass that without escaping, given it's not even a proper user input.

I think the '\\''--env=TMH_OPT_VIDEOCODEC='\\''\\'\\'''\\''vp9'\\''\\'\\'''\\'''\\'' '\\'' looks quite suspicious here.

It looks like we should just pass that without escaping, given it's not even a proper user input.

Comes from:

WebVideoTranscodeJob.php line 566
$optsEnv['TMH_OPT_VIDEOCODEC'] = $this->ensureShellSafe( $options['videoCodec'] );

That is not the problem, or not the only one for what it's worth; I've tested encoding an audio file and I get the same error, and there is no TMH_OPT_VIDEOCODEC set there (see reqId in Logstash).

Now, the code in Shellbox::Command::FirejailWrapper::wrap is pretty simple:

foreach ( $command->getEnvironment() as $name => $value ) {
        $cmd[] = "--env=$name=$value";
}

$builtCmd = Shellbox::escape( $cmd );

and it seems generally innocuous. Firejail seems also pretty liberal with what it accepts.

Digging further into this.

Hah, I found the issue.

Shellbox::Command::FirejailWrapper::wrap should discard any empty env entries.

What causes the problem in both cases is TMH_MOV_FLAGS being empty.

The fix, as far as TMH is concerned, is pretty simple; however I think we should really fix the shellbox wrapper.

Change 997905 had a related patch set uploaded (by Giuseppe Lavagetto; author: Giuseppe Lavagetto):

[mediawiki/extensions/TimedMediaHandler@master] Do not add env variables when they're empty

https://gerrit.wikimedia.org/r/997905

Change 997905 merged by jenkins-bot:

[mediawiki/extensions/TimedMediaHandler@master] Do not add env variables when they're empty

https://gerrit.wikimedia.org/r/997905

Change 997873 had a related patch set uploaded (by Giuseppe Lavagetto; author: Giuseppe Lavagetto):

[mediawiki/extensions/TimedMediaHandler@wmf/1.42.0-wmf.17] Do not add env variables when they're empty

https://gerrit.wikimedia.org/r/997873

Change 997937 had a related patch set uploaded (by Giuseppe Lavagetto; author: Giuseppe Lavagetto):

[mediawiki/libs/Shellbox@master] FirejailWrapper: don't add empty env variables

https://gerrit.wikimedia.org/r/997937

Change 997873 merged by jenkins-bot:

[mediawiki/extensions/TimedMediaHandler@wmf/1.42.0-wmf.17] Do not add env variables when they're empty

https://gerrit.wikimedia.org/r/997873

Mentioned in SAL (#wikimedia-operations) [2024-02-06T18:30:56Z] <oblivian@deploy2002> Started scap: Backport for [[gerrit:997873|Do not add env variables when they're empty (T356780)]]

Mentioned in SAL (#wikimedia-operations) [2024-02-06T18:32:38Z] <oblivian@deploy2002> oblivian: Backport for [[gerrit:997873|Do not add env variables when they're empty (T356780)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)

Mentioned in SAL (#wikimedia-operations) [2024-02-06T18:42:53Z] <oblivian@deploy2002> Finished scap: Backport for [[gerrit:997873|Do not add env variables when they're empty (T356780)]] (duration: 11m 57s)

While the original issue is solved, we're running into a new one - now the script exits with an OOM, and unless I'm reading it incorrectly, the memory limit seems to be just 4 MB which would explain the problem.

I'll re-check tomorrow morning, but for now this is de facto a blocker for moving to group1.

Change 997944 had a related patch set uploaded (by Giuseppe Lavagetto; author: Giuseppe Lavagetto):

[mediawiki/extensions/TimedMediaHandler@master] Set the memory limit in bytes.

https://gerrit.wikimedia.org/r/997944

Hm, $wgTranscodeBackgroundMemoryLimit is specified in KiB according to the extension.json doc comment, and the default is 2097152 (2 GiB)

Change 997944 merged by jenkins-bot:

[mediawiki/extensions/TimedMediaHandler@master] Set the memory limit in bytes.

https://gerrit.wikimedia.org/r/997944

Change 997879 had a related patch set uploaded (by Jforrester; author: Giuseppe Lavagetto):

[mediawiki/extensions/TimedMediaHandler@wmf/1.42.0-wmf.17] Set the memory limit in bytes.

https://gerrit.wikimedia.org/r/997879

Change 997879 merged by jenkins-bot:

[mediawiki/extensions/TimedMediaHandler@wmf/1.42.0-wmf.17] Set the memory limit in bytes.

https://gerrit.wikimedia.org/r/997879

Mentioned in SAL (#wikimedia-operations) [2024-02-07T07:47:34Z] <oblivian@deploy2002> Started scap: Backport for [[gerrit:997879|Set the memory limit in bytes. (T356780)]]

Mentioned in SAL (#wikimedia-operations) [2024-02-07T07:49:20Z] <oblivian@deploy2002> oblivian and jforrester: Backport for [[gerrit:997879|Set the memory limit in bytes. (T356780)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)

Mentioned in SAL (#wikimedia-operations) [2024-02-07T07:57:11Z] <oblivian@deploy2002> Finished scap: Backport for [[gerrit:997879|Set the memory limit in bytes. (T356780)]] (duration: 09m 36s)

Change 998255 had a related patch set uploaded (by Giuseppe Lavagetto; author: Giuseppe Lavagetto):

[mediawiki/extensions/TimedMediaHandler@master] WebVideoTranscodeJob: also add time limits

https://gerrit.wikimedia.org/r/998255

Change 998255 merged by jenkins-bot:

[mediawiki/extensions/TimedMediaHandler@master] WebVideoTranscodeJob: also add time limits

https://gerrit.wikimedia.org/r/998255

Change 998318 had a related patch set uploaded (by Giuseppe Lavagetto; author: Giuseppe Lavagetto):

[mediawiki/extensions/TimedMediaHandler@wmf/1.42.0-wmf.17] WebVideoTranscodeJob: also add time limits

https://gerrit.wikimedia.org/r/998318

Change 998318 merged by jenkins-bot:

[mediawiki/extensions/TimedMediaHandler@wmf/1.42.0-wmf.17] WebVideoTranscodeJob: also add time limits

https://gerrit.wikimedia.org/r/998318

Mentioned in SAL (#wikimedia-operations) [2024-02-07T15:24:51Z] <logmsgbot> lucaswerkmeister-wmde@deploy2002 Started scap: Backport for [[gerrit:998318|WebVideoTranscodeJob: also add time limits (T356780)]]

Mentioned in SAL (#wikimedia-operations) [2024-02-07T15:26:25Z] <logmsgbot> lucaswerkmeister-wmde@deploy2002 lucaswerkmeister-wmde and oblivian: Backport for [[gerrit:998318|WebVideoTranscodeJob: also add time limits (T356780)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)

Mentioned in SAL (#wikimedia-operations) [2024-02-07T15:32:40Z] <logmsgbot> lucaswerkmeister-wmde@deploy2002 Finished scap: Backport for [[gerrit:998318|WebVideoTranscodeJob: also add time limits (T356780)]] (duration: 07m 48s)

My spot tests show transcodes now work on testwiki. I'll let @brion take a look as well before I declare this bug resolved.

Change 998529 had a related patch set uploaded (by Brion VIBBER; author: Brion VIBBER):

[mediawiki/extensions/TimedMediaHandler@master] Fix regression in HLS track content type

https://gerrit.wikimedia.org/r/998529

Looks good except that mime type regression, easy fix in gerrit above ^

Change 998529 merged by jenkins-bot:

[mediawiki/extensions/TimedMediaHandler@master] Fix regression in HLS track content type

https://gerrit.wikimedia.org/r/998529

Change 998452 had a related patch set uploaded (by Brennen Bearnes; author: Brion VIBBER):

[mediawiki/extensions/TimedMediaHandler@wmf/1.42.0-wmf.17] Fix regression in HLS track content type

https://gerrit.wikimedia.org/r/998452

Change 998452 merged by jenkins-bot:

[mediawiki/extensions/TimedMediaHandler@wmf/1.42.0-wmf.17] Fix regression in HLS track content type

https://gerrit.wikimedia.org/r/998452

Mentioned in SAL (#wikimedia-operations) [2024-02-07T20:33:18Z] <brennen@deploy2002> Started scap: Backport for [[gerrit:998452|Fix regression in HLS track content type (T356780)]]

Mentioned in SAL (#wikimedia-operations) [2024-02-07T20:37:02Z] <brennen@deploy2002> brennen: Backport for [[gerrit:998452|Fix regression in HLS track content type (T356780)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)

Mentioned in SAL (#wikimedia-operations) [2024-02-07T20:43:39Z] <brennen@deploy2002> Finished scap: Backport for [[gerrit:998452|Fix regression in HLS track content type (T356780)]] (duration: 10m 20s)

From IRC:

<bvibber> brennen: confirmed fixed on test :D

Change 997937 merged by jenkins-bot:

[mediawiki/libs/Shellbox@master] FirejailWrapper: don't add empty env variables

https://gerrit.wikimedia.org/r/997937