
Big transcodes fail with Exitcode: 137 - SIGKILL
Open, Needs Triage, Public

Description

Follow-up to T155750.

There are now a few transcodes with run times over 8 hours, but some are still failing: https://commons.wikimedia.org/wiki/File:Janmabhoomi,_1936.webm
https://quarry.wmflabs.org/query/15684
Exitcode: 137
startwork = 20170121151803, error = 20170121235548
Roughly 8 hours (8 h 37 min elapsed)
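
For anyone checking other rows from the Quarry query, here is a minimal sketch of the elapsed-time arithmetic. It assumes the startwork and error values are timestamps in YYYYMMDDHHMMSS form; the helper name `elapsed` is made up for illustration.

```python
from datetime import datetime, timedelta

# Assumed timestamp layout: YYYYMMDDHHMMSS
TS_FORMAT = "%Y%m%d%H%M%S"


def elapsed(startwork: str, error: str) -> timedelta:
    """Return the wall-clock time between the two timestamps."""
    start = datetime.strptime(startwork, TS_FORMAT)
    end = datetime.strptime(error, TS_FORMAT)
    return end - start


# Values from this task's description
delta = elapsed("20170121151803", "20170121235548")
print(delta)  # 8:37:45
```

Note that this is wall-clock time only; it says nothing about how much CPU time the transcode used.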

137-128 = 9 = SIGKILL
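
The same decoding can be sketched in Python, assuming the common shell convention that an exit status above 128 means "terminated by signal (status - 128)". The function name `describe_exit_status` is hypothetical, and signal names are looked up on the local platform, so this is meant to be run on a POSIX system.

```python
import signal


def describe_exit_status(status: int) -> str:
    """Interpret an exit status using the 128 + signal-number convention."""
    if status > 128:
        signum = status - 128
        try:
            # Look up the symbolic name of the signal on this platform
            name = signal.Signals(signum).name
        except ValueError:
            # Signal number not defined on this platform
            name = f"signal {signum}"
        return f"killed by {name}"
    return f"exited with code {status}"


print(describe_exit_status(137))
```

On Linux this reports SIGKILL for 137. The status alone does not say who sent the signal, so the cause still has to come from the error message or logs.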

Event Timeline

Yann created this task. Jan 24 2017, 9:17 PM
Restricted Application added a subscriber: Aklapper. Jan 24 2017, 9:17 PM
Revent added a subscriber: Revent. Jan 26 2017, 11:26 AM

The relevant table line is https://quarry.wmflabs.org/query/15801 <- error message was 'timeout'

brion added a subscriber: brion. Jan 26 2017, 6:29 PM

Hmm, maybe let's bump the limit up a little further. No sense spending hours transcoding just to fail the file. (Dialing this in is not an exact science; the wall clock and CPU times don't match up at a consistent ratio, so you'll see some failures earlier and others later.)

Change 334401 had a related patch set uploaded (by Brion VIBBER):
Increase video transcode max time from 8 to 16 hours

https://gerrit.wikimedia.org/r/334401

Paladox added a subscriber: Paladox. Feb 2 2017, 2:32 PM
Yann added a comment (edited). Mar 10 2017, 4:17 AM

More of the same

Yann added a comment. Mar 18 2017, 3:10 PM

Again, more small transcodes fail with Exitcode 137:

TheDJ renamed this task from "Big transcodes fail with Exitcode: 137" to "Big transcodes fail with Exitcode: 137 - SIGKILL". Mar 20 2017, 9:47 AM
TheDJ updated the task description.

Change 334401 merged by jenkins-bot:
[operations/mediawiki-config@master] Increase video transcode max time from 8 to 16 hours

https://gerrit.wikimedia.org/r/334401