Please upload large file to Wikimedia Commons
Closed, Resolved (Public)

Description

Please upload this file to Wikimedia Commons using the filename "Jeonbuk Memorial altar for 70th anniversary of the Jeju April 3rd Incident (1).webm":
https://tools.wmflabs.org/videoconvert/server/php/files/2cb17a558201eaa223299cf7e19f74f3/convert/Jeonbuk%20Memorial%20altar%20for%2070th%20anniversary%20of%20the%20Jeju%20April%203rd%20Incident%20%281%29.webm
Please use the following description:

== {{int:filedesc}} ==
{{Information
|Description={{de|1=Jeonbuk Memorial altar for 70th anniversary of the Jeju April 3rd Incident (1)}}
{{en|1=Jeonbuk Memorial altar for 70th anniversary of the Jeju April 3rd Incident (1)}}
{{sv|1=Jeonbuk Memorial altar for 70th anniversary of the Jeju April 3rd Incident (1)}}
|Source={{own}}
|Date=2018-04-05
|Author= [[User:고려|고려]]
|Permission=
|other_versions=
|other_fields=
}}
{{Location dec|12.345|23.456|}}
== {{int:license-header}} ==
{{self|cc-by-4.0}}

[[Category:Videos]]
[[Category:Videos from South Korea]]
[[Category:Videos in South Korea]]
[[Category:Files by User:고려]]

[[Category:Uploaded with videoconvert/Server-side uploads]]

Thank you!

Event Timeline

Dereckson moved this task from Backlog to Working on on the Wikimedia-Site-requests board.

Downloading to Terbium, will take a long time.

We need to figure out another way to do this, as the files are larger than 4.29 Gb:

Terbium
$ mwscript importImages.php --wiki=commonswiki --comment-ext=txt --user=고려 --overwrite --skip-dupes .
Import Images

Importing Jeonbuk Memorial altar for 70th anniversary of the Jeju April 3rd Incident (1).webm...failed. (* Could not write file "mwstore://local-swift-eqiad/local-public/4/47/Jeonbuk_Memorial_altar_for_70th_anniversary_of_the_Jeju_April_3rd_Incident_(1).webm" because it is larger than {{PLURAL:4294967296|one byte|4294967296 bytes}}.
* Could not store file "./Jeonbuk Memorial altar for 70th anniversary of the Jeju April 3rd Incident (1).webm" at "mwstore://local-swift-eqiad/local-public/4/47/Jeonbuk_Memorial_altar_for_70th_anniversary_of_the_Jeju_April_3rd_Incident_(1).webm".
)
^C[Mon Apr  9 14:17:59 2018] [hphp] [6166:7f2135c86200:0:000001] [] Lost parent, LightProcess exiting
[Mon Apr  9 14:17:59 2018] [hphp] [6167:7f2135c86200:0:000001] [] Lost parent, LightProcess exiting

If you're unable to upload the files via terbium maybe the requestor should send a hard disk to the datacenter as explained here.

The hard disk procedure is to transfer files to Terbium.

Regardless of how the files get onto Terbium (external hard disk or regular file transfer), the next step is to write the file to Swift.

Increasing the limit to the Swift maximum (5GB) is probably doable without much fuss (not 100% sure on that, though). Past that, MediaWiki needs support for Swift large objects (https://docs.openstack.org/swift/latest/overview_large_objects.html), which we don't have at all yet. I also imagine that at that point people will want to weigh in on whether disk usage is still acceptable if we allow much bigger files, and whether streaming them through Varnish still works well enough. (I'm guessing. I don't know much about these things and may be wrong.)

@zhuyifei1999 Could you reencode this file too at < 4.3 Gb target size?

Working on both. The download itself (scp) will take hours and the transcode can take longer. Hopefully it'll end up with a smaller file or I'll adjust the bitrate.

scp tools-trusty.wmflabs.org:'/data/project/videoconvert/public_html/server/php/files/2cb17a558201eaa223299cf7e19f74f3/Jeonbuk\ Memorial\ altar\ for\ 70th\ anniversary\ of\ the\ Jeju\ April\ 3rd\ Incident\ \(1\).mp4' 'Jeonbuk Memorial altar for 70th anniversary of the Jeju April 3rd Incident (1).mp4'
scp tools-trusty.wmflabs.org:'/data/project/videoconvert/public_html/server/php/files/2cb17a558201eaa223299cf7e19f74f3/Jeonbuk\ Memorial\ altar\ for\ 70th\ anniversary\ of\ the\ Jeju\ April\ 3rd\ Incident\ \(2\).mp4' 'Jeonbuk Memorial altar for 70th anniversary of the Jeju April 3rd Incident (2).mp4'

Thanks.

@Goryeo Please be patient: we need to transfer your video back and forth and re-encode it so that it fits within our current infrastructure limits.

@zhuyifei1999 Could you reencode this file too at < 4.3 Gb target size?

Note that the limit is exactly 4 GiB. Binary units are the more common definition of what most people call a Gb, unless you're using a Mac.

Okay, I've changed the limit from 4.3 to 4 on the tasks and on Wikitech. Yes, 2^n seems more mainstream than 1000^n.

Downloading done. Now attempting ffmpeg 2-pass encoding to VP9/Opus (from H.264/AAC) on the larger file of the two (the second one).

Attempting VP9 constant quality 19:

ffmpeg -y -i J2.mp4 -threads 0 -skip_threshold 0 -bufsize 6000k -rc_init_occupancy 4000 -qmin 19 -qmax 19 -vcodec libvpx-vp9 -f webm -an -pass 1 -passlogfile J2.log /dev/null

@Goryeo If using VP9 makes the file still too large, would you prefer:

  • Decrease the frame rate (it's currently at 60 fps), or
  • Decrease the resolution (currently at 1920x1080 a.k.a. 1080p), or
  • Decrease the 'quality' (probably more blurry)? We can go with either constant quality or constant bitrate here.

Doing 2nd pass. This will take a very long time because VP9 encoding is extremely slow (right now, with 28 ffmpeg threads, it encodes 0.3 frames per second, and we have 71193 frames). The Opus audio bitrate is set to match the input AAC bitrate of 255 kb/s, based on https://opus-codec.org/comparison/.

ffmpeg -y -i J2.mp4 -threads 0 -skip_threshold 0 -bufsize 6000k -rc_init_occupancy 4000 -qmin 19 -qmax 19 -vcodec libvpx-vp9 -f webm -ab 255000 -acodec libopus -pass 2 -passlogfile J2.log J2.webm

I will check the file size a few hours later and see if it will potentially fit in 4 GiB.

If the file size is 4.3 GB or less, can I upload it? If that's the case, is there only a way to do it, such as to send a hard disk, partition a file, and freeze the quality?

If you're unable to upload the files via terbium maybe the requestor should send a hard disk to the datacenter as explained here.

If the file is below 4GiB and is in a supported format (so not mp4), then yes, you may upload it with any tool that supports async chunked uploading, including but not limited to UploadWizard, Rillke's big chunked upload, and video2commons. However, since it is not below 4GiB, there's no way we can upload it without making the file smaller.
Please ignore the hard drive instructions; they are for uploading many files that are individually smaller than 4GiB but huge as a whole, which is not the case here.

Dereckson said 4.29Gb or 4.3GB, but you are talking about 4GB, so the confusion comes. Where is the limit line?

I will check the file size a few hours later and see if it will potentially fit in 4 GiB.

After 2 hours, 18 minutes, and 25 seconds of 2nd-pass encoding, we are at frame 2071 out of 71193 (00:00:35.12 out of 00:19:46.56), with a file size of 127541kB. Extrapolating linearly to the complete file, we would need:

  • 3 days, 7 hours, 18 minutes, 14 seconds of transcoding
  • 4384368kB = 4.18 GiB file size (from original 3.90GiB in mp4 format). That's a bit larger than the 4GiB threshold.
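The estimate above is a linear extrapolation from the frames encoded so far. A minimal Python sketch of that arithmetic follows; the function name `extrapolate` is hypothetical, and it assumes ffmpeg's "kB" in the progress line means 1024 bytes, which is consistent with the 4.18 GiB figure quoted here.

```python
from datetime import timedelta


def extrapolate(value, frames_done, frames_total):
    """Scale a quantity measured after frames_done frames to the whole file."""
    return value * frames_total / frames_done


FRAMES_TOTAL = 71193
elapsed_s = 2 * 3600 + 18 * 60 + 25  # 2 h 18 min 25 s of 2nd-pass encoding

# Progress reported in the thread: frame 2071, 127541 kB written so far.
size_kib = extrapolate(127541, 2071, FRAMES_TOTAL)
total_s = extrapolate(elapsed_s, 2071, FRAMES_TOTAL)

# Assumption: 1 "kB" here is 1024 bytes, so KiB -> GiB divides by 1024**2.
size_gib = size_kib / 1024**2

print(f"projected size: {size_gib:.2f} GiB")
print(f"projected time: {timedelta(seconds=round(total_s))}")
```

The same function applied to the later checkpoints in this thread (frame 18403 at 1039170kB, frame 37795 at 2267933kB) gives about 3.83 GiB and 4.07 GiB, which shows how much a constant-quality projection can drift as the content changes.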

I'm afraid that at least one of T191572#4117915 has to be done.

Dereckson said 4.29Gb or 4.3GB, but you are talking about 4GB, so the confusion comes. Where is the limit line?

4.3 GB (gigabyte) = 4.3*10^9 bytes
4 GiB (gibibyte) = 4*2^30 bytes

4 GiB is the exact value of the maximum file size that can be uploaded to Commons. The exact value in gigabytes is 4.294967296 GB, which approximates to 4.3 GB.
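For reference, the two units can be compared with a few lines of Python. This is only a sketch: the helper `fits_upload_limit` is hypothetical, and treating a file of exactly 4 GiB as acceptable is an assumption based on the "larger than 4294967296 bytes" wording of the error message earlier in this task.

```python
GB = 10**9    # gigabyte, decimal (SI) definition
GIB = 2**30   # gibibyte, binary definition

LIMIT_BYTES = 4 * GIB  # 4294967296 bytes, the value in the importImages.php error


def fits_upload_limit(size_bytes):
    """Return True if the size does not exceed the 4 GiB limit (assumed inclusive)."""
    return size_bytes <= LIMIT_BYTES


print(LIMIT_BYTES)        # the limit in bytes
print(LIMIT_BYTES / GB)   # the same limit expressed in decimal gigabytes

# A file of exactly 4.3 GB (decimal) is slightly over the limit.
print(fits_upload_limit(4_300_000_000))
```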

Sorry, I used the pedantic/hard drive/Apple definition of gigabytes, with 1 gigabyte = 10^9 bytes, instead of the pragmatic/usual one where 1 gigabyte means 1 gibibyte. Let's use GB for GiB in the future to avoid such confusion.

I'm afraid that at least one of T191572#4117915 has to be done.

frame=18403 fps=0.3 q=0.0 size= 1039170kB time=00:05:06.84 bitrate=27743.5kbits/s dup=0 drop=2 speed=0.00422x

Or maybe not. Right now at frame 18403 it's using 1039170kB so the final video can be 3.83 GiB. I'll start the transcode of the smaller file after I finish lunch.

Doing: ffmpeg -y -i J1.mp4 -threads 0 -skip_threshold 0 -bufsize 6000k -rc_init_occupancy 4000 -qmin 19 -qmax 19 -vcodec libvpx-vp9 -f webm -an -pass 1 -passlogfile J1.log /dev/null

Starting 2nd pass with ffmpeg -y -i J1.mp4 -threads 0 -skip_threshold 0 -bufsize 6000k -rc_init_occupancy 4000 -qmin 19 -qmax 19 -vcodec libvpx-vp9 -f webm -ab 256k -acodec libopus -pass 2 -passlogfile J1.log J1.webm

Mentioned in SAL (#wikimedia-cloud) [2018-04-11T00:52:32Z] <zhuyifei1999_> set cpu affinity (gfg01$ taskset -p -c 0 29926; taskset -p -c 1 30181) on the main threads of two ffmpeg processes in an attempt to speed it up T191572

Mentioned in SAL (#wikimedia-cloud) [2018-04-11T01:26:30Z] <zhuyifei1999_> undo that. load went down (2.03 -> 2.01). probably not worth it when it's running in a hypervisor T191572

Starting 2nd pass with ffmpeg -y -i J1.mp4 -threads 0 -skip_threshold 0 -bufsize 6000k -rc_init_occupancy 4000 -qmin 19 -qmax 19 -vcodec libvpx-vp9 -f webm -ab 256k -acodec libopus -pass 2 -passlogfile J1.log J1.webm

I am trying. Could you please wait a minute if it ends first?
(note : Please forgive me for my awkward expression because I am using a google translator.)

I am trying.

What are you trying to do? The video is being transcoded on my end.

(note : Please forgive me for my awkward expression because I am using a google translator.)

It's okay. You can also provide the texts in your native language (Korean?) if you feel that what you are saying may be hard to translate. We can ask some Korean Wikimedians here.

@zhuyifei1999

What are you trying to do? The video is being transcoded on my end.

Can you explain more clearly what you mean by 'on my end'?

@Goryeo It means: He is currently transcoding the video at his computer. You do not have to do anything, just wait till @zhuyifei1999 and @Dereckson will solve the problem. Is it more clear now?

Urbanecm triaged this task as Medium priority. Apr 11 2018, 8:02 AM

Normal priority.

@Goryeo It means: He is currently transcoding the video at his computer. You do not have to do anything, just wait till @zhuyifei1999 and @Dereckson will solve the problem. Is it more clear now?

Since I have the original video, I am going to revise the "Quality" part of "Commons Video Convert" ask to upload the file again.

Just wait, please. @zhuyifei1999 is Video Convert administrator and he has the original video as well, because you uploaded it to the tool. He can adjust conversion parameters such as quality manually; there's really no need for any action on your side.

Just wait, please. @zhuyifei1999 is Video Convert administrator

Just a clarification: I don't manage the videoconvert tool. I have the file because I know how that tool works and the files are not technically hidden. (T191572#4117243)

Since I have the original video, I am going to revise the "Quality" part of "Commons Video Convert" ask to upload the file again.

I don't believe videoconvert is able to get the file below 4GiB without sacrificing too much of the quality. Quartering the size is too good.

Right now the larger video is encoding at frame 37795 (out of 71193, 53% done) and has a file size of 2267933kB, so the final video might be 4.07GiB.

Again: (CC: @revi can you help here?)

I'm afraid that at least one of T191572#4117915 has to be done.

@Goryeo Could you answer T191572#4117915?

(FWIW, the current encoding parameters use a constant video quality of 19, which is what the v2c tool uses)

@zhuyifei1999

@Goryeo Could you answer T191572#4117915?

I do not know what to answer. Do you have to lower the quality of all three?

@Goryeo If using VP9 makes the file still too large, would you prefer:

  • Decrease the frame rate (it's currently at 60 fps), or
  • Decrease the resolution (currently at 1920x1080 a.k.a. 1080p), or
  • Decrease the 'quality' (probably more blurry)? We can go with either constant quality or constant bitrate here.

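(For technical reasons, the file size cannot exceed about 4.3 gigabytes (4 GiB).)

If the file is still larger than 4.3 gigabytes after encoding, you must choose one of the following options:

  • Lower the frame rate (currently 60 fps)
  • Lower the resolution (currently 1080p)
  • Lower the file quality in videoconvert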

What I am asking is whether you want to decrease the frame rate, resolution, or quality. Since the file is probably only slightly larger than 4GiB, I would recommend lowering the quality; the other two options would decrease the file size massively (roughly cutting it in half).

If the image quality is lowered, how much does it fall?
(If the picture quality is lowered, could you tell me roughly how far it would drop?)

Probably not noticeable. I'll generate snapshots of the video at the 10-second mark using qualities 0-63. This may take a while to generate.

If you are certain that it does not fall noticeably, I will choose "quality". I'm sorry to keep bothering you.
(As long as it does not drop noticeably, I will choose "quality".)

(I think I accidentally caused the previous transcodes to get OOM killed... oops, now moving the 'snapshots' generation to a separate server, encoding03 instead of gfg01, for more VCPUs)

Mentioned in SAL (#wikimedia-cloud) [2018-04-11T16:38:51Z] <zhuyifei1999_> depool v2ccelery on encoding01 as well for 'snapshot' generation T191572

I restarted ffmpeg for the first (smaller) file on gfg01 without modification, since it's almost certain it will not hit the 4GiB limit. I will work out the command for the second (larger) file (probably a constrained quality setting with a bitrate upper limit of 28278kbit/s) after I finish lunch.
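The thread does not show how the 28278kbit/s cap was obtained. One calculation that reproduces it, sketched below in Python, divides the 4 GiB limit by the duration of the larger file (00:19:46.56, from the progress report above) and treats 1 kbit as 1024 bits. That reading is an inference rather than something stated in the task, and the variable names are hypothetical.

```python
LIMIT_BYTES = 4 * 2**30           # 4 GiB upload limit
duration_s = 19 * 60 + 46.56      # 00:19:46.56, duration of the larger file

# Average total bitrate at which the file would be exactly 4 GiB.
bits_per_s = LIMIT_BYTES * 8 / duration_s

cap_kbit_1024 = bits_per_s / 1024  # assumption: 1 kbit = 1024 bits
cap_kbit_1000 = bits_per_s / 1000  # decimal kbit, for comparison

print(int(cap_kbit_1024), int(cap_kbit_1000))
```

This budget covers the whole file, so audio and container overhead also have to fit under it. A cap applied to the video stream alone is therefore an upper bound and not a guarantee that the result stays under 4 GiB.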

FWIW, the script for the 'snapshot' generation is:

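# For each VP9 quality level, encode the first 15 seconds and extract the frame at the 10 s mark.
mkdir -p qualities
for quality in {0..63}
do
    # tee saves the encoded clip while the second ffmpeg reads it from stdin; & runs each job in the background.
    nice ffmpeg -nostdin -hide_banner -y -i J2.mp4 -threads 0 -qmin $quality -qmax $quality -vcodec libvpx-vp9 -t 15 -f webm -an - | tee qualities/$quality.webm | ffmpeg -nostdin -hide_banner -y -i - -ss 00:00:10 -vframes 1 qualities/$quality.png &
    sleep 1
done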

Right now running 48..63 on encoding03 and 32..47 on encoding01. Will run 0..15 and 16..31 after these two batches finish.

List of webm quality (first 15 seconds) (0 is best, 63 is worst) vs size:

$ ls -vlh *.webm | awk '{ print $9, $5 }' | sed 's/.webm//'
0 488M
1 262M
2 204M
3 166M
4 133M
5 110M
6 98M
7 85M
8 77M
9 71M
10 66M
11 62M
12 58M
13 55M
14 52M
15 49M
16 46M
17 44M
18 42M
19 40M
20 38M
21 36M
22 35M
23 33M
24 32M
25 29M
26 26M
27 24M
28 22M
29 19M
30 17M
31 15M
32 13M
33 11M
34 9.4M
35 7.8M
36 6.6M
37 5.5M
38 4.6M
39 3.9M
40 3.2M
41 2.7M
42 2.3M
43 1.9M
44 1.6M
45 1.4M
46 1.2M
47 1005K
48 899K
49 797K
50 892K
51 793K
52 692K
53 620K
54 553K
55 486K
56 447K
57 395K
58 338K
59 307K
60 254K
61 221K
62 186K
63 144K

The extracted frames at the 10s mark have been uploaded to File:Jeonbuk_Memorial_altar_for_70th_anniversary_of_the_Jeju_April_3rd_Incident_(2).webm_ffmpeg_libvpx-vp9_qualities_0-63_at_10s.png. Feel free to check the appropriateness of the quality settings.

convert {0..7}.png +append horz0.png
convert {8..15}.png +append horz1.png
convert {16..23}.png +append horz2.png
convert {24..31}.png +append horz3.png
convert {32..39}.png +append horz4.png
convert {40..47}.png +append horz5.png
convert {48..55}.png +append horz6.png
convert {56..63}.png +append horz7.png
convert horz{0..7}.png -append +repage -virtual-pixel Black out.png

Back on track: starting constrained quality encoding on the larger file with maximum bit rate 28278kbit/s, quality hard limit range 14-24, and an aim of 19.

ffmpeg -y -i J2.mp4 -threads 0 -skip_threshold 0 -bufsize 6000k -rc_init_occupancy 4000 -qmin 14 -qmax 24 -crf 19 -b:v 28278k -vcodec libvpx-vp9 -f webm -an -pass 1 -passlogfile J2.log /dev/null

Starting 2nd pass with:

ffmpeg -y -i J2.mp4 -threads 0 -skip_threshold 0 -bufsize 6000k -rc_init_occupancy 4000 -qmin 14 -qmax 24 -crf 19 -b:v 28278k -vcodec libvpx-vp9 -f webm -ab 255k -acodec libopus -pass 2 -passlogfile J2.log J2_2.webm

The current average for the above parameters is around 20000kbit/s (encoded 4:59.84, average = 19641kbits/s, current frame = 20128kbits/s). At that bitrate the file should end up around 2.8GiB. Watching a few seconds of the transcoded video via sshfs+ffplay, I don't see bad video quality (but I did not compare with the source video).
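The 2.8GiB projection can be checked with a short Python sketch. It assumes ffmpeg's "kbits/s" means 1000 bits per second, that the bitrate stays near 20000kbit/s for the rest of the file, and that the 255k Opus audio track is added on top; container overhead is ignored. The function name is hypothetical.

```python
def projected_size_gib(video_kbit_s, audio_kbit_s, duration_s):
    """Estimate the final size in GiB from average stream bitrates (1 kbit = 1000 bits)."""
    total_bits = (video_kbit_s + audio_kbit_s) * 1000 * duration_s
    return total_bits / 8 / 2**30


duration_s = 19 * 60 + 46.56  # 00:19:46.56, duration of the larger file

projected = projected_size_gib(20000, 255, duration_s)
print(f"{projected:.1f} GiB")
```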

Let's use GB for GiB in the future to avoid such confusion.

Um, no, "GB" is ambiguous, so "GB" should not be used at all (without clarification).

(Sorry for the late, tangential comment, but I didn't want that suggestion out there uncontested.)

The first transcode should be ready by this time tomorrow. The second might take a day longer.

Mentioned in SAL (#wikimedia-cloud) [2018-04-13T20:01:31Z] <zhuyifei1999_> bind mount /srv/zhuyifei1999 /srv/v2c/ssu on encoding01 so I can copy the videos to a publicly-accessible (v2c.wmflabs.org) instance local storage rather than the slooowww NFS T191572

@Dereckson https://v2c.wmflabs.org/J1.webm 3.3GiB sha1sum = 63888c9e0b2e913a90305548dc5d02e6d17fc3cf

frame=60382 fps=0.2 q=0.0 Lsize= 3415514kB time=00:16:46.36 bitrate=27802.9kbits/s dup=0 drop=10 speed=0.004x     
video:3384620kB audio:30073kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.024073%

For future reference:

$ ffprobe -hide_banner J1.mp4 
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'J1.mp4':
  Metadata:
    major_brand     : mp42
    minor_version   : 0
    compatible_brands: isommp42
    creation_time   : 2018-04-04 06:36:56
    location        : +35.8139+127.1483/
    location-eng    : +35.8139+127.1483/
    com.android.version: 7.0
  Duration: 00:16:46.35, start: 0.000000, bitrate: 28268 kb/s
    Stream #0:0(eng): Video: h264 (High) (avc1 / 0x31637661), yuv420p(tv, bt709), 1920x1080, 28005 kb/s, SAR 1:1 DAR 16:9, 60.01 fps, 60 tbr, 90k tbn, 180k tbc (default)
    Metadata:
      creation_time   : 2018-04-04 06:36:56
      handler_name    : VideoHandle
    Stream #0:1(eng): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 256 kb/s (default)
    Metadata:
      creation_time   : 2018-04-04 06:36:56
      handler_name    : SoundHandle
$ ffprobe -hide_banner J1.webm
Input #0, matroska,webm, from 'J1.webm':
  Metadata:
    encoder         : Lavf57.25.100
  Duration: 00:16:46.37, start: 0.007000, bitrate: 27802 kb/s
    Stream #0:0(eng): Video: vp9 (Profile 0), yuv420p(tv), 1920x1080, SAR 1:1 DAR 16:9, 60 fps, 60 tbr, 1k tbn, 1k tbc (default)
    Stream #0:1(eng): Audio: opus, 48000 Hz, stereo, fltp (default)
$ ls -l J1.mp4 J1.webm
-rw-r--r-- 1 zhuyifei1999 wikidev 3555981415 Apr  9 17:26 J1.mp4
-rw-r--r-- 1 zhuyifei1999 wikidev 3497486751 Apr 14 14:26 J1.webm

Dereckson closed this task as Resolved. Apr 14 2018, 4:37 PM

Successfully uploaded at https://commons.wikimedia.org/wiki/File:Jeonbuk_Memorial_altar_for_70th_anniversary_of_the_Jeju_April_3rd_Incident_(1).webm

Thanks @Goryeo for submitting this video and for your patience, and thanks @zhuyifei1999 for your great video transcoding support; it's appreciated.

(I'm still doing the second video though)

The second one has a dedicated task: T191610

Okay then, thanks for the clarification.