[Epic] Determine a strategy to store files between 5 and 100 GB
Open, Needs Triage, Public

Description

More and more, we're going to encounter requests to store very large videos in lossless format. Those videos will be huge: we have already received requests for 16 GB and 18 GB files from @Goryeo.

Current Wikimedia infrastructure uses Swift as a storage backend, but without support for large objects.

MediaWiki isn't equipped to send large objects to Swift either, as the file needs to be divided into several parts of < 5 GB each.

We need to determine a strategy at the infrastructure and MediaWiki levels to store those large files.
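To illustrate the splitting step mentioned above, here is a minimal sketch of how a file could be cut into parts that each stay under the 5 GB limit. It only computes byte ranges and does not talk to Swift. The function name `plan_segments` and the 4 GB default part size are assumptions made for this example, not anything MediaWiki or Swift defines.

```python
from typing import List, Tuple

GB = 10**9  # decimal gigabyte, matching the "5 GB" wording in the task


def plan_segments(total_size: int, segment_size: int = 4 * GB) -> List[Tuple[int, int, int]]:
    """Return (index, offset, length) for each part of a file of total_size bytes.

    segment_size is a hypothetical choice; any value below the 5 GB
    per-object limit would do.
    """
    if total_size < 0:
        raise ValueError("total_size must be non-negative")
    if segment_size <= 0:
        raise ValueError("segment_size must be positive")

    segments = []
    offset = 0
    index = 0
    while offset < total_size:
        # The last part may be shorter than segment_size.
        length = min(segment_size, total_size - offset)
        segments.append((index, offset, length))
        offset += length
        index += 1
    return segments


if __name__ == "__main__":
    # The 18 GB request mentioned in the description.
    for index, offset, length in plan_segments(18 * GB):
        print(f"part {index}: offset={offset} length={length}")
```

With these assumed values, an 18 GB file ends up as four 4 GB parts plus one 2 GB part. An actual upload would still need each part sent as its own object, plus whatever manifest the storage backend requires to reassemble them.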

References:

Event Timeline

Dereckson renamed this task from Determine if/how we can store file size greater than 32 bits to {Epic] Determine a strategy to store files between 5 and 100 Gb. Apr 9 2018, 2:59 PM
Dereckson removed Dereckson as the assignee of this task.
Dereckson updated the task description.
Dereckson added subscribers: aaron, fgiunchedi.
revi renamed this task from {Epic] Determine a strategy to store files between 5 and 100 Gb to [Epic] Determine a strategy to store files between 5 and 100 Gb. Apr 12 2018, 7:52 AM
revi subscribed.

I think it'd be good to do this. Commons aims to provide high-quality educational multimedia. The higher the quality of our resources, the more useful they'll be. Thanks.

This is related to T149847 in that we would *have* to stop moving file content around in Special:MovePage just to rename files.

AlexisJazz added subscribers: AlexisJazz, C.Suthorn.

With 4K video becoming more and more commonplace, 4GB isn't always enough. https://commons.wikimedia.org/wiki/File:Politparade.webm and https://commons.wikimedia.org/wiki/File:Genderwahn.webm are encoded to fit within MediaWiki's limit, which isn't a good sign. And 9 Mbps for 4K.. Well.. Could be worse, but.. not ideal. And they lack 1440p and 2160p transcodes, presumably due to the file size limit.

And that's one hour of video. Better not make anything longer. I recently uploaded https://commons.wikimedia.org/wiki/File:How_to_de-package_and_expose_a_GPU_flip_chip_die.webm from YouTube: 10 minutes, 1.07 GB. So within 40 minutes this would go over 4 GB. And 8K is even worse: https://commons.wikimedia.org/wiki/File:First_8K_Video_from_Space_-_Ultra_HD_VP9.webm (21.75 Mbps) and https://commons.wikimedia.org/wiki/File:Ghost_Towns_in_8K_GoPro_be_Hero.webm (33.7 Mbps).
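The arithmetic behind these figures can be checked with a short sketch. It assumes decimal units (1 GB = 10^9 bytes, 1 Mbps = 10^6 bits per second) and a 4 GB limit, as used in the comment above. The function names are made up for this example.

```python
def size_gb(bitrate_mbps: float, duration_s: float) -> float:
    """File size in GB for a constant average bitrate over duration_s seconds."""
    bits = bitrate_mbps * 1e6 * duration_s
    return bits / 8 / 1e9


def minutes_until_limit(limit_gb: float, bitrate_mbps: float) -> float:
    """Minutes of video that fit under limit_gb at the given average bitrate."""
    seconds = limit_gb * 1e9 * 8 / (bitrate_mbps * 1e6)
    return seconds / 60


def implied_bitrate_mbps(size_in_gb: float, duration_s: float) -> float:
    """Average bitrate implied by a file of size_in_gb lasting duration_s seconds."""
    return size_in_gb * 1e9 * 8 / duration_s / 1e6


if __name__ == "__main__":
    # One hour at 9 Mbps, as in the 4K uploads mentioned earlier in the thread.
    print(f"1 h at 9 Mbps: {size_gb(9, 3600):.2f} GB")

    # The 10-minute, 1.07 GB upload and how long it could run before hitting 4 GB.
    rate = implied_bitrate_mbps(1.07, 600)
    print(f"1.07 GB in 10 min: {rate:.2f} Mbps, "
          f"{minutes_until_limit(4, rate):.1f} min until 4 GB")

    # The two 8K examples.
    for mbps in (21.75, 33.7):
        print(f"{mbps} Mbps: {minutes_until_limit(4, mbps):.1f} min until 4 GB")
```

Under these assumptions, one hour at 9 Mbps comes to about 4.05 GB, and the 1.07 GB sample would cross 4 GB after roughly 37 minutes, which is consistent with the "within 40 minutes" estimate.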

Bugreporter renamed this task from [Epic] Determine a strategy to store files between 5 and 100 Gb to [Epic] Determine a strategy to store files between 5 and 100 GB. Oct 15 2023, 3:04 PM
Bugreporter updated the task description.

Personally, I think 5 GiB is plenty. Our purpose is education, not entertainment. We don't need 8K videos to explain how mitochondria work. 480p works fine. 1080p is probably overkill. And 4320p (8K) is just totally unnecessary. What use case is excluded by requiring that people downsize giant video files?

Being able to show freely licensed educational content on big screens is "not entertainment". You're still free to screen 480p in your cinema if your audience enjoys it, despite a "totally unnecessary" 4320p version potentially also existing. :)

My use case for publishing 8K (and in future 9K, 10K or 11K) videos is the possibility to make video captures and crops of these and still end up with, for example, a portrait of Greta Thunberg in a reasonable resolution for a JPEG. Also, I make videos of events that may only become interesting to historians in a hundred years, when I am long dead. I have these documents in 8K and I can now upload them in high quality, but this would be lost if I uploaded only 480p. But I am all for deleting these videos of mitochondria. Should they ever be needed, you can always make new ones; it is not as if mitochondria have changed a lot since cinema was invented 47 years ago.

add: TIFFs and PDFs (and sometimes even PNGs) are bigger than 4GiB too.

https://archive.org/details/Scotichronicon was extracted from 4K video. The text is readable but not particularly sharp. This is the only digital version of the Scotichronicon in existence as far as I know, and it exists by virtue of CGP Grey including a time-lapse in one of his videos where he shows himself turning the pages as he searches the book for a name. If the original video had been 8K120 there would be fewer missing pages and the text would be crisper. If it had been 1080p.. woof.

https://commons.wikimedia.org/wiki/File:Doria_Ragland_VOA.jpg was extracted from a 720p video. Clearly 480p would have been more than enough. https://commons.wikimedia.org/wiki/File:Bryn_Kenney_2015.jpg was extracted from a "probably overkill" 1080p source. I wouldn't call that overkill, and that's the best image we have of him.

It should be noted that many (not all) of the TIFF cases are due to not using lossless compression.