
Reduce TimedMediaHandler VP9 transcode resolution steps
Open, Needs Triage, Public

Description

Currently we produce WebM VP9 transcodes in 240p, 360p, 480p, 720p, and 1080p. (1440p and 2160p were added, but disabled because they were too slow.)

Consider reducing the number of active steps to just 3 active and a reserved slot:

  • "small" (240p)
  • "SD" (480p)
  • "HD" (1080p)
  • "4K" (2160p reserved for future use, eg may require admin approval to enable on a file)

The system won't scale up the videos, so 576p or 720p files won't be scaled up to 1080p; they'll keep their native resolution in the "HD" slot. Likewise, a 360p source will sit in the "SD" slot without getting scaled up.
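The no-upscaling rule can be illustrated with a minimal sketch. This is not TimedMediaHandler's actual code: the function name `planned_transcodes` and the rule for skipping higher slots are assumptions made for illustration; only the slot names and heights come from the proposal above.

```python
# Hypothetical sketch of the proposed slots; not TimedMediaHandler's real logic.
# The reserved "4K" (2160p) slot is left out because it would be disabled by default.
SLOTS = [("small", 240), ("SD", 480), ("HD", 1080)]


def planned_transcodes(source_height):
    """Return (slot, output_height) pairs for a source of the given height.

    A slot never scales the source up: if the source is smaller than the
    slot's nominal height, the slot holds the source's native height.
    """
    plan = []
    previous_cap = 0
    for name, cap in SLOTS:
        if source_height <= previous_cap:
            # Assumption: a smaller slot already holds the source at native
            # resolution, so larger slots would add nothing.
            break
        plan.append((name, min(source_height, cap)))
        previous_cap = cap
    return plan


print(planned_transcodes(720))  # the "HD" slot holds 720p, not 1080p
print(planned_transcodes(360))  # the "SD" slot holds 360p, no "HD" output
```

Under these assumptions a 720p upload ends up with 240p, 480p, and a native 720p file in the "HD" slot, while a 360p upload gets 240p and a native 360p file in the "SD" slot.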

Details

Event Timeline

Restricted Application added a subscriber: Aklapper. · Dec 17 2025, 11:30 PM

While we are here, can we just drop vp8 too? The name I think is "WebM" which also confuses me a lot

VP8 is currently used for older Mac, iOS and Android devices that do not yet support hardware VP9, but do support hardware VP8.
MJPEG is used for devices that support neither (even older iOS and macOS).

I'm not entirely sure how many older Android devices are out there that do not yet support hardware VP8/VP9 decoding. It seems, however, that even the Galaxy A25 from 2023 did NOT support hardware VP9 (even though the SoC it uses does support it), so I'm pretty sure the number is still pretty substantial. Whether we can cover that with MJPEG support is hard to say without testing, I think. It's not really a common format.

The name I think is "WebM" which also confuses me a lot

Yeah in hindsight, it should have been named WebM/VP8 and WebM/VP9 respectively, with keys .vp8.webm and .vp9.webm. Things are always a combination of fileformat and codec, but most people don't really care about such details, so we sometimes leave them out.

Thanks for the context. One would assume that since VP8/VP9 is a Google thing, it would be better supported on Android. I can try to look into webrequest logs to see how it's used, but before that, here is another somewhat crazy idea: Why not transcode to H.264 as the fallback (a version of it that doesn't have any patents)? And get rid of VP8 and MJPEG? (Which I hope would pave the path to phasing out VP9 in favor of AV1 in a couple of years while keeping H.264 as the fallback for basically forever.) Obviously this should be done after sign-off from legal, but I think there aren't many active consumer devices on earth that wouldn't have H.264 support :D. I know we can't allow uploading of MP4 files since, among many things, we need to detect whether the file is H.265 or anything else we wouldn't allow, but what's stopping us from providing it?

Change #1222827 had a related patch set uploaded (by Ladsgroup; author: Amir Sarabadani):

[operations/mediawiki-config@master] Reduce VP9 transcode resolution steps

https://gerrit.wikimedia.org/r/1222827

Change #1222827 merged by jenkins-bot:

[operations/mediawiki-config@master] Reduce VP9 transcode resolution steps

https://gerrit.wikimedia.org/r/1222827

Mentioned in SAL (#wikimedia-operations) [2026-01-06T12:10:19Z] <ladsgroup@deploy2002> Started scap sync-world: Backport for [[gerrit:1222827|Reduce VP9 transcode resolution steps (T413031)]]

Mentioned in SAL (#wikimedia-operations) [2026-01-06T12:12:47Z] <ladsgroup@deploy2002> ladsgroup: Backport for [[gerrit:1222827|Reduce VP9 transcode resolution steps (T413031)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.

Mentioned in SAL (#wikimedia-operations) [2026-01-06T12:18:21Z] <ladsgroup@deploy2002> Finished scap sync-world: Backport for [[gerrit:1222827|Reduce VP9 transcode resolution steps (T413031)]] (duration: 08m 02s)

Why not transcode to H.264 as the fallback (a version of it that doesn't have any patents)?

While the majority of patents are expired by now, there are still a few which were filed later and are unavoidable, and TTBOMK it will take several more years until we would be able to do that :-/

But AV1 is free of these issues and, as brainstormed in Lisbon, 2026 might finally be the year to also transcode to AV1. I'm planning to run some comparison encodes of ffmpeg/libvpx vs ffmpeg/svtav1 later this month so that we have a sense of how much additional transcoding time we're looking at.

Why not transcode to H.264 as the fallback (a version of it that doesn't have any patents)?

While the majority of patents are expired by now, there are still a few which were filed later and are unavoidable, and TTBOMK it will take several more years until we would be able to do that :-/

Yup, I went down a rabbit hole afterwards and reached the same conclusion :(

But AV1 is free of these issues and, as brainstormed in Lisbon, 2026 might finally be the year to also transcode to AV1. I'm planning to run some comparison encodes of ffmpeg/libvpx vs ffmpeg/svtav1 later this month so that we have a sense of how much additional transcoding time we're looking at.

So my train of thought is that we need two types of transcoded files:

  • Main and good version (better name is welcome) to serve: Currently it's VP9. We transcode to different resolutions so people can pick based on bandwidth (and hopefully soon automatically via MPEG-DASH)
  • Fallback for old devices (or many Apple products 😒): We serve only one small resolution so people can still see the video.

So for the former, I'm all for slowly switching to the next evolution of VP9, which is AV1. My point here is about the latter: what fallbacks we can or should provide. H.264 is the natural fallback and could be the universal fallback for us (instead of us providing both VP8 and MJPEG). That would be one less file to transcode, and it has much better support than the other two combined (which is the goal of these transcodes). But as you said, patents :(

I need to write some important code, so naturally I decided to procrastinate. I looked at the number of requests for each transcode type. Since the URI path layout differs between transcode types, the resulting table is weird, but you get the idea:

spark-sql (default)> select reverse(split(reverse(uri_path), '[.]')[1]) as encoding, reverse(split(reverse(uri_path), '[.]')[2]) as transcode_size, count(*) as hitcount from wmf.webrequest where year = 2026 and month = 1 and day = 10 and uri_path like '%/transcoded/%' and webrequest_source = 'upload' and content_type like 'video%' and http_status = 200 group by encoding, transcode_size order by hitcount desc limit 50;
encoding        transcode_size  hitcount
vp9     480p    754204
vp9     1080p   91851
360p    ogv     76654
vp9     240p    62416
360p    webm    40899
vp9     720p    13485
mjpeg   144p    11446
vp9     360p    10203
360p    ogg     3230
360p    mpg     132
vp9     1440p   124
vp9     2160p   94
*more weird uris*
Time taken: 54.366 seconds, Fetched 21 row(s)
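The columns look swapped for some rows because the query always takes the second-to-last and third-to-last dot-separated pieces of the path, and the file names don't all have the same number of pieces. A minimal Python sketch of the same extraction follows; the example paths are hypothetical stand-ins for the naming patterns implied by the table, not real files.

```python
def classify(uri_path):
    """Mimic the Spark SQL expressions from the query above.

    reverse(split(reverse(uri_path), '[.]')[1]) is the second-to-last
    dot-separated piece; index [2] is the third-to-last.
    """
    parts = uri_path[::-1].split(".")
    encoding = parts[1][::-1]
    transcode_size = parts[2][::-1]
    return encoding, transcode_size


# Hypothetical example paths, one per naming pattern seen in the table.
examples = [
    "/transcoded/Example.webm.480p.vp9.webm",    # VP9: size, then codec, then extension
    "/transcoded/Example.webm.144p.mjpeg.mov",   # MJPEG: same layout as VP9
    "/transcoded/Example.webm.360p.webm",        # no codec piece, so columns shift
    "/transcoded/Example.ogv.360p.ogv",          # same shift for Ogg output
]
for path in examples:
    print(path, "->", classify(path))
```

For names without a separate codec piece, the size lands in the "encoding" column and the source file's extension lands in "transcode_size", which would explain rows such as "360p ogv" and "360p webm".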

One interesting thing is that if I remove the http_status = 200 filter (so 404s and 429s are included), you get a completely different picture:

spark-sql (default)> select reverse(split(reverse(uri_path), '[.]')[1]) as encoding, reverse(split(reverse(uri_path), '[.]')[2]) as transcode_size, count(*) as hitcount from wmf.webrequest where year = 2026 and month = 1 and day = 10 and uri_path like '%/transcoded/%' and webrequest_source = 'upload' and content_type like 'video%' group by encoding, transcode_size order by hitcount desc limit 50;
encoding        transcode_size  hitcount
mjpeg   144p    5069714
vp9     480p    1156755
vp9     1080p   323177
360p    ogv     154835
vp9     240p    109732
360p    webm    82564
vp9     720p    47575
vp9     360p    16152
360p    ogg     6025
360p    mpg     295
vp9     1440p   131
*other weird uris*
Time taken: 52.656 seconds, Fetched 21 row(s)

Are we accidentally blocking mjpegs? If this is just crawlers and the blocks are valid, then mjpeg is responsible for just over 1% of video plays. Maybe we can drop it? VP8 is at least 4% (4 times more!) so it makes sense to keep it.
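As a rough check of those percentages, the shares can be recomputed from the rows listed in the first (http_status = 200) table. This is only a sketch: the "weird URI" rows that were omitted from the paste are not included, and treating the "360p webm" row as VP8 is an assumption.

```python
# Hit counts copied from the http_status = 200 table above (listed rows only).
hits_200 = {
    ("vp9", "480p"): 754204,
    ("vp9", "1080p"): 91851,
    ("360p", "ogv"): 76654,
    ("vp9", "240p"): 62416,
    ("360p", "webm"): 40899,   # assumed to be the VP8 transcode
    ("vp9", "720p"): 13485,
    ("mjpeg", "144p"): 11446,
    ("vp9", "360p"): 10203,
    ("360p", "ogg"): 3230,
    ("360p", "mpg"): 132,
    ("vp9", "1440p"): 124,
    ("vp9", "2160p"): 94,
}

total = sum(hits_200.values())


def share(key):
    """Percentage of the listed successful requests that went to this row."""
    return 100 * hits_200[key] / total


print(f"mjpeg 144p: {share(('mjpeg', '144p')):.1f}%")
print(f"360p webm (assumed VP8): {share(('360p', 'webm')):.1f}%")
```

Over the listed rows this gives roughly 1.1% for MJPEG and 3.8% for the assumed VP8 row, which is in line with the figures quoted above.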

Ah, virtually all of the requests for mjpeg don't have any referrer, and the UA is "AppleCoreMedia/1.0.0.xxx (iPhone; U; CPU OS 26_2 like Mac OS X; en_us)" and similar. UA parsers say this is "Apple podcast app", which I have no idea what it is. Could be a crawler/scraper with that UA.

If you exclude requests that don't have a referrer, we are back to around 100K reqs a day (1%). Sorry for the red herring.

Ah, virtually all of the requests for mjpeg don't have any referrer, and the UA is "AppleCoreMedia/1.0.0.xxx (iPhone; U; CPU OS 26_2 like Mac OS X; en_us)" and similar. UA parsers say this is "Apple podcast app", which I have no idea what it is. Could be a crawler/scraper with that UA.

It's the native playback framework for iOS apps (AVKit). Could even be our own app, if we do not set AVURLAssetHTTPUserAgentKey. Seeing 26.2 there is surprising, though I do remember Brooke saying that Apple managed to break support recently for one of the VP codecs; could be related.

The thing is that I got my hands on a recent iPhone and tested it, and it was playing and preferring VP9, so it works in some circumstances at least. I couldn't check the UA, but I'd be very surprised if the referrer is stripped out. It doesn't make sense.

IIRC here's the breakdown you should see on Apple:

Mobile devices:

  • iOS/iPadOS 17.4 or later: WebM VP9 if there is a hardware VP9 codec in the device, otherwise WebM VP8
  • iOS/iPadOS before 17.4: MJPEG .mov
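The mobile breakdown above amounts to a small decision rule, sketched here. The function name and the tuple-based version format are assumptions for illustration, and the rule reflects only the "IIRC" description in this comment, not measured behaviour.

```python
def expected_ios_format(os_version, has_hw_vp9):
    """Format an iOS/iPadOS device is expected to pick, per the breakdown above.

    os_version is a (major, minor) tuple, e.g. (17, 4).
    """
    if os_version >= (17, 4):
        # WebM is supported: VP9 needs a hardware decoder, otherwise VP8.
        return "WebM VP9" if has_hw_vp9 else "WebM VP8"
    # Older systems fall back to the MJPEG .mov transcode.
    return "MJPEG .mov"


print(expected_ios_format((26, 2), True))
print(expected_ios_format((17, 3), True))
```

Under this rule an iOS 26.2 device would be expected to request a WebM transcode rather than MJPEG.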

The iOS WebM support transition point is relatively recent, but IIRC our stats show the vast majority of traffic on newer devices that grok the WebMs. We ought to accommodate older devices if we can IMHO but if we must make sacrifices I'm happy to have a less exotic fallback.

Desktops:

  • macOS+Safari since a version transition that I don't recall: WebM VP9 if there is a hardware VP9 codec, otherwise WebM VP8
  • macOS+Safari before that transition (which was years earlier than the iOS/iPadOS transition): MJPEG .mov

I don't think it's worth worrying about desktop Safari too old to read WebM VP8, as this was some years ago; I'd have to go back and check to confirm where the transition point is if we care about it exactly.

I chose the MJPEG .mov fallback because we didn't have legal clearance on H.264, which remains patented, though I've proposed MPEG-4 Visual (the older part 2 version, not AVC/H.264) at T358266 as an alternative that may be cleaner to legal. However, most *other* browsers don't accept MPEG-4 Visual and only play H.264 or later.

If we get clearance to use H.264 we can use that to replace both the MJPEG and the VP8.

As far as AV1, I think we should *definitely* look at expanding to it at least for aggressively-compressed transcodes for highly-viewed pages like I propose in T414988. But AV1 has less compatibility coverage than VP9 still.

(I haven't been able to test an Apple mobile device with a hardware AV1 codec yet. I am in the process of requisitioning an iPhone 17, which should have the AV1 codec built in, for testing.)

IIRC here's the breakdown you should see on Apple:

Mobile devices:

  • iOS/iPadOS 17.4 or later: WebM VP9 if there is a hardware VP9 codec in the device, otherwise WebM VP8
  • iOS/iPadOS before 17.4: MJPEG .mov

The thing is that this is not what we see: unless people are faking the UA or something like that, we are seeing a lot of requests for the 144p MJPEG on iOS 26.2: https://w.wiki/HaDe

The thing is that this is not what we see: unless people are faking the UA or something like that, we are seeing a lot of requests for the 144p MJPEG on iOS 26.2: https://w.wiki/HaDe

Yeah something's weird, and it bears investigating. The majority of the mysterious MJPEG hits however are in third-party iOS apps, so god only knows what they are or what they're doing.

I can test more thoroughly in our own app and in Safari next week when I'm back home and have access to my test devices. Give me a ping Wednesday if I haven't added anything by then :D

Is mjpeg perhaps the first entry in the sources list?

For *cough* Debbie, the sources list seems to list VP9, VP8, MJPEG, and the original Ogg source file in that order. iOS 17.4 or higher's in-WebKit <video> *should* happily handle the WebM if received in that order (it seemed to in my last set of testing at least).
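Why the order matters: a `<video>` element with several sources generally tries them in document order and uses the first one it can play. A simplified sketch of that selection follows; the short codec labels and the `first_playable` helper are hypothetical, and real browsers decide based on MIME types and codec strings rather than labels like these.

```python
# Source order as described above for the test file.
SOURCES = ["vp9", "vp8", "mjpeg", "ogg"]


def first_playable(sources, can_play):
    """Return the first source, in list order, that the client can play."""
    for source in sources:
        if source in can_play:
            return source
    return None  # nothing playable


# A client that handles WebM picks VP9 because it is listed first.
print(first_playable(SOURCES, {"vp9", "vp8", "mjpeg"}))
# A client without WebM support skips down to MJPEG.
print(first_playable(SOURCES, {"mjpeg"}))
```

With this ordering, a client would only reach MJPEG if it rejected both WebM entries, or if the list it actually received was ordered differently.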

I'll run more thorough tests once I'm home.

It's also conceivable we're doing something clever with the order of sources for sizing purposes -- the VP9 and VP8 will be at 240p (from the original source file) while the MJPEG is 144p. If it's used in a small thumbnail it's *conceivable* our ordering code isn't smart here :D

Why do you guys wanna reduce "steps" (options, i.e.) of VP9? Do you think many people have enough bandwidth to go further than "360p"? What if they don't have enough bandwidth for "480p"?

Highest priority: 1) reducing the number of outputs that scrapers might simultaneously attempt to download
Medium priority: 2) reducing the time spent generating transcode output
Very low priority: 3) reducing the disk space requirements of transcode output

Once we a) have adaptive streaming (WE DO NOT YET!) and b) instrument the adaptive streaming player so we can actually reason about user bandwidth (which is not currently taken into account in any way by the player!) we can think about re-adding resolution steps to the DASH output if we find that produces better outcomes for users.

Highest priority: 1) reducing the number of outputs that scrapers might simultaneously attempt to download

Into especially their phones? What about "attempt to stream (or buffer)"? Oh wait... How many streams and downloads total so far?

Medium priority: 2) reducing the time spent generating transcode output

Isn't a higher-speed internet needed to quickly or speedily generate the output? Oops... rhetorical question... methinks.

Once we a) have adaptive streaming (WE DO NOT YET!) and b) instrument the adaptive streaming player so we can actually reason about user bandwidth (which is not currently taken into account in any way by the player!) we can think about re-adding resolution steps to the DASH output if we find that produces better outcomes for users.

Isn't this years or centuries away?

Into especially their phones? What about "attempt to stream (or buffer)"? Oh wait... How many streams and downloads total so far?

I'm sorry I don't understand what you mean.

Isn't a higher-speed internet needed to quickly or speedily generate the output? Oops... rhetorical question... methinks.

No, internet speed is unrelated to the time spent on output.

Isn't this years or centuries away?

We're talking seriously about working on this this year to improve well-known server performance problems with large videos.

I'm sorry I don't understand what you mean.

I'll rephrase: Aren't those wanting to just stream a video, long or short, excluded from this? Or, what else do you mean when you mentioned "scrapers [who] might simultaneously attempt to download", especially if that's not the case?

We're talking seriously about working on this this year to improve well-known server performance problems with large videos.

Oh... I'm just thinking about how reducing the "steps", like 360p and 720p, would affect low-income users and those of lower-middle class, especially when they have a cheaper internet option they can afford. Never have I thought about large videos... very much.

I'll rephrase: Aren't those wanting to just stream a video, long or short, excluded from this? Or, what else do you mean when you mentioned "scrapers [who] might simultaneously attempt to download", especially if that's not the case?

Oh I see! This isn't anything to do with playback by users:

Scrapers are tools that AI companies and search engines and other people use to "crawl" or "scrape" the web, downloading as much data as they can get their hands on. Many of these tools are operated without regard for how they impact the services they scrape, and may quickly download many thousands of resources in a row, many of them simultaneously.

This has always been a huge problem for web sites like Wikipedia, and it's getting much, much, MUCH worse because of unscrupulous "AI" companies. They're costing us a lot of time and money, and we have to stem the losses.

Oh... I'm just thinking about how reducing the "steps", like 360p and 720p, would affect low-income users and those of lower-middle class, especially when they have a cheaper internet option they can afford. Never have I thought about large videos... very much.

The user-visible change would be that sometimes when a page would've shown a 360p or 720p video in a particular case, it will show a smaller 240p or 480p video instead. This will reduce the bandwidth usage on those cases, while also reducing the visual quality somewhat.
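The substitution described here (360p becomes 240p, 720p becomes 480p) matches a "largest remaining step that is not bigger than the old one" rule. The sketch below only illustrates that rule; the function name is hypothetical and the player's real selection logic may differ.

```python
# Steps that remain active under the proposal (heights in pixels).
REDUCED_STEPS = [240, 480, 1080]


def step_after_reduction(previous_step, available=REDUCED_STEPS):
    """Largest available step that does not exceed the step shown before."""
    candidates = [height for height in available if height <= previous_step]
    # Assumption: if nothing is small enough, fall back to the smallest step.
    return max(candidates) if candidates else min(available)


for old in (240, 360, 480, 720, 1080):
    print(old, "->", step_after_reduction(old))
```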

In some other cases (where the original was 360p) it may be the same size video as before but labeled as the next size up (the "480p" transcode slot holds the 360p source data without scaling it up). These cases will have at least as good quality as before, but may use more bandwidth because they're configured for a larger transcode slot.

IIRC the video player popup is sized such that you usually get at most 480p actually playing back unless you select a higher resolution manually.

Rolling forward to the future: once we push out the DASH-based adaptive streaming, the player will be able to see when people have more bandwidth available during their viewing and upgrade them to the higher resolutions. If that works cleanly, then we'll want to look at the relative perceived quality vs actual bandwidth usage of the 240p/480p/1080p transcode targets in adaptive streaming, and it may make sense to add 360p and 720p back in.