
Generate JPG thumbnails with ImageMagick + mozjpeg instead of IM alone
Closed, Declined (Public)

Description

The weight of JPEG thumbnails, while varying less than PNG thumbnails', is often a relevant factor in the loading times observed on random Wikipedia pages, even with a fast connection. It may be even more important for MediaWikis using larger thumbnails.

mozjpeg 2.0 was just released (https://github.com/mozilla/mozjpeg); see https://blog.mozilla.org/research/2014/03/05/introducing-the-mozjpeg-project/ and http://thenextweb.com/insider/2014/07/15/mozilla-releases-mozjpeg-2-0-facebook-tests-backs-jpeg-encoder-60000-donation/ for an introduction. It should be relatively trivial to implement as a "second pass" over the JPEG thumbnails we produce, right?

I know that VipsScaler is currently only used by Wikimedia wikis for PNGs (and that it's named after Vips), but would such an optimisation be in scope for this extension? If not, should we instead upstream a request to include jpgcrush-like functionality in Vips, or take some other route?
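
For illustration, a jpgcrush-like lossless second pass using mozjpeg's jpegtran might look roughly like this (a sketch only; the binary name and available flags depend on how mozjpeg is built and installed):

# Losslessly re-optimise an existing thumbnail: this rewrites the entropy coding
# and scan layout without re-encoding pixels, so no new artifacts are introduced.
jpegtran -copy all -optimize -progressive thumb.jpg > thumb-optimised.jpg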


Version: master
Severity: enhancement

Details

Reference
bz68145

Event Timeline

bzimport raised the priority of this task to Low. Nov 22 2014, 3:39 AM
bzimport set Reference to bz68145.
bzimport added a subscriber: Unknown Object (MLST).
Nemo_bis created this task. Jul 17 2014, 8:51 AM

A second pass over existing jpgs would be quite destructive and introduce compression artifacts. This is only worth considering as a swap-in replacement for ImageMagick when we want to generate jpgs. One point that the GitHub page and the announcements make no mention of, however, is which input formats are possible. The sample images on GitHub are bmp, jpg, ppm and txt. In that area I have a feeling that ImageMagick won't be replaceable. Anyway, I think it should rather go in core, where the conversion to jpgs happens with the various encoders. Definitely worth exploring!

(In reply to Gilles Dubuc from comment #1)

> A second pass over existing jpgs would be quite destructive and introduce compression artifacts.

Are you just taking this for granted, or is the consideration specifically based on tests and/or the algorithm they follow? (I didn't inspect either.)

> Are you just taking this for granted, or is the consideration specifically based on tests and/or the algorithm they follow? (I didn't inspect either.)

Any jpeg recompression is lossy, theirs included. Depending on the quality settings, you can get away with a few extra passes without necessarily causing a visual difference that people will notice, but if there's a straightforward way to avoid an extra pass, we should do it. I don't see the point of having an initial imagemagick conversion to jpg if it's going to be immediately followed by a mozjpeg pass, they both achieve the same goal. Ideally someone will hook up mozjpeg into imagemagick directly and it'll be one imagemagick option away.

(In reply to Gilles Dubuc from comment #3)

> if there's a straightforward way to avoid an extra pass

I haven't heard of any.

> I don't see the point of having an initial imagemagick conversion to jpg

Sure, that could be skipped, using Vips instead; that's why I filed this in VipsScaler.

> if it's going to be immediately followed by a mozjpeg pass, they both achieve the same goal.

The goal of our "convert" commands is rescaling.

> Ideally someone will hook up mozjpeg into imagemagick directly and it'll be one imagemagick option away.

Sure, but that may take years, which again is why I filed this in VipsScaler, an extension created to work around some limitations of ImageMagick in specific areas. :-)

ImageMagick's convert can do the rescaling/sharpening/color profile handling while targeting an uncompressed format, and then mozjpeg would take care of the JPEG compression.

Just to clarify, Vips is an image scaling program, just like ImageMagick. The main differences between Vips and ImageMagick are that ImageMagick has more options/is more flexible, and Vips has better performance characteristics (particularly memory usage on large files in some formats).
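
For example, a Vips-based thumbnailing call could look something like the sketch below; option names and bracketed save parameters vary between libvips releases, so treat the exact invocation as an assumption rather than a recommendation.

# Illustrative vipsthumbnail call for a 220x275 JPEG thumbnail at quality 80,
# with metadata stripped (flag spellings differ across libvips versions):
vipsthumbnail obama.jpg -s 220x275 -o obama-vips.jpg[Q=80,strip]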

Gave mozjpeg a spin, based on production IM parameters:

convert obama.jpg -background white -define jpeg:size=220x275 -thumbnail 220x275! -depth 8 -sharpen 0x0.8 -rotate -0 pnm: | mozcjpeg -quality 80 > obama-mozjpeg.jpg

Versus

convert obama.jpg -background white -define jpeg:size=220x275 -thumbnail 220x275! -depth 8 -sharpen 0x0.8 -rotate -0 -quality 80 obama-imagemagick.jpg

The mozjpeg image is 15kb, the IM one 25kb. The mozjpeg image appears to be slightly softer, although it's hard to tell whether it's noise being removed or actual signal. There is clear noise/color distortion in the IM one that is gone from the mozjpeg one.

I'm going to double-check that the size savings aren't due to differences in metadata stripping; if not, the size gain looks pretty amazing. I'll also try to see if I can make the sharpening match more closely between the two. A difference in sharpening is what killed my attempt at thumbnail chaining last year; some of the Commons community is very sensitive to that.
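
One quick way to check the metadata question, as a sketch (any metadata dumper would do; the file names are the ones from the commands above):

# Dump the embedded metadata of both outputs and diff them to see whether the
# size difference comes from stripped tags/profiles rather than the encoder.
exiftool -s obama-imagemagick.jpg > im-meta.txt
exiftool -s obama-mozjpeg.jpg > moz-meta.txt
diff im-meta.txt moz-meta.txt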

Gilles added a comment. Edited Jun 1 2015, 3:23 PM

As I suspected, the IM one contains the sRGB profile, the mozjpeg one doesn't have any. Adding the sRGB profile back into the mozjpeg one with exiftool, we're now looking at 25kb vs 18kb. That's still a whopping 28% lighter.
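
The re-embedding step was roughly along these lines (assuming a local sRGB.icc copy of the profile; the exact exiftool invocation is an approximation):

# Embed an sRGB ICC profile into the mozjpeg output before re-measuring file size
# (assumes sRGB.icc is available in the working directory).
exiftool "-icc_profile<=sRGB.icc" obama-mozjpeg.jpg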

Gilles renamed this task from "Apply jpgcrush (mozjpeg) over all thumbnails" to "Generate JPG thumbnails with ImageMagick + mozjpeg instead of IM alone". Jun 2 2015, 8:28 AM
Gilles raised the priority of this task from Low to Normal.
Gilles set Security to None.
Gilles closed this task as Declined. Jun 2 2015, 8:59 AM

In my quest to reproduce the same level of sharpness, mozjpeg turned out not to be as great as it first appeared to be.

Cranking up the sharpening parameter in the IM command doesn't help: mozjpeg at quality 80 always yields a softer result than pure IM. Increasing the mozjpeg quality parameter, however, allows us to be on par with IM in terms of sharpness. A mozjpeg quality of 90 is what gets us roughly the same quality. However, it also gets us the same file size...

Long story short, quality 80 in IM corresponds to 90 in mozjpeg in terms of sharpness, and at that setting the file sizes are equal, meaning that mozjpeg 3.0 provides no file size advantage over IM on the images I tested. I don't think that mozjpeg is worth pursuing at this time. If we were to serve softer JPG thumbnails just for the sake of using mozjpeg at quality 80, we might as well lower the current quality value passed to IM and achieve the same result of lighter, lower-quality thumbnails.

For reference, test image with IM at 80:

Mozjpeg at 90:

And mozjpeg at 80:

You have to look at the original size and really squint at the details to see the sharpness differences between IM at 80 and mozjpeg at 80, but they're there. Commons contributors would definitely notice and dislike it.

Tgr added a comment. Jun 2 2015, 5:37 PM

Worth a bug report to the mozjpeg project, maybe? We might be a large/interesting enough use case to motivate them to improve sharpening, if they know what direction they need to improve it in.

I don't think this is a sharpening issue, but really a quality one. The apparent softness is a side-effect of the extra artifacts. I don't find it that surprising that a given quality level in tool A corresponds to another one in tool B.

> we might as well just lower the current quality value passed to IM and achieve the same result

Is this tracked somewhere? It's worth remembering, or noting on the RfC for thumb quality.

This comment was removed by Nemo_bis.

It's tracked right here :) Which RfC are you referring to? Wikipedia Zero already implemented the ability to lower the quality of thumbnails on demand, which the mobile site could reuse if we wish to do so.

A closed task is not a very good way to track an open todo. :)
I thought that RfC (or another) was more generic, about reconsidering thumbnail quality.

I never said that lowering the quality across the board was worth considering. The current IM quality of 80 looks reasonable to me. Maybe mobile web could serve a lower quality, but even that might be a very unpopular suggestion among active Commons editors. I certainly don't feel the urge to explore that question. I'm happy to look into technological solutions to making images smaller at the same image quality, which was the point of this task, but looking into big compromises like sacrificing image quality for speed doesn't seem like a simple decision to me. It's debate material and I'm not convinced enough that it's worth it to start the discussion myself.