
Investigate re-enabling thumbnail chaining with a single large reference thumbnail
Closed, Resolved · Public

Assigned To: Gilles
Authored By: Gilles, Jul 10 2015, 11:15 AM

Description

Thumbnail chaining had to be turned off because of over-sharpening issues: T76983

This was due to the way the sharpening calculation works, which resulted in the smallest thumbnails going through more than one sharpening pass, whereas the status quo sharpens each thumbnail only once. If, instead of chaining every thumbnail size off the next bigger one, we only used a single large reference thumbnail, we might be able to keep the amount of sharpening the same as it currently is. To be verified first by doing the math.

Event Timeline

Gilles claimed this task.
Gilles raised the priority of this task from to Medium.
Gilles updated the task description.
Gilles added a project: Performance-Team.
Gilles subscribed.

The calculation that governs sharpening (with the default values) is the following:

Sharpening is applied if the target thumbnail size is less than 0.85 times the size of the original. I.e. if the original is a 4096px-wide image, only thumbnails 3481px wide or narrower will be sharpened (0.85 × 4096 ≈ 3481.6).

This seems excessive when resizing a large image into another large image. Sharpening is only really useful at small sizes, where edges would otherwise be visually "lost".

If, with the current code, we picked 2880 as the reference thumbnail size (the biggest Media Viewer bucket), originals wider than 3388px would still see two passes of sharpening for thumbnail sizes below 2448px, which covers basically all Media Viewer buckets and all default thumbnail sizes.
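
Sanity-checking those numbers (assuming the rule simply compares the target width against 0.85 times the source width):

echo "scale=1; 2880 / 0.85" | bc    # ~3388.2: originals wider than 3388px still sharpen the 2880 reference
echo "scale=1; 2880 * 0.85" | bc    # 2448: thumbnails narrower than 2448px chained from it get a second sharpening pass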

The question is: does sharpening when going from an original wider than 3388px down to 2880 have a real impact once the 2880 image is resized down to below 2448?

I will play with DSSIM to answer that question next, to see if suppressing the sharpening for the reference large thumbnail (which will require special-casing things in the code) has a significant quality upside.

Gilles renamed this task from "Re-enable thumbnail chaining with a single large reference thumbnail" to "Investigate re-enabling thumbnail chaining with a single large reference thumbnail". Jul 10 2015, 11:32 AM
Gilles set Security to None.

Running the following (a sharpened and an unsharpened 2880 reference, a direct 220px thumbnail, and a 220px thumbnail chained from each reference):

convert -quality 80 -background white -define jpeg:size=2880x910 largeoriginal.jpg -thumbnail 2880x910! -depth 8 -sharpen 0x0.4 -rotate -0 2880px-sharpened.jpg

convert -quality 80 -background white -define jpeg:size=2880x910 largeoriginal.jpg -thumbnail 2880x910! -depth 8 -rotate -0 2880px-intact.jpg

convert -quality 80 -background white -define jpeg:size=220x70 largeoriginal.jpg -thumbnail 220x70! -depth 8 -sharpen 0x0.4 -rotate -0 220px-direct.jpg

convert -quality 80 -background white -define jpeg:size=220x70 2880px-sharpened.jpg -thumbnail 220x70! -depth 8 -sharpen 0x0.4 -rotate -0 220px-resharpened.jpg

convert -quality 80 -background white -define jpeg:size=220x70 2880px-intact.jpg -thumbnail 220x70! -depth 8 -sharpen 0x0.4 -rotate -0 220px-frombucket.jpg

convert 220px-direct.jpg 220px-direct.png

convert 220px-resharpened.jpg 220px-resharpened.png

convert 220px-frombucket.jpg 220px-frombucket.png

./dssim -o difference-resharpened.png 220px-direct.png 220px-resharpened.png
./dssim -o difference-frombucket.png 220px-direct.png 220px-frombucket.png

I end up with a DSSIM score of 0.001586 for the 220px sharpened twice and the following difference image:

difference-resharpened.png (70×220 px, 18 KB)

And a DSSIM score of 0.001585 for the 220px sharpened once, with that difference image:

difference-frombucket.png (70×220 px, 18 KB)

They are indistinguishable, imho.

I'll now check with the biggest Media Viewer bucket that would get resharpened (1920) to see if the results are the same.

For 1920, the DSSIM score of the thumbnail sharpened twice is 0.004668 and here's the difference image:

difference-resharpened.png (607×1,920 px, 1 MB)

The DSSIM score of the thumbnail sharpened once is 0.004633 and here's its difference image:

difference-frombucket.png (607×1,920 px, 1 MB)

In this case it seems like the added sharpening of the 2880 reference has no effect either.

I think we should be safe, sharpening-wise, to re-enable thumbnail chaining with a single reference thumbnail of 2880 for images wider than 2880 plus the minimum distance.
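
Roughly, the source-selection rule I have in mind would be something like this (a sketch; the variable values and file names are placeholders):

REFERENCE_WIDTH=2880
MIN_DISTANCE=50   # placeholder for the minimum required gap between original and reference
if [ "$ORIGINAL_WIDTH" -gt "$(( REFERENCE_WIDTH + MIN_DISTANCE ))" ] && [ "$TARGET_WIDTH" -lt "$REFERENCE_WIDTH" ]; then
    SOURCE="2880px-reference.jpg"   # chain from the single large reference thumbnail
else
    SOURCE="largeoriginal.jpg"      # render from the original, as the status quo does
fi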

I forgot to compare the DSSIM of the status quo thumbnail and the chained thumbnails to a lossless resize, to put some perspective on the scores for both chained scenarios. I'll do that next.
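
The comparison would be along these lines (a sketch, using an unsharpened, uncompressed resize straight from the original as the baseline):

convert largeoriginal.jpg -thumbnail 220x70! -depth 8 220px-lossless.png

./dssim -o difference-statusquo.png 220px-lossless.png 220px-direct.png
./dssim -o difference-resharpened.png 220px-lossless.png 220px-resharpened.png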

For 1920, DSSIM score of status quo is 0.001943, DSSIM score of resharpened is 0.004592.

Difference images (status quo first):

difference-statusquo.png (607×1,920 px, 1 MB)

difference-resharpened.png (607×1,920 px, 1 MB)

For 220, DSSIM score of status quo is 0.001852, DSSIM score of resharpened is 0.002177.

Difference images (status quo first):

difference-statusquo.png (70×220 px, 22 KB)

difference-resharpened.png (70×220 px, 23 KB)

The difference (introduced by JPG recompression) seems to be non-negligible for large thumbnails.

Next I'll check what the quality results would be if the 2880 reference image were a lossless resize. That isn't a silver bullet, since it would require extra storage, whereas a 2880 reference JPG doubles as a regular thumbnail (it is a Media Viewer bucket).
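
I.e. generating the reference as a PNG and chaining from it, along these lines (a sketch mirroring the commands above; the jpeg:size decoding hint is dropped since it doesn't apply to a PNG source):

convert -background white largeoriginal.jpg -thumbnail 2880x910! -depth 8 2880px-lossless.png

convert -quality 80 -background white 2880px-lossless.png -thumbnail 220x70! -depth 8 -sharpen 0x0.4 -rotate -0 220px-fromlossless.jpg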

Looking at a lossless reference thumbnail of 2880, the DSSIM scores of resharpened (0.003575) and sharpened once (0.003519) at 220 are actually worse than before.

difference-statusquo.png (70×220 px, 22 KB)

difference-resharpened.png (70×220 px, 24 KB)

difference-frombucket.png (70×220 px, 24 KB)

At 1920, the results are about the same as before. 0.004462 for resharpened and 0.004508 for sharpened once.

difference-statusquo.png (607×1,920 px, 1 MB)

difference-resharpened.png (607×1,920 px, 1 MB)

difference-frombucket.png (607×1,920 px, 1 MB)

It seems that when targeting large sizes the lossless reference thumbnail brings no real benefit, and when targeting small sizes it is actually worse than the compressed reference. This is unexpected and might be a result of the JPG compression having a softening effect.

I will redo those tests with more sizes than 220 and 1920, focusing on the DSSIM scores to see if any pattern emerges.

Here are some results: https://docs.google.com/spreadsheets/d/1XnXj1kkYaiGeWR0G-sno2juB7zSAWGCYweXS6mSqT04/edit?usp=sharing

Unlike the status quo, which has predictable DSSIM scores regardless of the target size, the chained DSSIM scores don't seem to follow any apparent rules or patterns, besides "stabilizing" for the smallest target sizes.

Two things seem clear so far, though:

  • compressing the reference thumbnail doesn't impact the final visual quality
  • resharpening the final thumbnail doesn't impact visual quality

It's basically "just as bad" as it would be if the reference thumbnail were a lossless resize and we made sure never to sharpen twice.

Now the question is whether the added visual divergence from a lossless resize is big enough to matter. From these tests, picking the chained thumbnails that score the worst and comparing them to the status quo:

600px Status quo

direct.png (297×600 px, 323 KB)

600px Chained (resharpened)

resharpened.png (297×600 px, 334 KB)

I think that swapping back and forth between those two provides the answer. By rounding off the width twice, we lose the very edge of the image in the thumbnail. Instead of applying a one-size-fits-all reference size, it might be smarter to pick one based on the original's dimensions, or at least to target something that works well with the most common sizes.

In the case of this image, we're clearly dealing with an original which is a crop of unusual dimensions: https://commons.wikimedia.org/wiki/File:Stift_Melk_Marmorsaal_Deckenfresko.JPG

The other test image also had odd dimensions: https://commons.wikimedia.org/wiki/File:Kukenan_Roraima_GS.jpg

I'll check what the most common width for an original on Commons is and run tests on an image like that, to see how well we'd be doing in that case.

SELECT COUNT(*) AS count, img_width FROM image WHERE img_minor_mime = 'jpeg' AND img_width > 2880 GROUP BY img_width ORDER BY count DESC

count   img_width
612670 3264
471473 3648
463817 4608
417620 4000
321240 3456
302536 3072
275045 4288
261994 3000
241761 5184
221878 3872
177903 3888
154945 4320
135166 4928
121305 3008
113605 4272
92014 4752
90959 5616
52859 4256
52350 5472
48405 4896
47317 4912
47085 3240
43978 3216
43688 6000

Unlike the two test images I had picked, all of these top widths are even, meaning that if the reference thumbnail targeted half the original's width, it would do a good job of not dropping edges of the image. However, there is a storage drawback to doing so, because half or a quarter of the original would rarely correspond to a common thumbnail size.

There is also the issue of almost all these most common sizes being smaller than 5760 (2 times 2880), meaning that the 2880 Media Viewer bucket would still be rendered from the original. Being the largest, the 2880 thumbnail is the most expensive to render, and the "half original" reference thumbnail would certainly be in the same cost range.

2880 still seems like a good option to try. Sharpening-wise, we should be fine, unlike my previous attempt to deploy chaining, but the loss of edge due to rounding is likely to be unpopular. I'll run tests at 2880 on a couple of images that have the most common dimensions to see if they would be affected. It's possible that the phenomenon was particularly bad for the test images I picked because they had odd widths.

The test with two images that have the most common widths above 2880 (3264 and 3648) shows excellent DSSIM scores for chained thumbnails. This confirms that the issue with the previous scores was entirely due to the rounding cutting off the edge of the image.
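
That makes sense: assuming the typical 4:3 heights for those camera resolutions (2448 and 2736), the 2880 reference height comes out to an exact integer, so no rounding happens at all:

echo "2880 * 2448 / 3264" | bc    # 2160 exactly for a 3264x2448 original
echo "2880 * 2736 / 3648" | bc    # 2160 exactly for a 3648x2736 original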

Given that the common case performs very well (for example on the last test image the status quo thumbnails achieve an average of 99.82% visual similarity and the chained resharpened thumbnails 99.8%), chaining with a single reference thumbnail of 2880 seems like the way to go.

Should the rounding issue turn out to be too problematic in practice, we should be able to devise a formula that tells us whether, for a given original width, the rounding will cause a tiny crop, in which case we could decide not to apply chaining to that particular image.
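
For example (a hypothetical check, assuming the crop is caused by the aspect-ratio drift introduced when the reference height gets rounded), we could measure the drift and skip chaining whenever it would shift the smallest thumbnail size by more than about half a pixel (W, H, REF and SMALLEST are just example inputs):

awk -v W=3264 -v H=2448 -v REF=2880 -v SMALLEST=220 'BEGIN {
    h_ref = int(REF * H / W + 0.5)        # rounded height of the reference thumbnail
    drift = H / W - h_ref / REF           # aspect-ratio drift caused by that rounding
    if (drift < 0) drift = -drift
    if (drift * SMALLEST > 0.5) print "skip chaining for this original"
    else print "safe to chain through the " REF "px reference"
}'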