Thumbnails should embed tinyRGB instead of sRGB when they include that ICC profile
Closed, ResolvedPublic

Description

courtesy of Facebook which set out to make a lighter version of sRGB.

This will reduce the size of these images by 2.5kb, which for small thumbnails can be a very significant percentage.

Gilles created this task.Jun 1 2015, 11:30 AM
Gilles updated the task description. (Show Details)
Gilles raised the priority of this task from to Normal.
Gilles claimed this task.
Gilles added subscribers: Bawolff, Spage, Raymond and 7 others.

I experimented a little bit and sadly it seems like Facebook's ICC profile does cause some color shift. On the first image I tried, I noticed a slight variation of a purple hue between the sRGB images and the tinyRGB one. This difference was only experienced using desktop software (not browsers) and on an uncalibrated monitor. That would be the default desktop experience for the majority of users, though.

Here's the test image:

sRGB

tinyRGB

You will in all likelihood not see a difference between those images inside a browser.

Krinkle removed a subscriber: Krinkle.Jun 1 2015, 12:52 PM
Steinsplitter moved this task from Incoming to Backlog on the Commons board.Jun 1 2015, 2:15 PM
Gilles closed this task as Declined.Jun 2 2015, 9:07 AM

I think that this is a no-go for now based on the obvious color shift I found on the first test image I picked. While browsers wouldn't be affected, people do download our "thumbnails" for reuse in desktop software, particularly from Media Viewer where we encourage users to download sizes other than the original. The hue differences are probably tolerable for Facebook because the images displayed are only to be viewed in the browser and the UX encourages people to download the original.

Of course in theory we could serve thumbnails "meant for display" using tinyRGB and thumbnails "meant for download" with sRGB, but the storage/caching costs would be huge if we store a copy of both for every size. Unless at some point we decide to treat thumbnails differently based on their resolution... but with the market being currently so divided in terms of device screen density, I find it really hard to find a good limit on the server-side for sizes that we would consider to be thumbnails and sizes we would consider to be non-thumbnails.

Just for reference, the info about the profile at facebook is at https://www.facebook.com/note.php?note_id=10150630639853920 (Just dropping here, as it was hard for me to find when I was looking for it)

They claim that "We checked our results with the official standard for measuring color accuracy (DeltaE CIE94) and it showed our maximum error was about half of what is noticeable by human vision.", which is interesting given gilles comment above. I guess that means you can't tell the difference between a colour managed sRGB and a colour managed tinyRGB, but if you strip out both profiles, you might be able to tell the difference on the un-colour managed version.

I wonder what the results would be if we replaced sRGB with tinyRGB profile, but didn't adjust any of the actual colours in the source image for the new profile.

I experimented a little bit and sadly it seems like Facebook's ICC profile does cause some color shift. On the first image I tried, I noticed a slight variation of a purple hue between the sRGB images and the tinyRGB one. This difference was only experienced using desktop software (not browsers) and on an uncalibrated monitor. That would be the default desktop experience for the majority of users, though.

Here's the test image:

sRGB

tinyRGB

You will in all likelihood not see a difference between those images inside a browser.

Out of curiosity, can you notice a difference between the following images (The full size versions, not the phab preview)?

sRGB

alternate tinyrgb

I'm also curious which program you tested in. Web browsers are colour managed, so I assumed that if you were seeing something different on the desktop, the desktop was ignoring colour profiles. However, if you remove the colour profile from both your test images, you get identical (same sha1 hash) image. So if your desktop software is just ignoring profiles, then both images should look identical. So I'm curious what your desktop software is actually doing.

Gilles reopened this task as Open.Jun 2 2015, 2:31 PM

I swapped the new profile in without converting the rgb values, using exiftool. The software where I saw the color differences was Preview.app on OS X desktop, on an uncalibrated monitor. Trying on Photoshop now on that same monitor, I don't see a difference. It could be a bug in Preview.app. Preview.app is definitely interpreting the color profiles, since that's the only difference between the two images, but it could be calculating some colors slightly incorrectly for tinyRGB as opposed to the browsers and Photoshop.

Gilles added a comment.Jun 2 2015, 2:33 PM

Either that, or I have superhuman color vision...

Gilles added a comment.Jun 2 2015, 2:39 PM

For reference this is how one assigns a new embedded ICC profile using exiftool without touching the pixels:

exiftool "-icc_profile<=tinyrgb.icc" foo.jpg

Where tinyrgb.icc is the file linked in this task's description.

Gilles added a comment.Jun 2 2015, 2:47 PM

Ah, opening the profile in ColorSync, I do see that the "FB" reference was indeed a copyright:

But sRGB, which we serve by the millions in thumbnails, has one just like it:

So... do we care? We could reach out to Facebook to figure out what the implications of using it would be. I don't think that should stop us from implementing the code, we can always try to use a similar approach Facebook did to create our own open equivalent of tinyRGB should the profile's copyright be problematic.

For reference, the image I made for the comment at Tue, Jun 2, 1:15 PM (I miss bugzilla comment numbers) was made with actually changing the pixels via jpegicc (part of lcms-utils) . I'm not sure if that makes a difference. My reading of the facebook post is that you should be able to just do the exiftool version, but how these colour profile things actually work is a bit over my head.

If/when we actually decide to do this, I think we should just ask facebook. Its not like this colour profile thing is core to their business, hopefully they would be fine with just open sourcing it.

Gilles added a comment.EditedJun 3 2015, 8:23 AM

We've contacted Facebook and they've started the discussion on their end between their engineers, legal department and open source folks about making the ICC profile open.

So far I can't find a way to have IM swap ICC profiles in a non-destructive manner. It will always actually apply the profile as a change, like with this command:

convert original.jpg +profile sRGB -profile tinyrgb.icc tinyrgb.jpg

I've verified that it is a destructive operation by trying to convert the tinyrgb image back to srgb after the above command and it indeed doesn't give me back the original data.

It doesn't seem like IM provides any other option. I think that means we'll have to rely on exiftool or similar to achieve this in the least destructive manner possible, by just replacing one profile with the other. Which can easily undo.

Change 215605 had a related patch set uploaded (by Gilles):
Add exiftool to multimedia role

https://gerrit.wikimedia.org/r/215605

If we are shrinking the image at the same time, does it matter that the profile add is destructive?

Awesome stuff guys. Happy to see this happening! :)

If we are shrinking the image at the same time, does it matter that the profile add is destructive?

Yes it does, because we're basically losing signal, and as a result reducing the quality of the image. The profiles are compatible, we shouldn't let IM's capabilities or lack thereof drive what our code does.

Change 215605 merged by jenkins-bot:
Add exiftool to multimedia role

https://gerrit.wikimedia.org/r/215605

I would imagine that most of the difference is anytime image magick does anything it uncompresses image, and then resaves as jpg (which during shrinking would happen anyways). Although there's also a little bit of clipping when converting between colour profiles (I did a little test and found a few examples of clipping. For example [in decimal], (49,34, 15) -> (49.00,33.90,14.44) = ( 49, 34, 14). but going the other way ( 49, 34, 14) -> (48.98,34.06 ,14.14) = ( 49, 34, 14). So the transformation is a little lossy. Anyways, exiftool sounds good to me. Once its on the server, it would also make T20871 much easier when we decide what to do about that bug.

Gilles added a subscriber: fred.Jun 9 2015, 7:09 AM

@fred any news about the tinyRGB licensing discussion on FB's side?

Krinkle moved this task from Tag to Next-up on the Performance board.Jun 10 2015, 10:29 PM
fred added a comment.Jun 12 2015, 4:58 PM

We're likely to make this available under CC0 - hopefully will have this finalized soon :)

fred added a comment.Jun 12 2015, 5:39 PM

Yep:

TinyRGB can be redistributed under the terms of CC0, available at https://creativecommons.org/publicdomain/zero/1.0/

Glaisher removed a subscriber: Glaisher.Jun 12 2015, 5:39 PM

Yep:

TinyRGB can be redistributed under the terms of CC0, available at https://creativecommons.org/publicdomain/zero/1.0/

Thank you very much for making this happen! I'll start working on the implementation of our tinyRGB support right away.

Shell command using exiftool to swap srgb for tinyrgb only if the image's current profile is sRGB:

exiftool -DeviceModelDesc -S -T srgb.jpg | grep "^IEC 61966-2.1 Default RGB colour space - sRGB$" && exiftool "-icc_profile<=tinyrgb.icc" srgb.jpg

Change 218372 had a related patch set uploaded (by Gilles):
TinyRGB support for JPG uploads

https://gerrit.wikimedia.org/r/218372

Change 218376 had a related patch set uploaded (by Gilles):
Add exiftool to multimedia role

https://gerrit.wikimedia.org/r/218376

Change 218378 had a related patch set uploaded (by Gilles):
Add exiftool to imagescaler role

https://gerrit.wikimedia.org/r/218378

Change 218378 abandoned by Gilles:
Add exiftool to imagescaler role

Reason:
Pushed to the wrong branch

https://gerrit.wikimedia.org/r/218378

Krinkle set Security to None.Jun 18 2015, 12:09 AM
Krinkle moved this task from Inbox to Next-up on the Performance-Team board.

Change 218376 merged by Dzahn:
Add exiftool to multimedia role

https://gerrit.wikimedia.org/r/218376

Gilles moved this task from Next-up to Doing on the Performance-Team board.Jun 18 2015, 7:19 PM

Change 218372 merged by jenkins-bot:
TinyRGB support for JPG thumbnails

https://gerrit.wikimedia.org/r/218372

Change 219867 had a related patch set uploaded (by Gilles):
Enable TinyRGB ICC profile swapping on testwiki

https://gerrit.wikimedia.org/r/219867

Change 219867 merged by jenkins-bot:
Enable TinyRGB ICC profile swapping on testwiki

https://gerrit.wikimedia.org/r/219867

Works fine on testwiki. I'll prepare a config change to deploy it everywhere after the wmf11 train on Thursday.

Change 220485 had a related patch set uploaded (by Gilles):
Enable TinyRGB ICC profile swapping

https://gerrit.wikimedia.org/r/220485

Change 220485 merged by jenkins-bot:
Enable TinyRGB ICC profile swapping

https://gerrit.wikimedia.org/r/220485

Restricted Application added a subscriber: Steinsplitter. · View Herald TranscriptJun 29 2015, 5:49 PM
Gilles added a comment.EditedJul 7 2015, 11:07 AM

Let's look at some results. These figures don't separate JPEG thumbnails from non-JPEG (because we don't record MIME type in that particular table), but that shouldn't matter since JPEGs make up the vast majority of our thumbnails.

Average content length and load time, for thumbnails accessed within 7 days of the file's upload

Before:

SELECT EXP(AVG(LOG(event_contentLength))), EXP(AVG(LOG(event_total))), COUNT(*) FROM MultimediaViewerNetworkPerformance_11030254 WHERE event_uploadTimestamp < 20150622000000 AND event_contentLength IS NOT NULL AND event_uploadTimestamp IS NOT NULL AND timestamp - event_uploadTimestamp < 7000000

115006.3689547289 371.2216409150824 7229

After:

SELECT EXP(AVG(LOG(event_contentLength))), EXP(AVG(LOG(event_total))), COUNT(*) FROM MultimediaViewerNetworkPerformance_12458951 WHERE event_uploadTimestamp > 20150627000000 AND event_contentLength IS NOT NULL AND event_uploadTimestamp IS NOT NULL AND timestamp - event_uploadTimestamp < 7000000

103242.23105216665 421.39020297495443 159

Content length by image size

Before:

SELECT EXP(AVG(LOG(event_contentLength))), event_imageWidth, COUNT(*) AS count FROM MultimediaViewerNetworkPerformance_12458951 WHERE event_uploadTimestamp < 20150622000000 AND event_contentLength IS NOT NULL AND event_uploadTimestamp IS NOT NULL AND event_imageWidth IS NOT NULL GROUP BY event_imageWidth ORDER BY count DESC

135952.24079334945 800 17426
157334.42114943176 1024 9187
210751.4154678539 1280 6799
384943.13923461054 1920 3134
37826.17007091689 640 1294
48865.12948684414 600 1179
25314.553831379624 300 868
47616.25390121723 320 826

After:

SELECT EXP(AVG(LOG(event_contentLength))), event_imageWidth, COUNT(*) AS count FROM MultimediaViewerNetworkPerformance_12458951 WHERE event_uploadTimestamp > 20150627000000 AND event_contentLength IS NOT NULL AND event_uploadTimestamp IS NOT NULL AND event_imageWidth IS NOT NULL GROUP BY event_imageWidth ORDER BY count DESC

158279.43586959047 800 46
270074.54488446255 1280 31
206429.13571592575 1024 27
407779.5110025182 1920 14
161508.00000000023 856 13
24444.85052015036 1425 8
51577.30742274269 300 7
36351.41759350178 660 5
45184.08029781603 640 4

The sample sizes for post-deployment data is way too small to draw any conclusions at this point and probably explains the odd results. To be re-run later once more data has accumulated.

Still inconsistent data at this point (average content length slightly higher, but load time much lower). Much like T102870 it's possible that we weren't recording enough data in the first place, which means that there's too much noise to draw conclusions. I think this is confirmation that T105681 is much needed, recording data about all thumbnails, not just a sample of Media Viewer views, to be able to do proper before/after change analysis.

Gilles moved this task from Doing to Backlog on the Performance-Team board.Jul 21 2015, 8:47 PM
Gilles closed this task as Resolved.Aug 5 2015, 8:19 AM

New data after the change:

156549.85626198104 800 926
149254.42218752523 1024 681
194257.85054816256 1280 521
332446.2166522775 1920 271

Still inconsistent for the 800 size. I'm going to close this task, I think this is a shortcoming of the aggressive sampling the Media Viewer stats is doing. We're now tracking unsampled perf data in graphite, future changes of this nature should be easier to assess the success of.

The image table is unhelpful in trying to determine the ratio of JPGs we have that use the sRGB profile, since the img_metadata field doesn't seem to contain that information. It has a ton of EXIF fields, but not that one. I don't think that building something to measure that is worth the effort.