This will reduce the size of these images by 2.5kb, which for small thumbnails can be a very significant percentage.
|operations/mediawiki-config : master||Enable TinyRGB ICC profile swapping|
|operations/mediawiki-config : master||Enable TinyRGB ICC profile swapping on testwiki|
|mediawiki/core : master||TinyRGB support for JPG thumbnails|
|operations/puppet : production||Add exiftool to multimedia role|
|operations/puppet : production||Add exiftool to imagescaler role|
|mediawiki/vagrant : master||Add exiftool to multimedia role|
|Duplicate||None||T98987 Epic: Make mobile web more performant|
|Open||None||T125920 [EPIC] Future exciting reading web performance endeavours|
|Open||None||T100546 Reduce amount of metadata embedded in thumbnails across site|
|Resolved||Gilles||T100976 Thumbnails should embed tinyRGB instead of sRGB when they include that ICC profile|
- Mentioned In
- T113136: Imagemagick should use "-strip"
rOMWCbb7387343cb1: Enable TinyRGB ICC profile swapping
rOMWC882762ac1313: Enable TinyRGB ICC profile swapping on testwiki
rMW584a23931854: TinyRGB support for JPG thumbnails
rOPUPe6c56c9a4219: Add exiftool to multimedia role
rMWVA7350e2412593: Add exiftool to multimedia role
- Mentioned Here
- T102870: Reduce amount of thumbnail buckets
T105681: Create media performance dashboard
T20871: Include at least some EXIF metadata in resized pictures
I experimented a little bit and sadly it seems like Facebook's ICC profile does cause some color shift. On the first image I tried, I noticed a slight variation of a purple hue between the sRGB images and the tinyRGB one. This difference was only experienced using desktop software (not browsers) and on an uncalibrated monitor. That would be the default desktop experience for the majority of users, though.
Here's the test image:
You will in all likelihood not see a difference between those images inside a browser.
I think that this is a no-go for now based on the obvious color shift I found on the first test image I picked. While browsers wouldn't be affected, people do download our "thumbnails" for reuse in desktop software, particularly from Media Viewer where we encourage users to download sizes other than the original. The hue differences are probably tolerable for Facebook because the images displayed are only to be viewed in the browser and the UX encourages people to download the original.
Of course in theory we could serve thumbnails "meant for display" using tinyRGB and thumbnails "meant for download" with sRGB, but the storage/caching costs would be huge if we store a copy of both for every size. Unless at some point we decide to treat thumbnails differently based on their resolution... but with the market being currently so divided in terms of device screen density, I find it really hard to find a good limit on the server-side for sizes that we would consider to be thumbnails and sizes we would consider to be non-thumbnails.
Just for reference, the info about the profile at facebook is at https://www.facebook.com/note.php?note_id=10150630639853920 (Just dropping here, as it was hard for me to find when I was looking for it)
They claim that "We checked our results with the official standard for measuring color accuracy (DeltaE CIE94) and it showed our maximum error was about half of what is noticeable by human vision.", which is interesting given gilles comment above. I guess that means you can't tell the difference between a colour managed sRGB and a colour managed tinyRGB, but if you strip out both profiles, you might be able to tell the difference on the un-colour managed version.
I wonder what the results would be if we replaced sRGB with tinyRGB profile, but didn't adjust any of the actual colours in the source image for the new profile.
Out of curiosity, can you notice a difference between the following images (The full size versions, not the phab preview)?
I'm also curious which program you tested in. Web browsers are colour managed, so I assumed that if you were seeing something different on the desktop, the desktop was ignoring colour profiles. However, if you remove the colour profile from both your test images, you get identical (same sha1 hash) image. So if your desktop software is just ignoring profiles, then both images should look identical. So I'm curious what your desktop software is actually doing.
I swapped the new profile in without converting the rgb values, using exiftool. The software where I saw the color differences was Preview.app on OS X desktop, on an uncalibrated monitor. Trying on Photoshop now on that same monitor, I don't see a difference. It could be a bug in Preview.app. Preview.app is definitely interpreting the color profiles, since that's the only difference between the two images, but it could be calculating some colors slightly incorrectly for tinyRGB as opposed to the browsers and Photoshop.
For reference this is how one assigns a new embedded ICC profile using exiftool without touching the pixels:
exiftool "-icc_profile<=tinyrgb.icc" foo.jpg
Where tinyrgb.icc is the file linked in this task's description.
Ah, opening the profile in ColorSync, I do see that the "FB" reference was indeed a copyright:
But sRGB, which we serve by the millions in thumbnails, has one just like it:
So... do we care? We could reach out to Facebook to figure out what the implications of using it would be. I don't think that should stop us from implementing the code, we can always try to use a similar approach Facebook did to create our own open equivalent of tinyRGB should the profile's copyright be problematic.
For reference, the image I made for the comment at Tue, Jun 2, 1:15 PM (I miss bugzilla comment numbers) was made with actually changing the pixels via jpegicc (part of lcms-utils) . I'm not sure if that makes a difference. My reading of the facebook post is that you should be able to just do the exiftool version, but how these colour profile things actually work is a bit over my head.
If/when we actually decide to do this, I think we should just ask facebook. Its not like this colour profile thing is core to their business, hopefully they would be fine with just open sourcing it.
We've contacted Facebook and they've started the discussion on their end between their engineers, legal department and open source folks about making the ICC profile open.
So far I can't find a way to have IM swap ICC profiles in a non-destructive manner. It will always actually apply the profile as a change, like with this command:
convert original.jpg +profile sRGB -profile tinyrgb.icc tinyrgb.jpg
I've verified that it is a destructive operation by trying to convert the tinyrgb image back to srgb after the above command and it indeed doesn't give me back the original data.
It doesn't seem like IM provides any other option. I think that means we'll have to rely on exiftool or similar to achieve this in the least destructive manner possible, by just replacing one profile with the other. Which can easily undo.
Yes it does, because we're basically losing signal, and as a result reducing the quality of the image. The profiles are compatible, we shouldn't let IM's capabilities or lack thereof drive what our code does.
I would imagine that most of the difference is anytime image magick does anything it uncompresses image, and then resaves as jpg (which during shrinking would happen anyways). Although there's also a little bit of clipping when converting between colour profiles (I did a little test and found a few examples of clipping. For example [in decimal], (49,34, 15) -> (49.00,33.90,14.44) = ( 49, 34, 14). but going the other way ( 49, 34, 14) -> (48.98,34.06 ,14.14) = ( 49, 34, 14). So the transformation is a little lossy. Anyways, exiftool sounds good to me. Once its on the server, it would also make T20871 much easier when we decide what to do about that bug.
Shell command using exiftool to swap srgb for tinyrgb only if the image's current profile is sRGB:
exiftool -DeviceModelDesc -S -T srgb.jpg | grep "^IEC 61966-2.1 Default RGB colour space - sRGB$" && exiftool "-icc_profile<=tinyrgb.icc" srgb.jpg
Let's look at some results. These figures don't separate JPEG thumbnails from non-JPEG (because we don't record MIME type in that particular table), but that shouldn't matter since JPEGs make up the vast majority of our thumbnails.
Average content length and load time, for thumbnails accessed within 7 days of the file's upload
SELECT EXP(AVG(LOG(event_contentLength))), EXP(AVG(LOG(event_total))), COUNT(*) FROM MultimediaViewerNetworkPerformance_11030254 WHERE event_uploadTimestamp < 20150622000000 AND event_contentLength IS NOT NULL AND event_uploadTimestamp IS NOT NULL AND timestamp - event_uploadTimestamp < 7000000
115006.3689547289 371.2216409150824 7229
SELECT EXP(AVG(LOG(event_contentLength))), EXP(AVG(LOG(event_total))), COUNT(*) FROM MultimediaViewerNetworkPerformance_12458951 WHERE event_uploadTimestamp > 20150627000000 AND event_contentLength IS NOT NULL AND event_uploadTimestamp IS NOT NULL AND timestamp - event_uploadTimestamp < 7000000
103242.23105216665 421.39020297495443 159
Content length by image size
SELECT EXP(AVG(LOG(event_contentLength))), event_imageWidth, COUNT(*) AS count FROM MultimediaViewerNetworkPerformance_12458951 WHERE event_uploadTimestamp < 20150622000000 AND event_contentLength IS NOT NULL AND event_uploadTimestamp IS NOT NULL AND event_imageWidth IS NOT NULL GROUP BY event_imageWidth ORDER BY count DESC
135952.24079334945 800 17426
157334.42114943176 1024 9187
210751.4154678539 1280 6799
384943.13923461054 1920 3134
37826.17007091689 640 1294
48865.12948684414 600 1179
25314.553831379624 300 868
47616.25390121723 320 826
SELECT EXP(AVG(LOG(event_contentLength))), event_imageWidth, COUNT(*) AS count FROM MultimediaViewerNetworkPerformance_12458951 WHERE event_uploadTimestamp > 20150627000000 AND event_contentLength IS NOT NULL AND event_uploadTimestamp IS NOT NULL AND event_imageWidth IS NOT NULL GROUP BY event_imageWidth ORDER BY count DESC
158279.43586959047 800 46
270074.54488446255 1280 31
206429.13571592575 1024 27
407779.5110025182 1920 14
161508.00000000023 856 13
24444.85052015036 1425 8
51577.30742274269 300 7
36351.41759350178 660 5
45184.08029781603 640 4
The sample sizes for post-deployment data is way too small to draw any conclusions at this point and probably explains the odd results. To be re-run later once more data has accumulated.
Still inconsistent data at this point (average content length slightly higher, but load time much lower). Much like T102870 it's possible that we weren't recording enough data in the first place, which means that there's too much noise to draw conclusions. I think this is confirmation that T105681 is much needed, recording data about all thumbnails, not just a sample of Media Viewer views, to be able to do proper before/after change analysis.
New data after the change:
156549.85626198104 800 926
149254.42218752523 1024 681
194257.85054816256 1280 521
332446.2166522775 1920 271
Still inconsistent for the 800 size. I'm going to close this task, I think this is a shortcoming of the aggressive sampling the Media Viewer stats is doing. We're now tracking unsampled perf data in graphite, future changes of this nature should be easier to assess the success of.
The image table is unhelpful in trying to determine the ratio of JPGs we have that use the sRGB profile, since the img_metadata field doesn't seem to contain that information. It has a ton of EXIF fields, but not that one. I don't think that building something to measure that is worth the effort.