XMP extraction fails due to way exiftool writes tiff:YCbCrSubSampling tag
Closed, ResolvedPublic

Description

Author: mediazilla

Description:
EXIF data from png files gets poorly extracted.
I cropped a jpg to its relevant content and saved the result as png. I copied the EXIF data from the original file with exiftool -tagsFromFile.
Since I saw that the EXIF data of the png file shown on the description page was quite limited, I also uploaded the file as jpg, where the EXIF is shown in all its beauty.

Please compare the result:
http://commons.wikimedia.org/wiki/File:Museum_in_Frashër,_Albania.png
http://commons.wikimedia.org/wiki/File:Museum_in_Frashër,_Albania_tmp.jpg

I prefer not to save edited photographs as jpg, to avoid the loss of quality that comes with saving as (lossy compressed) jpg.


Version: unspecified
Severity: normal

Details

Reference
bz31944

Event Timeline

bzimport raised the priority of this task from to Medium. Nov 21 2014, 11:54 PM
bzimport set Reference to bz31944.
bzimport added a subscriber: Unknown Object (MLST).

PNG does not support Exif. exiftool will try to convert as much as possible to PNG text comments or to XMP, but most of the information is lost.

As exiftool explains: http://www.sno.phy.queensu.ca/~phil/exiftool/

A special ExifTool option allows copying tags from one file to another. The command-line syntax for doing this is "-tagsFromFile SRCFILE". Any tags specified after this option on the command line are extracted from source file and written to the destination file. If no tags are specified, then all writable tags are copied. This option is very simple, yet very powerful. Depending on the formats of the source and destination files, some of tags read may not be valid in the destination file, in which case they aren't written.
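For reference, the command form described in that quote can be sketched as a small Python wrapper. This is only an illustration: the helper names `tags_from_file_argv` and `copy_tags` are made up here, and it assumes the `exiftool` binary is on the PATH. Only the `-tagsFromFile SRCFILE` option and the "tags after the option" ordering come from the quoted documentation.

```python
import shutil
import subprocess


def tags_from_file_argv(src, dst, tags=()):
    """Build the argument list for: exiftool -tagsFromFile SRC [-TAG ...] DST.

    With no tags given, exiftool copies all writable tags (per the quoted docs).
    """
    argv = ["exiftool", "-tagsFromFile", src]
    # Tags named after -tagsFromFile are the ones extracted from the source file.
    argv += [f"-{tag}" for tag in tags]
    argv.append(dst)
    return argv


def copy_tags(src, dst, tags=()):
    """Run exiftool; hypothetical helper, requires exiftool to be installed."""
    if shutil.which("exiftool") is None:
        raise RuntimeError("exiftool not found on PATH")
    return subprocess.run(tags_from_file_argv(src, dst, tags), check=True)
```

Calling `tags_from_file_argv("original.jpg", "cropped.png")` only builds the command; nothing is written until `copy_tags` is called. Which tags actually survive in the png still depends on what exiftool considers valid for that format.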

It might be that exiftool is writing the whole exif block into some PNG part, but then that is not standard, and might not be recognizable by our software.

bawolff might know. I'll see if he can take a look.

Almost every exif tag has an XMP equivalent (well, almost: things that are more part of the jpg than metadata don't, but I don't think you're worried about that). So it should be possible to put almost all the data into the png.

> It might be that exiftool is writing the whole exif block into some PNG part,
> but then that is not standard, and might not be recognizable by our software.

Yep, it does do that, and you're correct that as of right now we don't support that. (It sticks it in a zTXt chunk with the name "Raw profile type APP1".)
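To see which text chunks a given png carries, something like the following rough sketch should do. It is plain Python and not anything from MediaWiki; the function names are made up for illustration. It walks the PNG chunk list, reports the keyword of every tEXt, zTXt and iTXt chunk, and can inflate a zTXt payload. The exiftool exif block would show up under the keyword mentioned above, while XMP normally lives in an iTXt chunk with the keyword "XML:com.adobe.xmp".

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def iter_chunks(data):
    """Yield (chunk_type, payload) for each chunk in a PNG byte string."""
    if data[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    pos = 8
    while pos + 8 <= len(data):
        length, ctype = struct.unpack(">I4s", data[pos:pos + 8])
        yield ctype, data[pos + 8:pos + 8 + length]
        pos += 12 + length  # 4 length + 4 type + payload + 4 CRC


def text_chunk_keywords(data):
    """Return (chunk type, keyword) pairs for tEXt, zTXt and iTXt chunks."""
    found = []
    for ctype, payload in iter_chunks(data):
        if ctype in (b"tEXt", b"zTXt", b"iTXt"):
            # In all three chunk types the keyword ends at the first NUL byte.
            keyword = payload.split(b"\x00", 1)[0].decode("latin-1")
            found.append((ctype.decode("ascii"), keyword))
    return found


def decode_ztxt(payload):
    """Return (keyword, inflated bytes) for a zTXt chunk payload."""
    keyword, rest = payload.split(b"\x00", 1)
    # rest[0] is the compression method (0 = deflate); the data follows it.
    return keyword.decode("latin-1"), zlib.decompress(rest[1:])
```

The sketch skips CRC checks and does not interpret the inflated profile data; it is only meant to show where the data sits.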


However, in the case of this particular file, it looks like most of the data is in the XMP, so we should be able to extract it. The issue happens with the tiff:YCbCrSubSampling property. The spec says that this should be an ordered array of integers (specifically either [2,1] or [1,1]), but it appears exiftool encodes them as a string of either "2 1" or "1 1". This causes extraction of the rest of the tags to fail.
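To make the mismatch concrete, here is a small Python sketch of a reader that accepts both shapes. It is not our XMPReader (which is PHP), and `read_subsampling` is a made-up name. It assumes the string form is serialized as plain element text, which is a guess about how exiftool writes it; the rdf:Seq form is the ordered-array shape the spec calls for.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
TIFF = "http://ns.adobe.com/tiff/1.0/"


def read_subsampling(xmp_xml):
    """Return tiff:YCbCrSubSampling as a list of ints, or None if absent."""
    root = ET.fromstring(xmp_xml)
    prop = root.find(f".//{{{TIFF}}}YCbCrSubSampling")
    if prop is None:
        return None
    items = prop.findall(f"{{{RDF}}}Seq/{{{RDF}}}li")
    if items:
        # Spec form: an ordered array (rdf:Seq) of integers.
        return [int(li.text) for li in items]
    # Lenient fallback for the plain-string form, e.g. "2 1" (assumed layout).
    return [int(part) for part in (prop.text or "").split()]
```

A strict reader that only expects the rdf:Seq branch would reject the string form, which is the type mismatch described above.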

In r101802 I made XMPReader temporarily stop extracting tiff:YCbCrSubSampling. Really it's not that useful a property, so probably not very much harm. This is pending a fix to our XMPReader so that a type mismatch only stops extraction of the specific property, not the XMP data for the entire file.
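The intended behaviour (skip only the offending property) could look roughly like this. It is a Python sketch of the idea, not the actual XMPReader code; the parser table and the function names are hypothetical.

```python
def parse_int_seq(value):
    """Accept only an ordered array of integers."""
    if not isinstance(value, list):
        raise TypeError("expected an ordered array")
    return [int(v) for v in value]


def parse_text(value):
    """Accept only a simple string value."""
    if not isinstance(value, str):
        raise TypeError("expected a string")
    return value


# Hypothetical table mapping property names to their expected type parser.
PARSERS = {
    "tiff:YCbCrSubSampling": parse_int_seq,
    "tiff:Make": parse_text,
    "tiff:Model": parse_text,
}


def extract(raw_props):
    """Parse each known property on its own; a bad one is skipped, not fatal."""
    results, skipped = {}, []
    for name, value in raw_props.items():
        parser = PARSERS.get(name)
        if parser is None:
            continue  # unknown property, ignore it
        try:
            results[name] = parser(value)
        except (TypeError, ValueError):
            skipped.append(name)  # drop this property only, keep going
    return results, skipped
```

With input where tiff:YCbCrSubSampling arrives as the string "2 1", the other properties still come through and only that one lands in the skipped list.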

Thus marking this fixed, and splitting the other issue into bug 32172.

This integer bug should be fixed in exiftool soon:
http://u88.n24.queensu.ca/exiftool/forum/index.php/topic,3679.new

Whee, that was fast.

P.S. In regards to the original question: some of the data in that file is in an embedded exif chunk in the png. We currently don't support that (bug 32173). However, all this data is possible to embed as XMP, which we do support extracting.

Gilles raised the priority of this task from Medium to Unbreak Now!. Dec 4 2014, 10:27 AM
Gilles added a project: Multimedia.
Gilles moved this task from Untriaged to Done on the Multimedia board.
Gilles lowered the priority of this task from Unbreak Now! to Medium. Dec 4 2014, 11:20 AM