Include at least some EXIF metadata in resized pictures
Open, Public

Description

Author: folengo

Description:
A discussion took place on Wikimedia Commons' Village Pump on 19 May 2009 on how to best respond to the pressure from photographers wanting their names to be credited on article pages (1).

It was suggested that instead of (or in addition to) crediting photographers on article pages, we should make our best efforts to keep the copyright EXIF metadata when available.

It was understood that the reason for excluding EXIF data from resized pictures and thumbnails until now was the concern that cameras sometimes add an overwhelmingly large amount of EXIF metadata.

Our conclusion is that some sort of compromise has to be reached between these two concerns, by including at least some EXIF metadata, if not all of it.

A) Perhaps really small thumbnails like those used in categories (that is, 120px or smaller) might be allowed to remain void of metadata, while larger thumbnails or resized pictures would always include at least the most useful metadata.

B) The most useful metadata, which should be included in most resized pictures and thumbnails, are:
ImageDescription,
Copyright,
DateTimeOriginal,
DateTime,
GPSLatitudeRef,
GPSLatitude,
GPSLongitudeRef,
GPSLongitude,
GPSAltitudeRef,
GPSAltitude,
IPTC:Credit,
IPTC:CopyrightNotice, and
IPTC:Byline.
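
The selection rule in A) and B) can be sketched as follows. This is only an illustration in Python, not MediaWiki code: the function name, the cut-off constant, and the use of a plain dictionary to hold metadata are assumptions made for the example, and the field names are copied as plain strings from the list above.

```python
# Whitelist copied from list B) above; the names are treated as opaque strings.
WHITELIST = frozenset({
    "ImageDescription", "Copyright", "DateTimeOriginal", "DateTime",
    "GPSLatitudeRef", "GPSLatitude", "GPSLongitudeRef", "GPSLongitude",
    "GPSAltitudeRef", "GPSAltitude",
    "IPTC:Credit", "IPTC:CopyrightNotice", "IPTC:Byline",
})

# Assumption from point A): thumbnails of 120px or smaller carry no metadata.
MAX_WIDTH_WITHOUT_METADATA = 120


def metadata_for_thumbnail(source_metadata, thumb_width):
    """Return the subset of source_metadata to copy into a thumbnail.

    source_metadata is a hypothetical dict mapping field names to values.
    """
    if thumb_width <= MAX_WIDTH_WITHOUT_METADATA:
        return {}
    return {name: value for name, value in source_metadata.items()
            if name in WHITELIST}
```

With this rule, a bulky field such as a camera maker note never reaches the thumbnail, while the copyright and location fields do.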

C) The message at the bottom of description pages (is it a [[Mediawiki:]] message? I have not been able to find which one) should be reworded to be less ambiguous. When a user reads "This file contains additional information (...) ", he remains clueless as to whether that means the original file, the resized 800px preview present on the description page, or both.

(1) [[COM:VP/Archive/2009/05#Crediting photographers in article pages]]


Version: unspecified
Severity: enhancement
URL: https://commons.wikimedia.org/wiki/Commons:Village_pump/Archive/2009/05#Crediting_photographers_in_article_pages

bzimport added a subscriber: wikibugs-l.
bzimport set Reference to bz18871.
bzimport created this task. Via Legacy, May 22 2009, 5:51 AM
bzimport added a comment. Via Conduit, May 22 2009, 5:54 AM

folengo wrote:

Erratum : the link to the discussion is [[:Commons:COM:VP#Crediting photographers in article pages]] .

bzimport added a comment. Via Conduit, May 22 2009, 5:57 AM

folengo wrote:

It was also suggested to set UserComment and IPTC:SpecialInstructions to "For more information, see <url of file description page>".

brion added a comment. Via Conduit, May 26 2009, 9:11 PM

See also bug 3361 ("Image author, description, and copyright data saved in EXIF fields") and bug 657 ("Pull copyright metadata from files on upload") for related issues.

brion added a comment. Via Conduit, Jun 22 2009, 10:16 PM

Agreed, whitelisting metadata to pass through to resized output would be handy here...

PHP's exif support doesn't appear to include writing metadata, though, so we might need to add new code to copy the info or find a way to do it via ImageMagick. The same would be needed to implement bug 3361, which would add new metadata to files based on info from the wiki.

bzimport added a comment. Via Conduit, Aug 17 2009, 9:46 AM

folengo wrote:

A similar discussion (the same request, from Flickr users to the Flickr developers) took place at http://www.flickr.com/photos/x180/3196541234/

TheDJ added a comment. Via Conduit, Aug 17 2009, 11:23 AM

See also bug 19791

(In reply to comment #2)

Again (full URL) :
http://commons.wikimedia.org/wiki/COM:VP#Crediting_photographers_in_article_pages

A patch for this is in bug 19791

gpaumier added a comment. Via Conduit, Dec 31 2009, 10:57 PM

*** This bug has been marked as a duplicate of bug 19791 ***

bzimport added a comment. Via Conduit, Apr 9 2010, 8:26 AM

folengo wrote:

This bug has been marked as "RESOLVED FIXED", as a duplicate of bug 19791 .

However, I checked with file

http://upload.wikimedia.org/wikipedia/commons/thumb/f/fd/Glorious_First_of_June%2C_1_June_1794_%28Monument_to_the_Republic%29_2010-03-23_05.jpg/640px-Glorious_First_of_June%2C_1_June_1794_%28Monument_to_the_Republic%29_2010-03-23_05.jpg

which is a jpg file. And I don't see anything in the Exif of this 640px
thumbnail.

Perhaps the patch provided at bug 19791 is wonderful for png files, but I am
afraid it does not solve the problem for jpg files.

May I know which field of the Exif metadata is supposed to be filled ?

Raymond added a comment. Via Conduit, Apr 9 2010, 8:33 AM

(In reply to comment #9)

Perhaps the patch provided at bug 19791 is wonderful for png files, but I am
afraid it does not solve the problem for jpg files.

May I know which field of the Exif metadata is supposed to be filled ?

The patch from bug 19791 added the URL of the file source (http://commons.wikimedia.org/wiki/File:Foo.jpg) to the comment field, not to EXIF. It's a kind of workaround until we can have EXIF in the thumbs.

Hint: Works only for thumbs generated from 2009-04-09, not for already cached thumbs.

Therefore the current bug should remain open.
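
Since many viewers do not show the JPEG comment field, it can help to look for it directly. The following is a minimal sketch, assuming the comment is stored as a standard JPEG COM segment (marker 0xFFFE) before the image scan data; the function name is made up for this example, and it is not a full JPEG parser.

```python
import struct


def jpeg_comments(data: bytes) -> list:
    """Return the text of all COM (0xFFFE) segments found before the scan data."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG stream (missing SOI marker)")
    comments = []
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            break  # malformed stream; stop rather than guess
        marker = data[pos + 1]
        if marker == 0xFF:
            pos += 1  # fill byte before a marker
            continue
        if marker in (0xD9, 0xDA):
            break  # end of image or start of scan: no more header segments
        # The two-byte big-endian length counts itself but not the marker.
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        if marker == 0xFE:
            comments.append(data[pos + 4:pos + 2 + length].decode("latin-1"))
        pos += 2 + length
    return comments
```

Calling `jpeg_comments(open("thumb.jpg", "rb").read())` on a downloaded thumbnail would list any comments it carries; an empty list means the thumbnail has no COM segment.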

bzimport added a comment. Via Conduit, Apr 9 2010, 4:12 PM

folengo wrote:

I uploaded the file, and the corresponding thumbnail at

http://upload.wikimedia.org/wikipedia/commons/thumb/f/fd/Glorious_First_of_June%2C_1_June_1794_%28Monument_to_the_Republic%29_2010-03-23_05.jpg/640px-Glorious_First_of_June%2C_1_June_1794_%28Monument_to_the_Republic%29_2010-03-23_05.jpg

was generated in 2010. However I don't see anything in the "comment" field either.

My picture viewing software is [[IrfanView]], and I checked the "comment" button close to the "IPTC info" button, at the bottom in the "image information" menu.

I also have the FxIF extension installed on Firefox, and that detects nothing.

Raymond added a comment. Via Conduit, Apr 9 2010, 4:21 PM

(In reply to comment #11)

I uploaded the file, and the corresponding thumbnail at

http://upload.wikimedia.org/wikipedia/commons/thumb/f/fd/Glorious_First_of_June%2C_1_June_1794_%28Monument_to_the_Republic%29_2010-03-23_05.jpg/640px-Glorious_First_of_June%2C_1_June_1794_%28Monument_to_the_Republic%29_2010-03-23_05.jpg

was generated in 2010. However I don't see anything in the "comment" field
either.

My picture viewing software is [[IrfanView]], and I checked the "comment"
button close to the "IPTC info" button, at the bottom in the "image
information" menu.

I also have the FxIF extension installed on Firefox, and that detects nothing.

As I wrote: Works only for thumbs generated since 2009-04-09, not for already cached thumbs. Yours was generated 2010-03-25.

bzimport added a comment. Via Conduit, Apr 10 2010, 12:34 AM

folengo wrote:

Sorry, perhaps my English reading capacity is not perfect, but according to Merriam Webster dictionary, http://www.merriam-webster.com/dictionary/since , "since" means "from a definite past time until now" and I would have thought that when you say "since 2009-04-09" you mean between April 2009 and now. Now we are in April 2010 aren't we ? My file was uploaded in March 2010. Isn't March 2010 somewhere between April 2009 and April 2010 ?

Is my file too old or too new ?

Conversely, could you provide another example of existing file whose thumbnails are filled with the new comment ?

Bawolff added a comment. Via Conduit, Apr 10 2010, 7:12 PM

I just tried it on http://upload.wikimedia.org/wikipedia/commons/thumb/4/46/Torbj%C3%B8rn_R%C3%B8e_Isaksen_-_2010-04-10_at_11-08-52_(1).jpg/120px-Torbj%C3%B8rn_R%C3%B8e_Isaksen_-_2010-04-10_at_11-08-52_(1).jpg

Here's the comment field according to exiftool:

Comment: File source: http://commons.wikimedia.org/wiki/File:TorbjJPEG3%B8rn_RJPEG3%B8e_Isaksen_-_2010-04-10_at_11-08-52_(1).jpg

So it does put the URL in the comment. On the downside, it seems to totally mangle the percent-encoded non-ASCII characters, replacing %C with JPEG.

Is %C replaced with the file type somewhere along the line?
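
A possible explanation, offered as an assumption rather than a diagnosis: ImageMagick expands percent escapes in strings such as image comments, and its documented escape %C stands for the image compression type, which would be JPEG here. If the URL is passed to ImageMagick unescaped, every %C3 in it would then turn into JPEG3, which matches the output above. A minimal sketch of the usual remedy, doubling each percent sign before handing the string over (the function name is hypothetical):

```python
def escape_magick_percent(text: str) -> str:
    """Double every '%' so that percent-escape expansion leaves it literal."""
    return text.replace("%", "%%")
```

For example, `escape_magick_percent("Torbj%C3%B8rn")` yields `Torbj%%C3%%B8rn`, which percent-escape expansion should turn back into the original string.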

Raymond added a comment. Via Conduit, Apr 10 2010, 7:20 PM

(In reply to comment #13)

Sorry, perhaps my English reading capacity is not perfect, but according to
Merriam Webster dictionary, http://www.merriam-webster.com/dictionary/since ,
"since" means "from a definite past time until now" and I would have thought
that when you say "since 2009-04-09" you mean between April 2009 and now. Now
we are in April 2010 aren't we ? My file was uploaded in March 2010. Isn't
March 2010 somewhere between April 2009 and April 2010 ?

My apologies. I wrote the wrong year :-( Please read it as "2010-04-09". On this date (yesterday) the software on the WMF servers was updated, and all thumbs generated from this date on have the comment.

TheDJ added a comment. Via Conduit, Apr 11 2010, 12:09 AM

For the ImageMagick issue, I have opened a new bug: bug 23148.

bzimport added a comment. Via Conduit, Apr 11 2010, 2:19 PM

folengo wrote:

In reply to comment #16 : you're welcome.

I think this new comment metadata is a great step forward. It is too bad, though, that the comment field is less widely supported than Exif or IPTC (the Exif viewer add-on for Firefox can't display it: https://addons.mozilla.org/fr/firefox/addon/3905 ).

Bawolff added a comment. Via Conduit, Nov 22 2012, 11:55 PM

C) The message at the bottom of description pages (is it a [[Mediawiki:]]
message ? I have not been able to find which) should be reworded with a less
ambiguous wording. When a user reads "This file contains additional information
(...) ", he remains clueless on whether that means the original file, or the
resized 800px preview present on the description page, or both.

This part of the original bug has been ignored. Do you have an alternative sentence that would be clearer? It's a very easy fix to change the message if one has an alternative to change it to.


The comment feature seems recently to have stopped working. I filed bug 42368 for that.

JeanFred added a comment. Via Conduit, Feb 7 2013, 12:49 PM

Bumping!

I agree with comment 18. The source in the comment is good (though I had never noticed it: neither Firefox nor Nautilus mentioned it, and I had to open the file in GIMP), but it does not replace keeping some EXIF fields (the ones listed in comment 1 seem reasonable).

Lokal_Profil added a comment. Via Conduit, Feb 7 2013, 2:11 PM

I just had a similar request from a museum that is planning a large image donation and has spent significant effort making sure that the metadata is also included in the EXIF tags.

For generating their own thumbnails they use a script which first generates the jpg thumbnail with imagemagick and then calls ExifTool to copy the exif data across with
exiftool -all= -tagsfromfile SOURCE.TIF -all:all -overwrite_original TARGET.JPG

Would something similar be doable here? We might not want to copy all of the fields (or for all sizes of thumbnails) but the ones mentioned above would be good.
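
A variant that copies only selected fields could look like the sketch below. It only builds the argument list and does not run anything. The helper name is hypothetical, and the tag spellings in the usage note are ExifTool's names as best I know them (for instance, ExifTool spells the IPTC byline field By-line), so they should be checked against the ExifTool tag documentation before use.

```python
def build_copy_command(source, target, tags, clear_target=True):
    """Build an exiftool argument list that copies only `tags` from source to target."""
    command = ["exiftool"]
    if clear_target:
        command.append("-all=")  # wipe existing metadata in the target first
    command += ["-tagsfromfile", source]
    command += ["-" + tag for tag in tags]  # one option per tag to copy
    command += ["-overwrite_original", target]
    return command
```

For example, `build_copy_command("SOURCE.TIF", "TARGET.JPG", ["Copyright", "ImageDescription", "IPTC:Credit", "IPTC:By-line"])` gives a list that could be passed to `subprocess.run(command, check=True)` on a machine where exiftool is installed.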

Bawolff added a comment. Via Conduit, Feb 7 2013, 3:45 PM

(In reply to comment #21)

I just had a similar request from a museum who are planning a large image
donation where they had spent a significant effort making sure that the
metadata was also included in the exif tags.

For generating their own thumbnails they use a script which first generates
the jpg thumbnail with imagemagick and then calls ExifTool to copy the exif
data across with
exiftool -all= -tagsfromfile SOURCE.TIF -all:all -overwrite_original TARGET.JPG

Would something similar be doable here? We might not want to copy all of the
fields (or for all sizes of thumbnails) but the ones mentioned above would be
good.

Post-processing with either exiftool or exiv2 is a potential solution to this bug (and probably the best way forward; I don't think we want to write our own metadata writer). It would require making sure such a program is available on the server (which might already be the case).

ImageMagick also has -caption and -title options we can use without resorting to another binary. It should probably be investigated what sort of metadata these options produce in the image (the docs are very vague, but testing that should be easy). -label should also be investigated.

Gilles added a project: Multimedia. Via Web, Nov 24 2014, 3:29 PM
El_Grafo added a subscriber: El_Grafo. Via Web, Jan 17 2015, 12:38 PM
Hasenlaeufer edited the task description. Via Web, Jan 17 2015, 12:57 PM
Hasenlaeufer set Security to None.
Hasenlaeufer added a subscriber: Hasenlaeufer. Edited. Via Web, Jan 17 2015, 1:19 PM

My first action here at Phabricator was to update the URL in the task description to the archive location. Now I read "Hasenlaeufer set Security to None." This wasn't my intention! What did I do wrong?

@Hasenlaeufer Nothing wrong. That change is made automatically for tasks that were imported from Bugzilla, if nobody else has edited them since the migration. I guess the import leaves that field unset, and the next edit fills it in with its default value.

Ciencia_Al_Poder removed a subscriber: Ciencia_Al_Poder. Via Web, Jan 17 2015, 1:36 PM
Raymond added a subscriber: Raymond. Via Web, Jan 17 2015, 1:46 PM
El_Grafo added a comment. Via Web, Jan 19 2015, 9:58 AM

PHP's exif support doesn't appear to include writing metadata, though, so we might need to add new code to copy the info or find a way to do it via ImageMagick.

Just yesterday I used ImageMagick to resize a bunch of files. Having read this thread before, I was quite surprised that all the metadata was still present. I don't really understand what the problem is here. FTR, I used something like

$ convert file.jpg -resize 800x600 thumb.jpg

Bawolff added a comment. Via Web, Jan 19 2015, 3:46 PM

We use an ImageMagick option to strip all metadata (except colour profiles). In some images the metadata can be larger than the thumb itself.

Bawolff added a comment. Via Web, Jan 19 2015, 3:49 PM

Addendum:

The issue specifically is that ImageMagick doesn't offer a lot of control over which fields are kept, although perhaps it offers enough to fix the meat of this bug. (The true issue, as with most dev things, is that somebody just has to take the time to be bold and fix the bug.)

Hasenlaeufer added a comment. Via Web, Jan 19 2015, 11:09 PM

We use an ImageMagick option to strip all metadata (except colour profiles). In some images the metadata can be larger than the thumb itself.

Don't remove all metadata. Removing copyright-related information is a criminal act in some countries.

Ragesoss added a subscriber: Ragesoss. Via Web, Tue, Apr 7, 7:41 PM
