getimagesize on PHP before 5.3.3 doesn't work for images that have more than 10 padding bytes between segments
Closed, Resolved
Public

Description

Author: codrinb

Description:
Thumbnails are not generated for certain images. The image dimensions are reported as 0x0, and sometimes the EXIF data is not extracted.

See the details at:
http://commons.wikimedia.org/wiki/Commons:Help_desk#Thumbnail.2FEXIF_issues_with_uploaded_images

I suspect a bug, and I think someone should check the logs of the thumbnail generation scripts.


Version: unspecified
Severity: major
URL: http://commons.wikimedia.org/wiki/Commons:Help_desk#Thumbnail.2FEXIF_issues_with_uploaded_images

bzimport added a subscriber: Unknown Object (MLST).
bzimport set Reference to bz31588.
bzimport created this task. Via Legacy, Oct 10 2011, 4:18 PM
brion added a comment. Via Conduit, Oct 10 2011, 6:11 PM

[[File:Costesti Cetatuie Dacian Fortress 2011 - Tower House Two.jpg]] seems to still show up 0x0 after some action=purge'ing ... a local upload of the original file to trunk or 1.18wmf1 seems to pick up file info just fine.

Could be a live issue, such as something trying to do the metadata fetch or purge that doesn't actually have NFS access...

bzimport added a comment. Via Conduit, Oct 10 2011, 10:49 PM

codrinb wrote:

I noticed that the EXIF information for the failed files is incomplete (Caption, Keywords and other fields are missing) compared with the successful ones.
For example, compare http://commons.wikimedia.org/wiki/File:Costesti_Cetatuie_Dacian_Fortress_2011_-_Tower_House_One_Close_Up-3.jpg (successful) with http://commons.wikimedia.org/wiki/File:Costesti_Cetatuie_Dacian_Fortress_2011_-_Stairs_and_Drain.jpg (failed). But this might be another bug altogether.

bzimport added a comment. Via Conduit, Oct 10 2011, 10:50 PM

codrinb wrote:

I noticed that the EXIF information for the failed files is incomplete
(Caption, Keywords and other fields are missing) compared with the
successful ones.
For example, compare
http://commons.wikimedia.org/wiki/File:Costesti_Cetatuie_Dacian_Fortress_2011_-_Tower_House_One_Close_Up-3.jpg
(successful) with http://commons.wikimedia.org/wiki/File:Costesti_Cetatuie_Dacian_Fortress_2011_-_Stairs_and_Drain.jpg (failed).
But this might be another bug altogether.
[[commons:File:Costesti_Cetatuie_Dacian_Fortress_2011_-_Stairs_and_Drain.jpg]]

brion added a comment. Via Conduit, Oct 10 2011, 10:53 PM

Hmm, worth double-checking the version of PHP that's running; might have a less stable older version of the exif module or something.

bzimport added a comment. Via Conduit, Oct 10 2011, 11:45 PM

codrinb wrote:

Also, if you follow the original link of the posting, http://commons.wikimedia.org/wiki/Commons:Help_desk#Thumbnail.2FEXIF_issues_with_uploaded_images, people found a workaround: rotating the image back and forth. But this can't be a solution, as there are many files in this state.

bzimport added a comment. Via Conduit, Oct 11 2011, 12:10 AM

codrinb wrote:

This file with a missing thumbnail doesn't even have the EXIF extracted: http://commons.wikimedia.org/wiki/File:Orastie_Ethnography_Museum_2011_-_Dacian_Mandrel_and_Spiral_Bracelet.JPG, while others have partially extracted EXIF data.

Bawolff added a comment. Via Conduit, Oct 11 2011, 1:50 PM

(In reply to comment #6)
> This file with a missing thumbnail doesn't even have the EXIF extracted:
> http://commons.wikimedia.org/wiki/File:Orastie_Ethnography_Museum_2011_-_Dacian_Mandrel_and_Spiral_Bracelet.JPG,
> while others have a partially extracted EXIF data

That image doesn't have EXIF data because MediaWiki's JPEG metadata support doesn't skip padding bytes properly. I'll commit a fix for that some time later.

This is probably unrelated to the file not thumbnailing properly.

Bawolff added a comment. Via Conduit, Oct 11 2011, 2:06 PM

(In reply to comment #7)
> (In reply to comment #6)
> > This file with a missing thumbnail doesn't even have the EXIF extracted:
> > http://commons.wikimedia.org/wiki/File:Orastie_Ethnography_Museum_2011_-_Dacian_Mandrel_and_Spiral_Bracelet.JPG,
> > while others have a partially extracted EXIF data
>
> That image doesn't have exif data because mediawiki's jpeg metadata support
> doesn't skip padding bytes properly, I'll commit a fix for that some time
> later.

r99477 (I still highly doubt this has much to do with the bug being reported here though)

Bawolff added a comment. Via Conduit, Oct 11 2011, 2:14 PM

> r99477 (I still highly doubt this has much to do with the bug being reported
> here though)

Actually, several of the example images given here have strings of 0xFF padding bytes after the XMP data. Perhaps there is some bug in PHP's getimagesize with files that do that.
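
To illustrate the suspected failure mode, here is a minimal Python sketch of a JPEG dimension reader. The JPEG format allows any number of 0xFF fill bytes before a marker, so a reader has to skip them. The `max_fill` parameter imitates the 10-byte limit named in this task's title; it is an assumption for illustration, not PHP's actual source. `jpeg_size` and `make_sample` are hypothetical names, and the sample bytes are a synthetic header, not one of the affected uploads.

```python
import struct

# SOFn markers hold the frame size; 0xC4 (DHT), 0xC8 (JPG) and 0xCC (DAC) are not SOF.
SOF_MARKERS = set(range(0xC0, 0xD0)) - {0xC4, 0xC8, 0xCC}


def jpeg_size(data, max_fill=None):
    """Return (width, height), or None if no SOF segment can be reached.

    max_fill=None accepts any number of 0xFF fill bytes before a marker.
    max_fill=10 imitates the limit described in this task (illustrative only).
    """
    if data[:2] != b"\xff\xd8":               # every JPEG starts with SOI
        return None
    pos = 2
    while pos < len(data):
        if data[pos] != 0xFF:                 # not at a marker: lost sync
            return None
        start = pos
        while pos < len(data) and data[pos] == 0xFF:
            pos += 1                          # marker prefix plus fill bytes
        fill = pos - start - 1                # extra 0xFF bytes beyond the prefix
        if max_fill is not None and fill > max_fill:
            return None                       # the strict reader gives up here
        if pos + 3 > len(data):
            return None
        marker = data[pos]
        (length,) = struct.unpack(">H", data[pos + 1:pos + 3])
        if marker in SOF_MARKERS:
            if pos + 8 > len(data):
                return None
            # SOF payload: precision (1 byte), height (2), width (2), ...
            height, width = struct.unpack(">HH", data[pos + 4:pos + 8])
            return width, height
        if marker == 0xDA:                    # SOS reached without seeing SOF
            return None
        pos += 1 + length                     # length counts its own two bytes
    return None


def make_sample(fill):
    """Synthetic input: SOI, a small APP1 segment, `fill` padding bytes, SOF0 for 640x480."""
    app1 = b"\xff\xe1" + struct.pack(">H", 6) + b"demo"
    sof0 = b"\xff\xc0" + struct.pack(">HBHHB", 11, 8, 480, 640, 1) + b"\x01\x11\x00"
    return b"\xff\xd8" + app1 + b"\xff" * fill + sof0


print(jpeg_size(make_sample(12)))               # tolerant reader
print(jpeg_size(make_sample(12), max_fill=10))  # reader with the 10-byte limit
```

With 12 padding bytes before the SOF0 segment, the tolerant call prints `(640, 480)` and the limited call prints `None`, which would match the 0x0 dimensions seen on the affected files. The sketch does not handle standalone markers that appear before the frame header.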

bzimport added a comment. Via Conduit, Oct 11 2011, 4:46 PM

codrinb wrote:

Sounds like you are getting closed to the issue. All I can add is that the images were prepared (downloaded from camera, some rotated, keywords/location/caption added to EXIF) with Picasa 3.8 in Windows 7.

Also, per the initial thread, it seems like if you remove the EXIF Orientation tag from the horizontal pictures, using a tool like exiftool, the Thumbnail and full EXIF can be generated after the upload of the "corrected" image. This doesn't apply to vertical ones, in that they need to also be rotated.

So, in my mind, whatever PHP script tried to read the EXIF metadata upon upload and/or possibly tried to rotate the images, failed.

bzimport added a comment. Via Conduit, Oct 11 2011, 4:47 PM

codrinb wrote:

Meant to say "closer to". Don't know how to re-edit a posted comment...

Bawolff added a comment. Via Conduit, Oct 24 2011, 2:59 AM

Sorry I took so long to look at this further.

This is an upstream bug in PHP ( https://bugs.php.net/bug.php?id=33210 ). It should be fixed in PHP 5.3.3 and later (Special:Version says we're currently at 5.3.2).

So essentially we need to upgrade PHP, or I suppose manually apply the relevant fix. However, the number of images affected is very small (since, you know, most people want their lossy-compression formats to give a small file size and not fill it with random padding bytes), so this is probably a very low-priority reason to upgrade.

btw, in the meantime, re-saving the images with another program may "fix" the images.
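
As an alternative to re-saving through an image editor, the padding could in principle be removed without touching the pixel data. The Python sketch below assumes the only problem is runs of 0xFF fill bytes between header segments; `strip_fill_bytes` is a hypothetical helper, not an existing tool. It copies each segment up to and including the SOS header with a single 0xFF prefix and leaves everything after that byte-for-byte unchanged.

```python
import struct

# Markers that stand alone, without a length field: TEM, RST0-RST7, SOI.
STANDALONE = {0x01, 0xD8} | set(range(0xD0, 0xD8))


def strip_fill_bytes(data):
    """Return a copy of a JPEG byte string without fill bytes between header segments."""
    out = bytearray()
    pos = 0
    while pos < len(data) and data[pos] == 0xFF:
        start = pos
        while pos < len(data) and data[pos] == 0xFF:
            pos += 1                      # skip the prefix and any fill bytes
        if pos >= len(data) or data[pos] == 0x00:
            pos = start                   # no real marker here: copy the rest as-is
            break
        marker = data[pos]
        pos += 1
        out += bytes([0xFF, marker])      # exactly one 0xFF before the marker code
        if marker in STANDALONE:
            continue
        if pos + 2 > len(data):
            break                         # truncated length field: copy what is left
        (length,) = struct.unpack(">H", data[pos:pos + 2])
        out += data[pos:pos + length]     # length field plus payload
        pos += length
        if marker == 0xDA:                # entropy-coded data follows SOS
            break
    out += data[pos:]                     # remainder is copied verbatim
    return bytes(out)


# Synthetic example: 12 fill bytes between an APP1 segment and SOF0.
head = b"\xff\xd8" + b"\xff\xe1\x00\x06demo"
tail = (b"\xff\xc0\x00\x0b\x08\x01\xe0\x02\x80\x01\x01\x11\x00"
        b"\xff\xda\x00\x08\x01\x01\x00\x00\x3f\x00"
        b"\x12\xff\x00\x34\xff\xd9")
padded = head + b"\xff" * 12 + tail
print(strip_fill_bytes(padded) == head + tail)
```

For real files the function would be applied to the bytes read from disk and the result written to a new file, keeping the original. Formats that store absolute file offsets inside their metadata may not survive this, so the output should be checked before re-uploading.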

bzimport added a comment. Via Conduit, Dec 25 2011, 5:12 PM

codrinb wrote:

If you look at this category on Commons, it looks pretty bad, and it is impractical to re-upload each image. At least I don't know of any tool to automate this:

http://commons.wikimedia.org/wiki/Category:Costesti_Cetatuie_Dacian_Fortress

Any update on the ETA for this?

MarkAHershberger added a comment. Via Conduit, Dec 27 2011, 1:33 AM

Bumping the priority since bawolff interrupted my vacation (but, to be honest, I was on IRC)... hopefully someone sees this before I get back.

RobLa-WMF added a comment. Via Conduit, Jan 23 2012, 11:03 PM

Moving priority back down. Mark H, could you make sure an RT ticket is filed for this one?

MarkAHershberger added a comment. Via Conduit, Jan 24 2012, 2:51 AM

(In reply to comment #15)
> Moving priority back down. Mark H, could you make sure an RT ticket is filed
> for this one?

https://rt.wikimedia.org/Ticket/Display.html?id=2330

Aklapper added a comment. Via Conduit, Mar 15 2013, 12:00 PM

For the WMF deployment part, "all application servers are now running 5.3.10-1ubuntu3.4+wmf1" hence Wikimedia servers are not affected anymore.

(In reply to comment #12)
> This is an upstream bug in php ( https://bugs.php.net/bug.php?id=33210 ) It
> should be fixed in PHP 5.3.3 and later

http://www.mediawiki.org/wiki/Download says
"MediaWiki requires PHP 5.3.2+".

So a requirements bump of MediaWiki would fix the problem?

Removing "ops" keyword as there's nothing left to do for ops here.

Bawolff added a comment. Via Conduit, May 22 2014, 7:39 PM

I'm going to call this fixed.

* It no longer affects WMF servers
* It's an upstream issue, and upstream has fixed the issue

It's an obscure issue involving an odd feature of a file format, so I don't think we should bump our version requirements just for this issue. However, I don't see much point in keeping this bug open. I suggest that if anyone else encounters this issue, we just tell them to upgrade.

Gilles raised the priority of this task from "Normal" to "Unbreak Now!". Via Web, Dec 4 2014, 10:20 AM
Gilles moved this task to Done on the Multimedia workboard.
Gilles added a project: Multimedia.
Gilles lowered the priority of this task from "Unbreak Now!" to "Normal". Via Conduit, Dec 4 2014, 11:21 AM
