Exif values retrieved incorrectly if they appear before IFD
Closed, ResolvedPublic

Event Timeline

McZusatz created this task. Apr 26 2015, 4:47 PM
McZusatz raised the priority of this task from to Needs Triage.
McZusatz updated the task description.
McZusatz added a subscriber: McZusatz.
Restricted Application added a subscriber: Aklapper. Apr 26 2015, 4:47 PM

For https://commons.wikimedia.org/wiki/File:Sarcophagus_of_Louise_of_Great_Brittain,_Roskilde_Cathedral,_Denmark,_2015-03-31-4813.jpg (and presumably the others although I haven't investigated them), it appears to be a bug in the php exif library. But we should probably narrow it down and get a minimal test case before reporting upstream.

I spent some time debugging this.

I believe the issue occurs when the pointers in the Exif IFD point to locations that come earlier in the file than the IFD itself. When that happens, PHP's exif library seeks directly to that pointer instead of seeking relative to the start of the APP1 segment in the JPEG file. Thus everything is off by the difference between the two starting points (in Sarcophagus_of_Louise_of_Great_Brittain,_Roskilde_Cathedral,_Denmark,_2015-03-31-4813.jpg that's about 12 bytes), so tags end up paired with the wrong values.

See exif_process_IFD_TAG in ext/exif/exif.c (in zend php)
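
To make the offset mix-up concrete, here is a minimal Python sketch. It is not the PHP code and does not claim to reproduce its exact behaviour. It builds a tiny synthetic JPEG prefix with a little-endian TIFF block whose single Software value is stored before the IFD, then reads the value once relative to the TIFF header and once as an absolute file position. The sample bytes, the function names (`build_sample`, `read_first_value`) and the choice of APP1 as the first segment are all assumptions made for illustration. With that layout the TIFF header starts 12 bytes into the file (SOI 2 + APP1 marker 2 + length 2 + "Exif\0\0" 6).

```python
import struct

def build_sample():
    """Build a synthetic JPEG prefix + Exif block (illustration only)."""
    value = b"Lightroom\x00"            # 10-byte ASCII value stored BEFORE the IFD
    ifd_offset = 8 + len(value)         # IFD comes right after the value data
    # TIFF header: byte order "II", magic 42, offset of IFD0 (relative to TIFF start)
    header = b"II" + struct.pack("<HI", 42, ifd_offset)
    # One IFD entry: tag 0x0131 (Software), type 2 (ASCII), count, value offset 8
    entry = struct.pack("<HHII", 0x0131, 2, len(value), 8)
    ifd = struct.pack("<H", 1) + entry + struct.pack("<I", 0)
    tiff = header + value + ifd
    payload = b"Exif\x00\x00" + tiff
    # SOI marker, APP1 marker, big-endian segment length (includes its own 2 bytes)
    prefix = b"\xff\xd8\xff\xe1" + struct.pack(">H", len(payload) + 2)
    return prefix + payload, len(prefix) + 6   # file bytes, TIFF header position

def read_first_value(data, tiff_start, value_base):
    """Read the first IFD entry's value, resolving its offset against value_base."""
    (ifd_offset,) = struct.unpack_from("<I", data, tiff_start + 4)
    entry_pos = tiff_start + ifd_offset + 2    # skip the 2-byte entry count
    _tag, _type, count, offset = struct.unpack_from("<HHII", data, entry_pos)
    return data[value_base + offset : value_base + offset + count]

data, TIFF_START = build_sample()
correct = read_first_value(data, TIFF_START, TIFF_START)  # offset relative to TIFF header
wrong = read_first_value(data, TIFF_START, 0)             # offset treated as absolute
print(TIFF_START, correct, wrong)
```

In this sketch the second read lands 12 bytes too early and returns part of the "Exif" identifier and the TIFF header instead of the string, which is the kind of mismatch described above.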

Bawolff renamed this task from "Wrong metadata is displayed." to "Exif values retrieved incorrectly if they appear before IFD". Apr 27 2015, 8:21 AM
Bawolff added a project: Upstream.
Bawolff set Security to None.
Rillke added a subscriber: Rillke. Apr 27 2015, 11:25 AM
Ireas added a subscriber: Ireas. Apr 29 2015, 2:20 PM
Lupo added a subscriber: Lupo. May 5 2015, 2:57 PM

And what's the upstream bug report?

I think every "upstream" issue should have an upstream bug report, and mention it here, too.

Perhaps https://bugs.php.net/bug.php?id=50845 opened in 2010?

I think every "upstream" issue should have an upstream bug report, and mention it here, too.

+1

Perhaps https://bugs.php.net/bug.php?id=50845 opened in 2010?

Why did it take 5 years for the bug to reach us? Was our PHP updated to an affected version in the last 30 to 60 days?

TheDJ added a subscriber: TheDJ. May 13 2015, 2:44 PM

Why did it take 5 years for the bug to reach us? Was our PHP updated to an affected version in the last 30 to 60 days?

@McZusatz, another option is that files with this format have become more common, because some device or application that produces these files has become more popular.

If the images are exported from Lightroom 6, the Exif data cannot be read correctly in PHP. This is a Lightroom / PHP problem. See: https://forums.adobe.com/thread/1835290 and http://feedback.photoshop.com/photoshop_family/topics/lightroom-cc-jpeg-format-tripping-up-other-programs

Restricted Application added a subscriber: Matanya. Jul 13 2015, 7:00 PM

New here, so sorry in advance for not knowing the circuitry :-)

Any idea when this bug will be fixed and deployed on Commons?

Could the priority be escalated?

As more and more people upload photos generated with the extremely popular Lightroom 6.x, the problem is escalating and new users are being affected.
https://commons.wikimedia.org/wiki/Commons:Featured_picture_candidates/File:141227_Berliner_Dom.jpg
I have tried to summarize some methods I have found for mitigating the issue here:
https://commons.wikimedia.org/wiki/User:Slaunger/Mitigating_Mediawiki_Metadata_Viewer_Bug

A question:

When the bug is fixed, will the metadata shown on the file pages of the files that currently display corrupted data be fixed automatically?

Aklapper triaged this task as Normal priority. Jul 16 2015, 2:05 PM

Is anybody from Multimedia investigating this (or does this actually need more investigation on the Wikimedia side, like a minimal test case)?
However, if this is really https://bugs.php.net/bug.php?id=50845 then an upstream patch is required, and there is not much for Wikimedia itself to do...

Colin added a subscriber: Colin. Aug 4 2015, 8:49 AM

I'd like to add my voice to those requesting this fix ASAP. We've had yet another post to the Village Pump from users wondering why their EXIF data is displayed incorrectly. This wastes people's time investigating if there is a problem with their own image software.

I'd also like to know, if the problem is fixed, whether existing images would have their EXIF reported correctly. Or do you have to refresh the data held by MediaWiki? If the latter, then some search on images created by Lightroom 6 (and possibly Photoshop CC 2015) would target those in need of refresh.

Restricted Application added a subscriber: Steinsplitter. Aug 4 2015, 8:49 AM

I'd like to add my voice to those requesting this fix ASAP.

Well, as this seems to be a problem in PHP, anybody can and should provide a patch to PHP. https://bugs.php.net/bug.php?id=50660 and https://bugs.php.net/bug.php?id=50845 have been mentioned in this task and that's where this problem can and should get solved.

Adding @Smalyshev for PHP issues...

Jdforrester-WMF moved this task from Untriaged to Backlog on the Multimedia board. Sep 4 2015, 6:23 PM
Restricted Application added a project: Commons. Jun 14 2016, 12:04 PM

@Smalyshev merged the PHP patch (http://git.php.net/?p=php-src.git;a=commit;h=1ab5a1b432a4b4c62171864bd1b545616e1b07db). This fixes the most common problem mentioned here, bogus values being read when the data is before the IFD structure (https://bugs.php.net/bug.php?id=50845). It will be a part of PHP 5.6.24.

I have no idea what it would take to have it deployed in Wikimedia production today, especially since we actually use HHVM. Porting it should be straightforward; @Smalyshev pointed to https://github.com/facebook/hhvm/blob/ea6ff01f6c31f1615a935ef96622d623a6277d37/hphp/runtime/ext/gd/ext_gd.cpp#L6584.

zhuyifei1999 moved this task from Incoming to Backlog on the Commons board. Jun 19 2016, 4:38 AM

HHVM patch is also merged (https://reviews.facebook.net/rHHVM255373a80a9b9c8b1b452f902e394cd9773729cd). Now, I wonder what I need to do to get it to our servers.

matmarex closed this task as Resolved. Jul 17 2016, 10:01 PM

This was never a MediaWiki bug, but rather an upstream issue with PHP and HHVM (https://bugs.php.net/bug.php?id=50845). I fixed it with the following patches:

The patches should be included in the following releases:

  • PHP 5.6.24, 7.0.9 and 7.1.0
  • HHVM 3.15.0

If you experience this problem on a non-Wikimedia wiki, its version of PHP or HHVM must be upgraded to one of the above (or later).
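
For anyone checking their own wiki, here is a rough Python sketch of that version comparison. The helper name `has_exif_fix` is made up for illustration. It assumes the only fixed releases are the ones named in this task (PHP 5.6.24, 7.0.9, 7.1.0; HHVM 3.15.0, plus the 3.12.8 backport) and that no other branch received a backport, which may not hold for distribution-patched packages.

```python
import re

def has_exif_fix(runtime, version):
    """Return True if the version is at or past a release named in this task.

    Assumption: only the releases listed in this task carry the fix; vendor
    backports and pre-release suffixes (e.g. "7.1.0alpha1") are not considered.
    """
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if not match:
        raise ValueError("unrecognised version string: %r" % version)
    v = tuple(int(part) for part in match.groups())
    if runtime == "php":
        if v[:2] == (5, 6):
            return v >= (5, 6, 24)
        if v[:2] == (7, 0):
            return v >= (7, 0, 9)
        return v >= (7, 1, 0)
    if runtime == "hhvm":
        if v[:2] == (3, 12):
            return v >= (3, 12, 8)   # backport mentioned in this task
        return v >= (3, 15, 0)
    raise ValueError("unknown runtime: %r" % runtime)

print(has_exif_fix("php", "5.6.23"), has_exif_fix("hhvm", "3.15.0"))
```

On a PHP host the input string could come from `php -v` or `PHP_VERSION`; any trailing distribution suffix after the three numeric components is ignored by the pattern.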

For deployment of these fixes on Wikimedia wikis, let's continue at T140419.

(The patch was also backported to HHVM 3.12.8.)