
Camera model in EXIF data is wrongly parsed for some photos
Closed, Duplicate · Public

Description

I have recently been uploading files from dvidshub.com and noticed several where the EXIF data is displayed incorrectly. The EXIF data in the original files displays perfectly well in photo processing tools.

Example:
File:An Air Force Chief - 39 years old - Army Ranger School - Why not? (Image 1 of 3) 160519-F-HA938-001.jpg
The camera model should display as "Nikon D4" and the make as "NIKON CORPORATION", but both seem to have been overwritten by a description. The error may lie in the use of auto-linking, as the corrupt text is shown as English Wikipedia links.

Event Timeline

Fae created this task. Jun 11 2016, 1:27 PM
Restricted Application added subscribers: Zppix, Poyekhali, Steinsplitter, Aklapper. Jun 11 2016, 1:27 PM
Restricted Application added a project: Multimedia. Jun 12 2016, 1:37 AM
Restricted Application added a subscriber: Matanya. Jun 13 2016, 4:44 PM

This is a bug with PHP's exif_read_data() function. I can reproduce it locally with this file. Other EXIF metadata viewers I tried deal with this file just fine.

$ php --version
PHP 5.6.11-1ubuntu3.3 (cli)
Copyright (c) 1997-2015 The PHP Group
Zend Engine v2.6.0, Copyright (c) 1998-2015 Zend Technologies
    with Zend OPcache v7.0.6-dev, Copyright (c) 1999-2015, by Zend Technologies

var_dump( exif_read_data( 'Weird_exif.jpg' ) );

array(53) {
  ["FileName"]=>
  string(14) "Weird_exif.jpg"
  ["FileDateTime"]=>
  int(1465837563)
  ["FileSize"]=>
  int(3186305)
  ["FileType"]=>
  int(2)
  ["MimeType"]=>
  string(10) "image/jpeg"
  ["SectionsFound"]=>
  string(30) "ANY_TAG, IFD0, THUMBNAIL, EXIF"
  ["COMPUTED"]=>
  array(9) {
    ["html"]=>
    string(26) "width="3970" height="2720""
    ["Height"]=>
    int(2720)
    ["Width"]=>
    int(3970)
    ["IsColor"]=>
    int(1)
    ["ByteOrderMotorola"]=>
    int(0)
    ["ApertureFNumber"]=>
    string(5) "f/2.8"
    ["Copyright"]=>
    string(14) "u▒6▒▒▒▒▒▒"
    ["Thumbnail.FileType"]=>
    int(2)
    ["Thumbnail.MimeType"]=>
    string(10) "image/jpeg"
  }
  ["ImageDescription"]=>
  NULL
  ["Make"]=>
  string(18) "Tech. Sgt. Angelit"
  ["Model"]=>
  string(9) "a Lawrenc"
  ["Orientation"]=>
  int(1)
  ["XResolution"]=>
  string(21) "1308634725/1313819465"
  ["YResolution"]=>
  string(21) "1380926240/1095913296"
  ["ResolutionUnit"]=>
  int(2)
  ["Artist"]=>
  string(28) "Tech. Sgt. Angelita Lawrence"
  ["Copyright"]=>
  string(14) "u▒6▒▒▒▒▒▒"
  ["Exif_IFD_Pointer"]=>
  int(464)
  ["THUMBNAIL"]=>
  array(6) {
    ["Compression"]=>
    int(6)
    ["XResolution"]=>
    string(5) "300/1"
    ["YResolution"]=>
    string(5) "300/1"
    ["ResolutionUnit"]=>
    int(2)
    ["JPEGInterchangeFormat"]=>
    int(1206)
    ["JPEGInterchangeFormatLength"]=>
    int(9326)
  }
  ["ExposureTime"]=>
  string(6) "1/1000"
  ["FNumber"]=>
  string(5) "28/10"
  ["ExposureProgram"]=>
  int(1)
  ["ISOSpeedRatings"]=>
  int(200)
  ["UndefinedTag:0x8830"]=>
  int(2)
  ["ExifVersion"]=>
  string(4) "0230"
  ["ShutterSpeedValue"]=>
  string(15) "9965784/1000000"
  ["ApertureValue"]=>
  string(15) "2970854/1000000"
  ["ExposureBiasValue"]=>
  string(3) "0/6"
  ["MaxApertureValue"]=>
  string(5) "30/10"
  ["MeteringMode"]=>
  int(5)
  ["LightSource"]=>
  int(0)
  ["Flash"]=>
  int(0)
  ["FocalLength"]=>
  string(6) "340/10"
  ["ColorSpace"]=>
  int(65535)
  ["ExifImageWidth"]=>
  int(3970)
  ["ExifImageLength"]=>
  int(2720)
  ["SensingMethod"]=>
  int(2)
  ["FileSource"]=>
  string(1) "♥"
  ["SceneType"]=>
  string(1) ""
  ["CFAPattern"]=>
  string(8) "   "
  ["CustomRendered"]=>
  int(0)
  ["ExposureMode"]=>
  int(1)
  ["WhiteBalance"]=>
  int(0)
  ["DigitalZoomRatio"]=>
  string(3) "1/1"
  ["FocalLengthIn35mmFilm"]=>
  int(34)
  ["SceneCaptureType"]=>
  int(0)
  ["GainControl"]=>
  int(0)
  ["Contrast"]=>
  int(0)
  ["Saturation"]=>
  int(0)
  ["Sharpness"]=>
  int(0)
  ["SubjectDistanceRange"]=>
  int(0)
  ["UndefinedTag:0xA431"]=>
  string(7) "2063885"
  ["UndefinedTag:0xA432"]=>
  array(4) {
    [0]=>
    string(6) "240/10"
    [1]=>
    string(6) "700/10"
    [2]=>
    string(5) "28/10"
    [3]=>
    string(5) "28/10"
  }
  ["UndefinedTag:0xA434"]=>
  string(18) "24.0-70.0 mm f/2.8"
}

Here's somebody running into the same problem: http://stackoverflow.com/questions/30590310/php-exif-read-data-returns-wrong-and-incomplete-data

Make and Model are the most obviously wrong ones, but Copyright, XResolution and YResolution also seem broken.

I looked at some other files and this doesn't seem to be an isolated issue. This file: https://commons.wikimedia.org/wiki/File:U.S._Marines_Prepare_to_board_an_MV-22_Osprey_160509-M-AF202-041.jpg also has broken XResolution and YResolution (some other fields look weird too, but I don't know enough about the expected values to tell for sure).

At least six of the affected fields (Copyright, ImageDescription, Make, Model, XResolution and YResolution) are being read from a position 30 bytes before their actual location in the file. This results in Copyright being garbage, ImageDescription being NULL, pieces of it being read into Make and Model, and pieces of Make and Model being read into XResolution and YResolution. I didn't try to check other fields.
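
As a sanity check on the offset theory, the two broken rationals from the dump above can be packed back into raw bytes. This is a small sketch in Python (chosen only so that it does not depend on exif_read_data()). It assumes the file is little-endian, as ByteOrderMotorola = 0 suggests, and that an EXIF RATIONAL is a pair of unsigned 32-bit integers. rational_to_bytes() is a hypothetical helper name.

```python
import struct


def rational_to_bytes(text):
    """Pack a 'numerator/denominator' string into the 8 bytes of a
    little-endian EXIF RATIONAL (two unsigned 32-bit integers)."""
    num, den = (int(part) for part in text.split("/"))
    return struct.pack("<II", num, den)


# Values copied from the var_dump() output above.
x_raw = rational_to_bytes("1308634725/1313819465")  # XResolution
y_raw = rational_to_bytes("1380926240/1095913296")  # YResolution

print(x_raw)
print(y_raw)
```

Under those assumptions the 16 bytes come out as "e.", a NUL terminator, and then "NIKON CORPORA". That is the tail of the preceding string followed by the start of the real Make value, which fits the picture of Make data being read into XResolution and YResolution.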

I guess something in exif_read_data() is calculating offsets wrong. (The code for that looks horrible and I didn't try debugging it, I just looked at the file in a hex editor and compared to the data we're getting.)

This seems to occur for the EXIF fields where the actual data is placed earlier in the file than the structure describing it. I think I have actually debugged it; tomorrow I'll check whether my fix works, then submit a bug report and patch upstream to PHP.
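
The layout described above (value bytes sitting at a lower offset than the directory entry that points to them) can be reproduced with a tiny synthetic TIFF block. The sketch below is in Python, not PHP, and is not the code from exif_read_data(). build_tiff() and read_ascii_entries() are hypothetical helper names, and the buffer is hand-made rather than taken from the affected file. It only illustrates that such a layout reads back fine when the value offset is taken relative to the start of the TIFF header.

```python
import struct

TAG_MODEL = 0x0110   # TIFF/EXIF tag number for Model
TYPE_ASCII = 2       # TIFF field type for NUL-terminated ASCII


def build_tiff():
    """Build a minimal little-endian TIFF block whose only string value
    is stored BEFORE the IFD that references it."""
    model = b"NIKON D4\x00"                     # 9 bytes including NUL
    data_offset = 8                             # right after the 8-byte header
    ifd_offset = data_offset + len(model) + 1   # 18, padded to an even offset
    header = b"II" + struct.pack("<HI", 42, ifd_offset)
    entry = struct.pack("<HHII", TAG_MODEL, TYPE_ASCII, len(model), data_offset)
    ifd = struct.pack("<H", 1) + entry + struct.pack("<I", 0)  # no next IFD
    return header + model + b"\x00" + ifd


def read_ascii_entries(tiff):
    """Return (tag, text, stored_before_ifd) for each out-of-line ASCII entry."""
    if tiff[:2] != b"II":
        raise ValueError("this sketch only handles little-endian data")
    _magic, ifd_offset = struct.unpack_from("<HI", tiff, 2)
    (count,) = struct.unpack_from("<H", tiff, ifd_offset)
    results = []
    for i in range(count):
        tag, field_type, n, value_offset = struct.unpack_from(
            "<HHII", tiff, ifd_offset + 2 + 12 * i
        )
        # Values of 4 bytes or fewer are stored inline, not via an offset.
        if field_type != TYPE_ASCII or n <= 4:
            continue
        raw = tiff[value_offset:value_offset + n]
        text = raw.rstrip(b"\x00").decode("ascii")
        results.append((tag, text, value_offset < ifd_offset))
    return results


entries = read_ascii_entries(build_tiff())
print(entries)
```

The third element of each tuple flags values stored ahead of their IFD, which is the case that appears to trip up the offset calculation.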

matmarex claimed this task. Jun 14 2016, 1:45 AM