Page MenuHomePhabricator

File does not thumbnail, doesn't have extracted metadata, has reported zero width/height (due to garbage bytes between JPEG sections)
Closed, ResolvedPublic

Event Timeline

matmarex created this task.Oct 19 2016, 5:52 AM
Restricted Application added projects: Multimedia, Commons. · View Herald TranscriptOct 19 2016, 5:52 AM
Restricted Application added a subscriber: Aklapper. · View Herald Transcript

The file contains lots of garbage null bytes between JPEG sections. This technically makes it an invalid JPEG file, but it seems that the normal behavior of all tools that handle JPEG files is to just ignore these bytes, or possibly display a warning; we shouldn't be failing on them.

Restricted Application added subscribers: Poyekhali, Matanya. · View Herald TranscriptOct 19 2016, 6:09 AM

Change 316737 had a related patch set uploaded (by Bartosz Dziewoński):
JpegMetadataExtractor: Don't fail when garbage bytes are present between JPEG sections

https://gerrit.wikimedia.org/r/316737

The patch above fixes out metadata extractor, but the same bug is present in PHP's function getimagesize(), which we also use. I think we'd have to fix it somewhere around here… https://github.com/php/php-src/blob/c8fe175c508e635ccf04b5c90913e1933e60280f/ext/standard/image.c#L475

Change 316737 merged by jenkins-bot:
JpegMetadataExtractor: Don't fail when garbage bytes are present between JPEG sections

https://gerrit.wikimedia.org/r/316737

Thumbnail generation is only failing because we can't extract the metadata (in particular, the width/height from getimagesize()).

(Note that a new version of the file was uploaded without the issue, you can still see the old one in file history.)

matmarex claimed this task.Oct 20 2016, 2:39 AM
matmarex triaged this task as Low priority.

A nice surprise, this is already fixed in PHP 5.6.26:

I guess I just need to port that fix to HHVM.

My HHVM patch was merged. Now looking into getting it backported, and deployed in WMF production.

My HHVM patch was merged. Now looking into getting it backported, and deployed in WMF production.

@matmarex: Is there a specific task or Gerrit patch to follow?

No, @MoritzMuehlenhoff said we regularly upgrade to HHVM 3.12 point releases. Upstream said on the pull request that this'll be in the next one.

Oh, 3.12.11 was just released 5 days ago, and it's indeed there (https://github.com/facebook/hhvm/commits/HHVM-3.12.11). So I guess it's a question for Moritz now, when are we planning to upgrade?

matmarex renamed this task from File:Winterville Aerial View HRoe 2016.jpg does not thumbnail, doesn't have extracted metadata, has reported zero width/height to File does not thumbnail, doesn't have extracted metadata, has reported zero width/height (due to garbage bytes between JPEG sections).Nov 27 2016, 12:20 AM
MarkTraceur moved this task from Untriaged to Tracking on the Multimedia board.Nov 28 2016, 5:29 PM

@MoritzMuehlenhoff Do you know when we'll have HHVM 3.12.11 (or the patch https://github.com/facebook/hhvm/pull/7435) in production?

At the moment we're running 3.12.10 plus security fixes from 3.12.11, but your the patch from https://github.com/facebook/hhvm/pull/7435 isn't present yet. We can add it the next time we upgrade HHVM.

@matmarex : Our latest HHVM package (based on 3.12.11 plus additional patches) is now fully deployed across the production cluster.

matmarex closed this task as Resolved.Feb 13 2017, 10:11 PM

Thank you @MoritzMuehlenhoff, I can confirm that the original versions of the two problematic files now thumbnail correctly and have the right metadata shown. I have restored them both. For any other currently affected files, this will be resolved after we do T32961: Run refreshImageMetadata.php --force.