Page MenuHomePhabricator

Thumbor hangs on some TIFF files
Closed, ResolvedPublic

Description

https://commons.wikimedia.org/wiki/File:J%C3%A9r%C3%B4me_Lejeune.TIF

It seems to hang with no end in sight when it tries to resize it:

2016-11-23 13:46:01,034 8841 thumbor:DEBUG [ImagesHandler] translate: {'lang': None, 'end': u'J\xe9r\xf4me_Lejeune', 'extension': u'TIF', 'language': u'commons', 'format': u'TIF', 'shard2': u'99', 'filename': u'J\xe9r\xf4me_Lejeune', 'project': u'wikipedia', 'width': u'200', 'lossy': None, 'qlow': None, 'specialpath': None, 'seek': None, 'shard1': u'9', 'page': None}
2016-11-23 13:46:01,035 8841 thumbor:DEBUG [Swift] get: u'wikipedia-commons-local-thumb.99' u'thumbor/9/99/J\xe9r\xf4me_Lejeune.TIF/200px-J\xe9r\xf4me_Lejeune.TIF'
2016-11-23 13:46:01,151 8841 thumbor:DEBUG [Swift] missing
2016-11-23 13:46:01,152 8841 thumbor:DEBUG [REQUEST_STORAGE] get: http%3A//ms-fe.svc.eqiad.wmnet/v1/AUTH_mw/wikipedia-commons-local-public.99/9/99/J%C3%A9r%C3%B4me_Lejeune.TIF
2016-11-23 13:46:01,152 8841 thumbor:DEBUG [REQUEST_STORAGE] missing
2016-11-23 13:46:01,152 8841 thumbor:DEBUG [HTTPS] load_sync: http%3A//ms-fe.svc.eqiad.wmnet/v1/AUTH_mw/wikipedia-commons-local-public.99/9/99/J%C3%A9r%C3%B4me_Lejeune.TIF
2016-11-23 13:46:01,251 8841 thumbor:DEBUG [HTTPS] return_contents: /tmp/tmpAaoS74
2016-11-23 13:46:01,252 8841 thumbor:DEBUG [Proxy] Looking for a tiff engine
2016-11-23 13:46:01,252 8841 thumbor:DEBUG [ExiftoolRunner] command: ['/usr/bin/exiftool', '-ImageSize', '-s', '-s', '-s', '/tmp/tmpAaoS74']
2016-11-23 13:46:01,252 8841 thumbor:DEBUG [ShellRunner] Command: ['/usr/bin/timeout', '--foreground', '60', '/usr/bin/exiftool', '-ImageSize', '-s', '-s', '-s', '/tmp/tmpAaoS74']
2016-11-23 13:46:01,411 8841 thumbor:DEBUG Stdout: 2100x1417

2016-11-23 13:46:01,412 8841 thumbor:DEBUG Stderr: 
2016-11-23 13:46:01,412 8841 thumbor:DEBUG Return code: 0
2016-11-23 13:46:01,412 8841 thumbor:DEBUG Duration: 159.006
2016-11-23 13:46:01,459 8841 thumbor:DEBUG [BWE] Reading the original
2016-11-23 13:46:01,460 8841 thumbor:DEBUG [REQUEST_STORAGE] put: http%3A//ms-fe.svc.eqiad.wmnet/v1/AUTH_mw/wikipedia-commons-local-public.99/9/99/J%C3%A9r%C3%B4me_Lejeune.TIF
2016-11-23 13:46:01,460 8841 thumbor:DEBUG [IM] reorientate
2016-11-23 13:46:01,461 8841 thumbor:DEBUG [IM] resize: 200.0 135.0
2016-11-23 13:46:01,614 8841 thumbor:DEBUG Format not allowed: tif
2016-11-23 13:46:01,614 8841 thumbor:DEBUG No image format specified. Retrieving from the image extension: .tiff.
2016-11-23 13:46:01,614 8841 thumbor:DEBUG Content Type of image/tiff detected.
2016-11-23 13:46:01,614 8841 thumbor:DEBUG [BWE] Defaulting to .jpg
2016-11-23 13:46:01,614 8841 thumbor:DEBUG [IM] read: '.jpg' 87
2016-11-23 13:46:01,614 8841 thumbor:DEBUG [IM] Chroma subsampling: '4:2:0'
2016-11-23 13:46:01,614 8841 thumbor:DEBUG [IM] Generating image with quality 87


2016-11-23 13:48:01,349 8841 thumbor:DEBUG [HTTPS] cleanup_temp_file: /tmp/tmpAaoS74

Revisions and Commits

rTHMBREXT Thumbor Plugins
Restricted Differential Revision

Event Timeline

Gilles renamed this task from Thumbor chokes on a specific TIF file to Thumbor hangs on some TIFF files.Nov 23 2016, 2:18 PM

Likely cause for the first example:

2016-11-23 15:09:40,841 8841 tornado.application:ERROR Future exception was never retrieved: Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/tornado/gen.py", line 215, in wrapper
    result = func(*args, **kwargs)
  File "/usr/lib/python2.7/dist-packages/thumbor/handlers/__init__.py", line 331, in finish_request
    callback=inner,
  File "/usr/lib/python2.7/dist-packages/thumbor/context.py", line 276, in queue
    self._execute_in_foreground(operation, callback)
  File "/usr/lib/python2.7/dist-packages/thumbor/context.py", line 263, in _execute_in_foreground
    result.set_result(operation())
  File "/usr/lib/python2.7/dist-packages/thumbor/handlers/__init__.py", line 267, in _load_results
    results = context.request.engine.read(image_extension, quality)
  File "/usr/lib/python2.7/dist-packages/wikimedia_thumbor/engine/proxy/proxy.py", line 132, in read
    ret = self.__getattr__('read')(extension, quality)
  File "/usr/lib/python2.7/dist-packages/wikimedia_thumbor/engine/__init__.py", line 69, in read
    return super(BaseWikimediaEngine, self).read(extension, quality)
  File "/usr/lib/python2.7/dist-packages/wikimedia_thumbor/engine/imagemagick/imagemagick.py", line 271, in read
    result = self.image.make_blob(format=extension)
  File "/usr/lib/python2.7/dist-packages/wand/image.py", line 2276, in make_blob
    with self.convert(format) as converted:
  File "/usr/lib/python2.7/dist-packages/wand/image.py", line 2200, in convert
    cloned = self.clone()
  File "/usr/lib/python2.7/dist-packages/wand/image.py", line 502, in clone
    return Image(image=self)
  File "/usr/lib/python2.7/dist-packages/wand/image.py", line 1989, in __init__
    self.raise_exception()
  File "/usr/lib/python2.7/dist-packages/wand/resource.py", line 218, in raise_exception
    raise e
CoderError: Failed to read custom directory at offset 3003040. `TIFFReadCustomDirectory' @ error/tiff.c/TIFFErrors/561

Same error coming from the second example.

OK, I was able to reproduce this with the following minimal case:

from wand import image
from wand.api import library

img = image.Image()

with open('Hafnia_alvei.tif') as f:
    buffer = f.read()

img.read(blob=buffer)

with open('out.jpg', 'wb') as f:
    f.write(img.make_blob(format='jpg'))

Next, I wanted a backtrace for the TIFFReadCustomDirectory call, so I used gdb:

# gdb --quiet --args python test.py
Reading symbols from python...(no debugging symbols found)...done.
(gdb) break TIFFReadCustomDirectory
Function "TIFFReadCustomDirectory" not defined.
Make breakpoint pending on future shared library load? (y or [n]) y
Breakpoint 1 (TIFFReadCustomDirectory) pending.
(gdb) run
Starting program: /usr/bin/python test.py
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".

Breakpoint 1, 0x00007ffff2591f70 in TIFFReadCustomDirectory () from /usr/lib/x86_64-linux-gnu/libtiff.so.5
(gdb) bt 5
#0  0x00007ffff2591f70 in TIFFReadCustomDirectory () from /usr/lib/x86_64-linux-gnu/libtiff.so.5
#1  0x00007ffff27fbb6a in TIFFGetEXIFProperties (image=<optimized out>, tiff=<optimized out>) at ../../coders/tiff.c:692
#2  ReadTIFFImage (image_info=0xb4b070, exception=0x565e68) at ../../coders/tiff.c:1143
#3  0x00007ffff584a406 in ReadImage (image_info=0xb333e0, exception=0x565e68) at ../../magick/constitute.c:547
#4  0x00007ffff5816240 in BlobToImage (image_info=0xb13d80, blob=0x7ffff2a01034, length=2693540, exception=0xafc650) at ../../magick/blob.c:376
(More stack frames follow...)

From the backtrace, we can see that TIFFReadCustomDirectory is called by TIFFGetEXIFProperties. Could we try to skip reading EXIF data? The ImageMagick docs say we can: "Use -define tiff:exif-properties=false to skip reading the EXIF properties."

Since we're interfacing with ImageMagick via its C API rather than the command line, we have to do this a bit differently. We can add this line right after the call to image.Image():

library.MagickSetImageOption(img.wand, 'tiff', 'exif-properties', 'no')

Does the conversion complete successfully, now? Yes, it does -- for all three images cited above.

(Note that thumbor sets ImageMagick options a bit differently -- it monkey-patches wand.image.OPTIONS, and then uses img.options['some-option'] = 'value'. Doing it that way would work, too.)

The options has to be set before the call to im.read in Engine.create_image, but wand only checks for errors when we try to convert. You can force error-checking in create_image by doing something like this:

from wand.exceptions import CoderError

def read_tiff(buffer):
    img = image.Image()
    img.read(blob=buffer)
    try:
        img.raise_exception()
    except CoderError as e:
        if 'TIFFReadCustomDirectory' in e.message:
            # Bad EXIF; try again but skip EXIF data.
            img = image.Image()
            library.MagickSetImageOption(
                img.wand, 'tiff', 'exif-properties', 'no')
            img.read(blob=buffer)
        else:
            raise
    return img
Gilles added a revision: Restricted Differential Revision.Nov 24 2016, 2:30 PM

Change 324919 had a related patch set uploaded (by Gilles):
Upgrade to 0.1.30

https://gerrit.wikimedia.org/r/324919

Change 324919 merged by Filippo Giunchedi:
Upgrade to 0.1.30

https://gerrit.wikimedia.org/r/324919