Specific Thumbnail generation for a broken invalid TIF file not working anymore
Open, Needs Triage, Public

Description

Since an unknown date, thumbnail generation for a .tif file on Commons has stopped working. The file:
https://commons.wikimedia.org/wiki/File:Tessie_Reynolds_02.tif
It was still working at least 1.5 years ago, when I edited the file description. The file is still in use in the German article. In enWP, its usage has currently been replaced with the JPG equivalent:
https://commons.wikimedia.org/wiki/File:Tessie_Reynolds_02.jpg

It should be checked whether only this file is broken (it opens fine in an image viewer under Windows) or whether other files could be affected.

Best regards,
#Reaper

Event Timeline

Aklapper renamed this task from Thumbnail generation for TIF file no longer working to Specific Thumbnail generation for TIF file not working anymore. Apr 23 2016, 9:54 AM
Aklapper added a project: SRE-swift-storage.

Downloading https://upload.wikimedia.org/wikipedia/commons/7/7e/Tessie_Reynolds_02.tif and using libtiff-4.0.10 to run tiffinfo -D Tessie_Reynolds_02.tif, I get:

TIFFFetchDirectory: Can not read TIFF directory count.
TIFFReadCustomDirectory: Failed to read custom directory at offset 4048276.

Opening the file with gimp-2.10.8 succeeds, but I get a warning:

** (file-tiff:10145): CRITICAL **: 18:06:03.085: Directory Image, entry 0x8769 Sub-IFD pointer 0 is out of bounds; ignoring it.

https://commons.wikimedia.org/wiki/File:Tessie_Reynolds_02.tif has been deleted, which makes it rather more difficult to debug.

https://commons.wikimedia.org/wiki/File:Ophichthys_rhytidoderma_-_1864_-_Print_-_Iconographia_Zoologica_-_Special_Collections_University_of_Amsterdam_-_UBA01_IZ15200057.tif was fixed in T219569. The wiki wouldn't affect the thumbnail directly, but it would affect the size. It's possible that the 220px and 266px thumbnails were generated and cached before the bug was introduced, but the 291px and 160px images were not.

I've gone ahead and undeleted the file for your debugging purposes.

This file is below the minimum area to use Vips, so it's processed entirely with ImageMagick. Looking at the TIFF with ImageMagick throws these errors:

$ identify -verbose Tessie_Reynolds_02.tif > /dev/null
identify: Can not read TIFF directory count. `TIFFFetchDirectory' @ error/tiff.c/TIFFErrors/661.
identify: Failed to read custom directory at offset 4048276. `TIFFReadCustomDirectory' @ error/tiff.c/TIFFErrors/661.

but it still displays the TIFF and thumbnails it correctly.

The problem here isn't in Thumbor, it's in the file and how MediaWiki handles it. The file description page shows ‎(0 × 0 pixels, file size: 2.82 MB, MIME type: image/tiff, 0 pages) and MediaWiki doesn't even attempt to ask Thumbor for thumbnails for the file. Looking at https://commons.wikimedia.org/wiki/Special:ApiSandbox#action=query&format=json&prop=imageinfo&titles=File%3ATessie%20Reynolds%2002.tif&formatversion=2&iiprop=url%7Cmetadata shows MediaWiki can't get metadata for the image with the error

tiffinfo command failed: '/usr/bin/tiffinfo' '/tmp/localcopy_f2c442080d13.tif' 2>&1

Running tiffinfo myself produces some output, but it throws a similar error and exits with status 1 (as Aklapper previously noted):

$ tiffinfo Tessie_Reynolds_02.tif
TIFF Directory at offset 0x2d1a3a (2955834)
  Subfile Type: (0 = 0x0)
  Image Width: 1309 Image Length: 1028
  Resolution: 300, 300 pixels/inch
  Bits/Sample: 8
  Compression Scheme: LZW
  Photometric Interpretation: RGB color
  Samples/Pixel: 3
  Rows/Strip: 3
  Planar Configuration: single image plane
  Software: Adobe Photoshop 7.0
  DateTime: 2013:08:21 09:16:16
  XMLPacket (XMP Metadata):
<?xpacket begin='' id='W5M0MpCehiHzreSzNTczkc9d'?>
<?adobe-xap-filters esc="CR"?>
<x:xapmeta xmlns:x="adobe:ns:meta/" x:xaptk="XMP toolkit 2.8.2-33, framework 1.5">
	<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:iX="http://ns.adobe.com/iX/1.0/">
		<rdf:Description about="" xmlns:xapMM="http://ns.adobe.com/xap/1.0/mm/">
			<xapMM:DocumentID>adobe:docid:photoshop:287cadbd-0a2d-11e3-974f-e9ef62941d48</xapMM:DocumentID>
		</rdf:Description>
		<rdf:Description xmlns:xmp="http://ns.adobe.com/xap/1.0/"><xmp:CreatorTool>Adobe Photoshop 7.0</xmp:CreatorTool></rdf:Description></rdf:RDF>
</x:xapmeta>
***39 blank lines removed***                               
                                                         <?xpacket end='w'?>
  Photoshop Data: <present>, 6366 bytes
  EXIFIFDOffset: 0x3dc594
  Predictor: none 1 (0x1)
TIFFFetchDirectory: Can not read TIFF directory count.
TIFFReadCustomDirectory: Failed to read custom directory at offset 4048276.

It's always fun when I have to break out the hex editor. The specific file had an Exif IFD tag (0x8769) pointing to a byte offset where Exif metadata should be. That byte offset, 0x3DC594, was way past the end of the file at 0x2D1B30 (originally mistyped as 0x2DB1B30). There wasn't any space between the known contents of the image for EXIF metadata to be hiding. I don't know where the malformed EXIF tag came from, but since there's no corresponding EXIF data to be found, I removed it.
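The hex-editor check described above can be approximated in a few lines. This is only a minimal sketch, assuming a classic (non-BigTIFF) file and looking only at the first IFD; the function names `find_exif_ifd_offset` and `exif_pointer_is_dangling` are made up for illustration and are not part of libtiff or MediaWiki.

```python
import struct

EXIF_IFD_TAG = 0x8769  # Exif IFD pointer tag mentioned in the comment above


def find_exif_ifd_offset(data: bytes):
    """Return the Exif IFD offset stored in the first IFD, or None if absent."""
    # Bytes 0-1 give the byte order: b"II" is little-endian, b"MM" is big-endian.
    if data[:2] == b"II":
        endian = "<"
    elif data[:2] == b"MM":
        endian = ">"
    else:
        raise ValueError("not a TIFF file")
    # Bytes 2-3 hold the magic number 42, bytes 4-7 the offset of the first IFD.
    magic, ifd_offset = struct.unpack(endian + "HI", data[2:8])
    if magic != 42:
        raise ValueError("not a classic TIFF file")
    (entry_count,) = struct.unpack_from(endian + "H", data, ifd_offset)
    for i in range(entry_count):
        # Each IFD entry is 12 bytes: tag, field type, value count, value/offset.
        tag, _type, _count, value = struct.unpack_from(
            endian + "HHII", data, ifd_offset + 2 + 12 * i
        )
        if tag == EXIF_IFD_TAG:
            return value
    return None


def exif_pointer_is_dangling(data: bytes) -> bool:
    """True if the Exif IFD pointer leaves no room for even its 2-byte entry count."""
    offset = find_exif_ifd_offset(data)
    return offset is not None and offset + 2 > len(data)
```

Reading the file with `open("Tessie_Reynolds_02.tif", "rb").read()` and passing the bytes to `exif_pointer_is_dangling` would be expected to flag the original upload, given the offsets quoted above, but that has not been run against the actual file here.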

I think MediaWiki should handle this better, but I don't know if there's a great way to do that. tiffinfo doesn't tell you that the directory it can't find is just EXIF metadata, and not another image that should be included. Trying to parse the tiffinfo output anyway and then validating that data instead of checking for a return value >0 might work. Falling back to ImageMagick might also be an option.
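To make the "parse the output anyway" idea concrete, here is a rough Python sketch. It is not MediaWiki's code (which is PHP) and is not a proposed patch; `parse_dimensions` and `lenient_tiff_dimensions` are hypothetical names, and the regular expression is based only on the tiffinfo output format shown earlier in this task.

```python
import re
import subprocess

# Matches the "Image Width: 1309 Image Length: 1028" line seen in tiffinfo output.
DIMENSIONS_RE = re.compile(r"Image Width:\s*(\d+)\s+Image Length:\s*(\d+)")


def parse_dimensions(tiffinfo_output: str):
    """Return (width, height) of the first directory in the output, or None."""
    match = DIMENSIONS_RE.search(tiffinfo_output)
    if match is None:
        return None
    width, height = int(match.group(1)), int(match.group(2))
    # Validate the parsed data instead of trusting the exit status.
    if width <= 0 or height <= 0:
        return None
    return width, height


def lenient_tiff_dimensions(path: str, tiffinfo: str = "tiffinfo"):
    """Run tiffinfo and use whatever it printed, even on a non-zero exit status."""
    result = subprocess.run(
        [tiffinfo, path],
        capture_output=True,
        encoding="utf-8",
        errors="replace",  # embedded metadata may not be valid UTF-8
    )
    return parse_dimensions(result.stdout + result.stderr)
```

With this approach a file like the one above would still yield 1309 × 1028 despite the failed Exif directory read, while output containing only error lines would still be rejected. Whether a partially readable file should be accepted at all is a policy question this sketch does not settle.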

Did you accidentally leave out or add in a hex digit, or swap the values? 0x3DC594 (the offset you stated; note it has six hexadecimal digits) is way less than 0x2DB1B30 (the file size you stated; note it has seven hexadecimal digits). Even if it's meant to be an offset from the TIFF directory (located at 0x2D1A3A per the tiffinfo output from your previous comment), that only gets up to an absolute offset of around 0x6ADFCE.
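The arithmetic behind this exchange can be checked directly. The snippet below assumes that 0x2D1B30 is the intended file size, since that is the value that agrees with the 2.82 MB shown on the file description page; the variable names are illustrative only.

```python
exif_offset = 0x3DC594       # Exif IFD offset reported by tiffinfo
directory_offset = 0x2D1A3A  # TIFF directory offset reported by tiffinfo
six_digit_size = 0x2D1B30    # assumed intended end-of-file value
seven_digit_size = 0x2DB1B30 # seven-digit value questioned in the comment above

# The hex offset is the same number as the decimal offset in the error message.
assert exif_offset == 4048276

# Against the six-digit size, the Exif pointer lies past the end of the file.
assert exif_offset > six_digit_size

# Against the seven-digit value, it would appear to lie well inside the file.
assert exif_offset < seven_digit_size

# The six-digit size is the one consistent with a 2.82 MB file.
assert round(six_digit_size / 2**20, 2) == 2.82

# Treating the offset as relative to the directory gives the sum quoted above.
assert directory_offset + exif_offset == 0x6ADFCE
```

All of the assertions pass, which suggests the confusion came from the extra digit rather than from swapped values.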

Aklapper renamed this task from Specific Thumbnail generation for TIF file not working anymore to Specific Thumbnail generation for a broken invalid TIF file not working anymore. May 16 2020, 4:11 PM

Another file, File:SC_212246_Surrender_of_Japan,_Tokyo_Bay,_2_September_1945.tif, has been found to have the same issue.
Some of the error messages shown when accessing the generated thumbnails can be found at VP/T. I got a different error message here:

Request from [redacted] via cp1076 cp1076, Varnish XID 323258569
Upstream caches: cp1076 int
Error: 429, Too Many Requests at Sat, 26 Mar 2022 15:29:38 GMT

image.png (535×1 px, 111 KB)
image.png (457×769 px, 114 KB)