
PDF containing jpx/jp2 encoded images from Archive.org uploaded to Commons won't thumbnail
Closed, ResolvedPublic

Description

I uploaded to the Wikimedia Commons the PDF "File:John Stuart Mill, Considerations on Representative Government (1st ed, 1861).pdf" (https://commons.wikimedia.org/wiki/File:John_Stuart_Mill,_Considerations_on_Representative_Government_(1st_ed,_1861).pdf) which I had downloaded from Archive.org, but none of the pages will thumbnail properly. I tried purging the page to no avail.

I reported this problem at the Commons Village Pump, and another editor said that when he tried to view the thumbnail at https://upload.wikimedia.org/wikipedia/commons/thumb/7/70/John_Stuart_Mill%2C_Considerations_on_Representative_Government_%281st_ed%2C_1861%29.pdf/page1-76px-John_Stuart_Mill%2C_Considerations_on_Representative_Government_%281st_ed%2C_1861%29.pdf.jpg he encountered this error message:

"Error creating thumbnail: convert: no decode delegate for this image format `/tmp/magick-hg7YMuoz' @ error/constitute.c/ReadImage/532."
"convert: missing an image filename `/tmp/transform_b1c9d0271ec9-1.jpg' @ error/convert.c/ConvertImageCommand/3011."


Version: wmf-deployment
Severity: normal

Details

Reference
bz59975

Event Timeline

bzimport raised the priority of this task from to Normal. Nov 22 2014, 2:40 AM
bzimport set Reference to bz59975.
bzimport added a subscriber: Unknown Object (MLST).
Lupo added a comment. Jan 12 2014, 8:47 PM

That PDF contains jpx/jp2 encoded images. Presumably ImageMagick's convert lacks a JPEG 2000 decoder.
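One rough way to check whether a PDF carries JPEG 2000 images is to look for the /JPXDecode filter name, which is how PDF marks JPEG 2000 image streams. The sketch below is only a heuristic over the raw bytes, not a PDF parser; the function names are made up for illustration, and a file that writes the filter name with escaped characters would be missed.

```python
from pathlib import Path

# PDF filter name used for JPEG 2000 (jpx/jp2) image streams.
JPX_FILTER = b"/JPXDecode"


def count_jpx_streams(pdf_bytes: bytes) -> int:
    """Count occurrences of the /JPXDecode filter name in raw PDF bytes.

    Heuristic only: this scans bytes and does not parse the PDF structure.
    """
    return pdf_bytes.count(JPX_FILTER)


def pdf_uses_jpeg2000(path: str) -> bool:
    """Return True if the file at `path` appears to contain JPX-encoded images."""
    # Reads the whole file into memory, which is fine for a one-off check.
    return count_jpx_streams(Path(path).read_bytes()) > 0
```

Calling pdf_uses_jpeg2000() on a local copy of the uploaded file would show whether it falls into this category before looking at the scaler side.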

Lupo added a comment. Jan 12 2014, 9:08 PM

Or ImageMagick lacks the "advanced" JPEG2000 stuff. I see that it is listed at https://en.wikipedia.org/wiki/JPEG_2000#Application_support as having only "basic" JPEG2000 support. The OpenJPEG library is listed as having the advanced stuff, but that was added to ImageMagick just a few days ago: http://www.imagemagick.org/script/changelog.php (2013-12-30).

(In reply to comment #2)

> Or ImageMagick lacks the "advanced" JPEG2000 stuff. I see that it is listed at
> https://en.wikipedia.org/wiki/JPEG_2000#Application_support as having only
> "basic" JPEG2000 support. The OpenJPEG library is listed as having the
> advanced stuff, but that was added to ImageMagick just a few days ago:
> http://www.imagemagick.org/script/changelog.php (2013-12-30).

It's probably the fault of the old-ish version of Ghostscript we use. gs converts to a JPEG file first, and then we use ImageMagick to resize. gs would be the program responsible for interpreting the JPEG2000 stuff. The error message would mention convert, because convert would choke on the lack of input when gs errors out. (And there is no Ghostscript error output because gs doesn't have its stderr redirected, only convert does, which is probably a mistake.)
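To illustrate the two-stage flow described here (gs renders a page to JPEG, then convert resizes it), here is a minimal sketch that runs each stage separately and keeps the stderr of both, so a failure in gs is reported as a gs failure instead of surfacing as a convert error. The function names, the 150 dpi default, and the exact option set are assumptions for illustration, not the commands the servers actually run.

```python
import subprocess
from typing import List


def gs_command(pdf_path: str, page: int, dpi: int = 150) -> List[str]:
    """Build a Ghostscript command that renders one page as JPEG to stdout."""
    return [
        "gs", "-sDEVICE=jpeg", "-sOutputFile=-",
        "-sstdout=%stderr",  # keep gs messages out of the image stream
        f"-dFirstPage={page}", f"-dLastPage={page}",
        f"-r{dpi}", "-dBATCH", "-dNOPAUSE", "-q", pdf_path,
    ]


def convert_command(width: int, out_path: str) -> List[str]:
    """Build an ImageMagick command that resizes an image read from stdin."""
    return ["convert", "-", "-resize", str(width), out_path]


def render_thumbnail(pdf_path: str, page: int, width: int, out_path: str) -> None:
    """Run gs, then convert, raising with the failing stage's stderr."""
    gs = subprocess.run(gs_command(pdf_path, page), capture_output=True)
    if gs.returncode != 0 or not gs.stdout:
        raise RuntimeError(
            f"gs produced no image (exit {gs.returncode}): "
            f"{gs.stderr.decode(errors='replace')}"
        )
    conv = subprocess.run(
        convert_command(width, out_path), input=gs.stdout, capture_output=True
    )
    if conv.returncode != 0:
        raise RuntimeError(
            f"convert failed (exit {conv.returncode}): "
            f"{conv.stderr.decode(errors='replace')}"
        )
```

Checking for empty gs output before invoking convert is the point of the sketch: it separates "gs gave us nothing" from "convert could not decode what gs gave us".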

> (And no ghostscript error output as gs doesn't have its stderr
> redirected, only convert does, which is probably a mistake)

Making that side issue into bug 59986.

Locally, gs complains about invalid jpx blocks, but then ignores them and renders the image.

My local (fairly old) gs version is 8.71.

> (And no ghostscript error output as gs doesn't have its stderr
> redirected, only convert does, which is probably a mistake)

I misread the source code; gs errors should have been redirected. That implies that gs is exiting with no output and no errors, which is odd.

Maybe someone could run gs without the -q option to see if that makes a difference (but on my install, -q doesn't affect stderr output...). Also, does anyone know the version of gs used on the servers?
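For the version question, a small sketch along these lines could be run on any host to report the installed Ghostscript version via gs --version. The helper names are hypothetical, and the parser simply takes the first major.minor pair it finds in the output.

```python
import re
import shutil
import subprocess
from typing import Optional, Tuple


def parse_gs_version(text: str) -> Optional[Tuple[int, int]]:
    """Extract (major, minor) from version text such as '8.71'."""
    match = re.search(r"(\d+)\.(\d+)", text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def installed_gs_version(executable: str = "gs") -> Optional[Tuple[int, int]]:
    """Return the version of the gs found on PATH, or None if unavailable."""
    if shutil.which(executable) is None:
        return None
    result = subprocess.run(
        [executable, "--version"], capture_output=True, text=True
    )
    return parse_gs_version(result.stdout)
```

Comparing the tuples from two machines (for example a local install against a scaler) makes it easy to see which side is older.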

Lupo added a comment. Aug 14 2014, 6:36 AM

Another example: [[:commons:File:Geneva Convention 1864 - CH-BAR - 29355687.pdf]]

Also contains jp2-encoded page images and won't thumbnail. File is nearly 100MB and contains 8 pages.

fgiunchedi added a subscriber: fgiunchedi.

Looks like thumbs for that PDF are still failing to generate; however, IIRC jobrunners aren't HHVM (and thus trusty) yet. Is this fixed in newer upstream versions?

fgiunchedi set Security to None.

> Looks like thumbs for that PDF are still failing to generate; however, IIRC jobrunners aren't HHVM (and thus trusty) yet. Is this fixed in newer upstream versions?

The image scalers haven't been updated yet, and they are the part that needs updating for the new version to be used. See T84842. However, I do not know whether the updated version will fix this bug.

Steinsplitter moved this task from Incoming to Backlog on the Commons board. Jun 17 2015, 11:47 AM
Jdforrester-WMF moved this task from Untriaged to Backlog on the Multimedia board. Sep 4 2015, 6:27 PM
Restricted Application added subscribers: Steinsplitter, Matanya, Aklapper. Sep 4 2015, 6:27 PM
matmarex closed this task as Resolved. Oct 16 2015, 5:20 PM
matmarex claimed this task.
matmarex added a subscriber: matmarex.

The file appears to be thumbnailing correctly today.

matmarex updated the task description. Oct 16 2015, 5:20 PM