PDF file entirely rendered as a set of blank pages
Closed, Resolved · Public

Description

https://commons.wikimedia.org/wiki/File:Os_Gatos_(v._5).pdf is rendered as a set of 308 entirely empty pages, even though https://upload.wikimedia.org/wikipedia/commons/d/d1/Os_Gatos_%28v._5%29.pdf contains a valid file.

(It is an exact duplicate of
http://purl.pt/11980/4/l-62535-p/l-62535-p_item4/l-62535-p_PDF/l-62535-p_PDF_24-C-R0150/l-62535-p_0000_capa-capa_t24-C-R0150.pdf
in case any additional issue occurs and the file on the Wikimedia repo gets broken between this report and someone taking care of this issue, which unfortunately is not impossible, given the very large backlog of reports on media rendering/uploading that nobody cares about)</sarcasm>

Event Timeline

555 raised the priority of this task to Needs Triage.
555 updated the task description.
555 subscribed.
Restricted Application added a subscriber: Aklapper.
matmarex set Security to None.

> in case any additional issue occurs and the file on the Wikimedia repo gets broken between this report and someone taking care of this issue, which unfortunately is not impossible, given the very large backlog of reports on media rendering/uploading that nobody cares about

Have you experienced any actual data loss regarding uploaded files? That would be a much more severe problem and would no doubt receive Operations attention if reported.

The given PDF file thumbnails correctly for me locally, so I reckon it must be some issue with Wikimedia wikis' configuration. My local wiki runs on Ubuntu 15 and uses GPL Ghostscript 9.15 (2014-09-22) and ImageMagick 6.8.9-9 Q16 x86_64 2015-01-06 for PDF transforms.

@Krenair checked that we use Ghostscript 9.10 in production, and I verified that it fails to handle this document. I filed T110849 requesting that we upgrade it in WMF production; but to be honest, I have no idea how feasible that is.
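To see which side of this a given machine falls on, a small check along these lines might help. It is only a sketch: it assumes a `gs` binary on the PATH, the helper names are made up for illustration, and 9.15 is simply the version reported to work in this task, not a confirmed minimum.

```python
import subprocess

# 9.15 is the version reported to work in this task; it is an assumption
# to treat it as a threshold, since the exact fixing release is not known here.
KNOWN_GOOD = (9, 15)


def parse_version(text: str) -> tuple:
    """Turn a version string such as '9.15' into a tuple of ints, e.g. (9, 15)."""
    return tuple(int(part) for part in text.strip().split("."))


def is_at_least_known_good(version_text: str) -> bool:
    """Compare a Ghostscript version string against the known-good version."""
    return parse_version(version_text) >= KNOWN_GOOD


def installed_gs_version() -> str:
    """Ask the local Ghostscript binary for its version string."""
    result = subprocess.run(
        ["gs", "--version"], capture_output=True, text=True, check=True
    )
    return result.stdout.strip()
```

For example, `is_at_least_known_good(installed_gs_version())` would be expected to be false on a host running 9.10 and true on one running 9.15.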

A separate issue is that the failure to generate a thumbnail is not detected, because Ghostscript doesn't exit with a non-zero code on this failure. I'll submit a patch to detect this (which will change the situation from all-white thumbs to an error message indicating the failure).
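As a rough illustration of the idea (not the contents of the actual patch), detection could combine the exit status with a scan of Ghostscript's diagnostic output. The marker strings below are copied from the Ghostscript 9.10 output recorded in this task; treating them as fatal is an assumption, and the function names are hypothetical. A warning in the output does not necessarily mean every page failed to render, so a check like this can produce false positives.

```python
import subprocess

# Markers taken from the Ghostscript 9.10 output quoted in this task.
# Treating them as fatal is an assumption made for illustration only.
FAILURE_MARKERS = (
    "failed to decode image",
    "File has insufficient data for an image",
)


def looks_failed(exit_code: int, output_text: str) -> bool:
    """Return True if the exit status or the diagnostics suggest a failed render."""
    if exit_code != 0:
        return True
    return any(marker in output_text for marker in FAILURE_MARKERS)


def render_page(pdf_path: str, out_path: str, page: int = 1) -> bool:
    """Render one page to JPEG with Ghostscript; return True if it looks successful."""
    cmd = [
        "gs", "-dBATCH", "-dNOPAUSE", "-sDEVICE=jpeg",
        f"-dFirstPage={page}", f"-dLastPage={page}",
        f"-sOutputFile={out_path}", pdf_path,
    ]
    proc = subprocess.run(cmd, capture_output=True, text=True)
    # The warnings may land on stdout or stderr, so both streams are checked.
    return not looks_failed(proc.returncode, proc.stdout + proc.stderr)
```

Keeping `looks_failed` separate from the subprocess call makes the matching rule easy to test and to loosen if it turns out to be too strict.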

Change 234918 had a related patch set uploaded (by Bartosz Dziewoński):
Detect failed transforms when Ghostscript doesn't report error via status

https://gerrit.wikimedia.org/r/234918

For future reference, here's the error that Ghostscript 9.10 produces:

openjpeg: failed to decode image!

   **** Warning: File has insufficient data for an image.
openjpeg: failed to decode image!

   **** Warning: File has insufficient data for an image.

   **** This file had errors that were repaired or ignored.
   **** The file was produced by:
   **** >>>> LuraDocument PDF v2.15 <<<<
   **** Please notify the author of the software that produced this
   **** file that it does not conform to Adobe's published PDF
   **** specification.

Since Ghostscript 9.15 handles it fine, I'm guessing that the "does not conform to Adobe's published PDF specification" assertion is incorrect, but the file is clearly funny in some way. Perhaps rebuilding it somehow using a different application will help, if you're looking for a workaround.

matmarex triaged this task as Medium priority. Aug 30 2015, 10:14 PM
matmarex removed a project: Wikimedia-Media-storage.

I suspect that this is Ghostscript bug 694837, in which case b2ecdbba02 contains the fix.

Change 234918 abandoned by Bartosz Dziewoński:
Detect failed transforms when Ghostscript doesn't report errors via exit status

Reason:
Per Ori, this could actually detect some mostly working thumbnails as failed. Thanks for the explanation.

https://gerrit.wikimedia.org/r/234918

The file renders correctly today.