
PDF file has "0 x 0 pixels" and no thumbnail when shown on mediawiki.org but works fine on Commons
Closed, Resolved (Public)

Description

This file is not displaying properly at mediawikiwiki, but it looks fine on Commons.
I've tried purging the file page at Commons and Mediawikiwiki, and purging a page it is used on. That didn't help.
The only clue is the "0 x 0 pixels" metadata. Possibly that means it is related to T192866?

https://www.mediawiki.org/wiki/File:Mobile_Page_Issues_Research.pdf (0 × 0 pixels, file size: 1.32 MB)
https://commons.wikimedia.org/wiki/File:Mobile_Page_Issues_Research.pdf (1,500 × 843 pixels, file size: 1.32 MB)

Missing thumbnail generation can be seen at https://www.mediawiki.org/w/index.php?title=User:CKoerner_(WMF)/sandbox&oldid=2765329 where the link on the right should actually be a normal thumbnail.

Event Timeline

I'm seeing 0x0 and broken thumbnails on Commons itself for a bunch of other recently uploaded PDF files. Could this be a related issue, with some machines missing something used for metadata loading?

https://commons.wikimedia.org/wiki/Special:NewFiles?user=&mediatype%5B%5D=OFFICE&start=&end=&wpFormIdentifier=specialnewimages&limit=50&offset=

This may be an issue with inconsistent versions of poppler-utils/pdfinfo being installed. In local testing with poppler 0.48.0 on Debian Stretch, the -meta (metadata) and -l 9999999 (give sizes for pages up to 9999999) options seem to be at odds: with both set, pdfinfo prints nothing.

On the older 0.26.5 in Jessie it works ok.

# Stretch vagrant vm
vagrant@vagrant:/srv/images/5/59$ pdfinfo -enc UTF-8 -l 9999999 -meta  Mobile_Page_Issues_Research.pdf 
vagrant@vagrant:/srv/images/5/59$ pdfinfo -enc UTF-8 -l 9999999  Mobile_Page_Issues_Research.pdf 
Creator:        Google
Tagged:         no
UserProperties: no
Suspects:       no
Form:           none
Internal Error: xref num 3 not found but needed, try to reconstruct<0a>
JavaScript:     no
Pages:          21
Encrypted:      no
Page    1 size: 720 x 405 pts
Page    1 rot:  0
Page    2 size: 720 x 405 pts
Page    2 rot:  0
Page    3 size: 720 x 405 pts
Page    3 rot:  0
Page    4 size: 720 x 405 pts
Page    4 rot:  0
Page    5 size: 720 x 405 pts
Page    5 rot:  0
Page    6 size: 720 x 405 pts
Page    6 rot:  0
Page    7 size: 720 x 405 pts
Page    7 rot:  0
Page    8 size: 720 x 405 pts
Page    8 rot:  0
Page    9 size: 720 x 405 pts
Page    9 rot:  0
Page   10 size: 720 x 405 pts
Page   10 rot:  0
Page   11 size: 720 x 405 pts
Page   11 rot:  0
Page   12 size: 720 x 405 pts
Page   12 rot:  0
Page   13 size: 720 x 405 pts
Page   13 rot:  0
Page   14 size: 720 x 405 pts
Page   14 rot:  0
Page   15 size: 720 x 405 pts
Page   15 rot:  0
Page   16 size: 720 x 405 pts
Page   16 rot:  0
Page   17 size: 720 x 405 pts
Page   17 rot:  0
Page   18 size: 720 x 405 pts
Page   18 rot:  0
Page   19 size: 720 x 405 pts
Page   19 rot:  0
Page   20 size: 720 x 405 pts
Page   20 rot:  0
Page   21 size: 720 x 405 pts
Page   21 rot:  0
File size:      1385436 bytes
Optimized:      no
PDF version:    1.4
vagrant@vagrant:/srv/images/5/59$ pdfinfo -enc UTF-8  -meta  Mobile_Page_Issues_Research.pdf

Looks like that was reported by third-party users as T117839 some while back. It might just be hitting us now due to a package change?
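A minimal Python sketch of one way to cope with the behaviour in the transcript above: call pdfinfo twice, once with -l for the page sizes and once with -meta for the metadata, so the two options never appear in the same invocation. This is an illustration only, not the PdfHandler patch itself (PdfHandler is written in PHP). The names `run_pdfinfo` and `collect_pdf_info` are hypothetical; the only pdfinfo options used are the ones shown in the transcript.

```python
import subprocess
from typing import Callable, Dict, List

# A runner takes the pdfinfo arguments and returns its stdout as text.
Runner = Callable[[List[str]], str]


def run_pdfinfo(args: List[str]) -> str:
    """Run pdfinfo (assumed to be on PATH) and return whatever it printed."""
    result = subprocess.run(
        ["pdfinfo", *args], capture_output=True, text=True, check=False
    )
    return result.stdout


def collect_pdf_info(path: str, run: Runner = run_pdfinfo) -> Dict[str, str]:
    """Gather page sizes and metadata in two separate pdfinfo calls."""
    # First call: document info plus per-page sizes, without -meta.
    text_dump = run(["-enc", "UTF-8", "-l", "9999999", path])
    # Second call: metadata only, without -l.
    metadata = run(["-enc", "UTF-8", "-meta", path])
    return {"text": text_dump, "metadata": metadata}
```

Passing the runner as a parameter keeps the function testable on machines without poppler-utils installed.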

Change 429356 had a related patch set uploaded (by Brion VIBBER; owner: Brion VIBBER):
[mediawiki/extensions/PdfHandler@master] Fix for pdfinfo changes in poppler-utils 0.48

https://gerrit.wikimedia.org/r/429356

@MoritzMuehlenhoff logs show poppler-utils was updated for security recently; if that involved updating some machines from a much older version to a newer one, it might be triggering this. If so, the patch to PdfHandler above *should* resolve the issue and stay compatible with both old and new poppler-utils versions.

Looks like more Debian Stretch api & app servers got into the mix in the last few days, which probably explains it.

Some portion of upload requests will go through the Stretch machines, which'll fail to get metadata from pdfinfo, and then subsequent thumbnail operations on those files fail because they lack metadata.

I'll schedule a swat deploy for the hotfix.
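To illustrate why empty pdfinfo output ends up as "0 x 0 pixels", here is a rough Python sketch that parses the "Page N size" lines from the transcript and converts points to pixels. The 150 DPI rendering resolution is an assumption; it is chosen because it reproduces the 1,500 × 843 figure reported on Commons for a 720 x 405 pt page. The function names are hypothetical and this is not PdfHandler's actual code.

```python
import re
from typing import Dict, Tuple

# Matches lines such as "Page    1 size: 720 x 405 pts".
PAGE_SIZE_RE = re.compile(
    r"^Page\s+(\d+)\s+size:\s+([\d.]+)\s+x\s+([\d.]+)\s+pts", re.MULTILINE
)


def page_sizes_pts(pdfinfo_output: str) -> Dict[int, Tuple[float, float]]:
    """Map page number to (width, height) in points."""
    return {
        int(num): (float(width), float(height))
        for num, width, height in PAGE_SIZE_RE.findall(pdfinfo_output)
    }


def first_page_pixels(pdfinfo_output: str, dpi: int = 150) -> Tuple[int, int]:
    """Pixel size of page 1; (0, 0) when pdfinfo gave no size lines."""
    sizes = page_sizes_pts(pdfinfo_output)
    if 1 not in sizes:
        # Nothing to work with, e.g. pdfinfo printed nothing at all.
        return (0, 0)
    width_pts, height_pts = sizes[1]
    # One point is 1/72 inch; truncate to whole pixels.
    return (int(width_pts * dpi / 72), int(height_pts * dpi / 72))
```

With the Stretch output for the combined -meta and -l call (an empty string), this yields (0, 0); with the page lines from the working call it yields (1500, 843).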

Change 429358 had a related patch set uploaded (by Reedy; owner: Brion VIBBER):
[mediawiki/extensions/PdfHandler@wmf/1.32.0-wmf.1] Fix for pdfinfo changes in poppler-utils 0.48

https://gerrit.wikimedia.org/r/429358

Change 429359 had a related patch set uploaded (by Reedy; owner: Brion VIBBER):
[mediawiki/extensions/PdfHandler@REL1_31] Fix for pdfinfo changes in poppler-utils 0.48

https://gerrit.wikimedia.org/r/429359

Change 429358 merged by jenkins-bot:
[mediawiki/extensions/PdfHandler@wmf/1.32.0-wmf.1] Fix for pdfinfo changes in poppler-utils 0.48

https://gerrit.wikimedia.org/r/429358

Change 429362 had a related patch set uploaded (by Reedy; owner: Brion VIBBER):
[mediawiki/extensions/PdfHandler@REL1_30] Fix for pdfinfo changes in poppler-utils 0.48

https://gerrit.wikimedia.org/r/429362

Change 429363 had a related patch set uploaded (by Reedy; owner: Brion VIBBER):
[mediawiki/extensions/PdfHandler@REL1_29] Fix for pdfinfo changes in poppler-utils 0.48

https://gerrit.wikimedia.org/r/429363

Change 429364 had a related patch set uploaded (by Reedy; owner: Brion VIBBER):
[mediawiki/extensions/PdfHandler@REL1_27] Fix for pdfinfo changes in poppler-utils 0.48

https://gerrit.wikimedia.org/r/429364

Change 429356 merged by jenkins-bot:
[mediawiki/extensions/PdfHandler@master] Fix for pdfinfo changes in poppler-utils 0.48

https://gerrit.wikimedia.org/r/429356

Change 429359 merged by jenkins-bot:
[mediawiki/extensions/PdfHandler@REL1_31] Fix for pdfinfo changes in poppler-utils 0.48

https://gerrit.wikimedia.org/r/429359

Fix went live a couple of hours ago; all files uploaded since then seem to be detecting metadata ok. May need to run a script to fix up the broken ones per-wiki. I'll look into that tomorrow.

In T193200#4162967, @brion wrote:

Looks like more Debian Stretch api & app servers got into the mix in the last few days, which probably explains it.

Some portion of upload requests will go through the Stretch machines, which'll fail to get metadata from pdfinfo, and then subsequent thumbnail operations on those files fail because they lack metadata.

Ack, that sounds like what happened here. Does that extension have CI tests? If so, it would be nice to add a test that only checks the pdfinfo invocation against the expected output format. We usually start rolling out CI tests for new distro releases before we upgrade our servers, so such a test would have caught this.
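A hedged sketch of the kind of format check such a test could make, written in Python for illustration (the extension's real tests would be PHP). It only inspects pdfinfo's text output: there must be a "Pages:" line, and the number of "Page N size" lines must match it. The function name `pdfinfo_format_problems` is hypothetical.

```python
import re
from typing import List


def pdfinfo_format_problems(output: str) -> List[str]:
    """Return a list of format problems; an empty list means it looks fine."""
    problems: List[str] = []
    pages = re.search(r"^Pages:\s+(\d+)\s*$", output, re.MULTILINE)
    if pages is None:
        # Covers the failure in this task, where pdfinfo printed nothing.
        problems.append("missing 'Pages:' line")
        return problems
    expected = int(pages.group(1))
    size_lines = re.findall(
        r"^Page\s+\d+\s+size:\s+[\d.]+\s+x\s+[\d.]+\s+pts", output, re.MULTILINE
    )
    if len(size_lines) != expected:
        problems.append(
            f"expected {expected} page size lines, found {len(size_lines)}"
        )
    return problems
```

A CI job could run the extension's pdfinfo command line against a small fixture PDF on each target distro and fail if this returns anything.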

Change 429877 had a related patch set uploaded (by Brion VIBBER; owner: Brion VIBBER):
[mediawiki/extensions/PdfHandler@REL1_28] Fix for pdfinfo changes in poppler-utils 0.48

https://gerrit.wikimedia.org/r/429877

Change 429364 merged by jenkins-bot:
[mediawiki/extensions/PdfHandler@REL1_27] Fix for pdfinfo changes in poppler-utils 0.48

https://gerrit.wikimedia.org/r/429364

Change 429363 merged by jenkins-bot:
[mediawiki/extensions/PdfHandler@REL1_29] Fix for pdfinfo changes in poppler-utils 0.48

https://gerrit.wikimedia.org/r/429363

Change 429362 merged by jenkins-bot:
[mediawiki/extensions/PdfHandler@REL1_30] Fix for pdfinfo changes in poppler-utils 0.48

https://gerrit.wikimedia.org/r/429362

Change 429877 merged by Brion VIBBER:
[mediawiki/extensions/PdfHandler@REL1_28] Fix for pdfinfo changes in poppler-utils 0.48

https://gerrit.wikimedia.org/r/429877

Vvjjkkii renamed this task from PDF file has "0 x 0 pixels" and no thumbnail when shown on mediawiki.org but works fine on Commons to n4daaaaaaa. Jul 1 2018, 1:13 AM
Vvjjkkii removed brooke as the assignee of this task.
Vvjjkkii triaged this task as High priority.
Vvjjkkii updated the task description.
Vvjjkkii removed subscribers: gerritbot, Aklapper.
Wong128hk renamed this task from n4daaaaaaa to PDF file has "0 x 0 pixels" and no thumbnail when shown on mediawiki.org but works fine on Commons. Jul 1 2018, 6:15 AM
Wong128hk assigned this task to brooke.
Wong128hk raised the priority of this task from High to Needs Triage.
Wong128hk updated the task description.
Wong128hk added subscribers: gerritbot, Aklapper.

https://www.mediawiki.org/wiki/File:Mobile_Page_Issues_Research.pdf shows a thumbnail and correct size data nowadays.

Can this task be closed as resolved?

Aklapper removed a project: Patch-For-Review.

https://www.mediawiki.org/wiki/File:Mobile_Page_Issues_Research.pdf shows a thumbnail and correct size data nowadays.

Can this task be closed as resolved?

No reply, hence assuming yes. Feel free to reopen if there is work left to do or if this still happens somewhere.

Yann subscribed.

Hi, It just happens now: https://commons.wikimedia.org/wiki/File:The_Collected_Works_of_Mahatma_Gandhi,_vol._2.pdf
I get an error message saying "The wiki is in read-only mode."

TheDJ closed this task as Resolved. Edited Feb 4 2022, 11:35 PM
TheDJ subscribed.

Please do not reopen tickets which have been closed for a long time; file new tickets instead. Symptoms, while similar, can have very different causes and should thus be handled on a case-by-case basis.

Removing task assignee due to inactivity as this open task has been assigned for more than two years. See the email sent to the task assignee on August 22nd, 2022.
Please assign this task to yourself again if you still realistically [plan to] work on this task - it would be welcome!
If this task has been resolved in the meantime, or should not be worked on ("declined"), please update its task status via "Add Action… 🡒 Change Status".
Also see https://www.mediawiki.org/wiki/Bug_management/Assignee_cleanup for tips on how to best manage your individual work in Phabricator. Thanks!

TheDJ claimed this task.

This original issue was fixed. Separate problems sharing the same symptom should be filed as separate tickets.