
Upgrade poppler-utils to at least 0.20
Closed, ResolvedPublic

Description

Nothing (only the symbol) displayed.

https://commons.wikimedia.org/wiki/Commons:Village_pump#PDF_thumbnailing_problem_File:GA20891.pdf

"""ugh maybe different mediawiki servers have different versions of pdfinfo installed, or something like cgroups broken on some not all. Bawolff (talk) 14:40, 11 October 2013 (UTC)"""


Version: wmf-deployment
Severity: normal

Details

Reference
bz55624

Event Timeline

bzimport raised the priority of this task to Medium. Nov 22 2014, 2:28 AM
bzimport set Reference to bz55624.
bzimport added a subscriber: Unknown Object (MLST).

If the cgroup theory is right then it should affect other things (like rendering <score> tags).

If either of those two theories is right, one should be able to correlate the occurrence of PDFs with 0x0 dimensions with the "served by mwXXX" HTML comment in the page source directly after the purge.

Additionally, if either of those two theories is correct, this should happen intermittently to all PDFs after a purge. If it's just this PDF in particular, I have no idea what's going on.
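
The correlation described above could be sketched roughly as follows. This is only an illustration: the exact wording of the "served by" comment is assumed here, and `served_by` and `tally_failures` are hypothetical helper names, not part of MediaWiki. The input is a list of page sources saved right after a purge, each paired with the dimensions the file page reported.

```python
import re
from collections import Counter

# Assumed shape of the HTML comment; the real wording may differ.
SERVED_BY = re.compile(r"<!--\s*Served by (mw\d+)", re.IGNORECASE)


def served_by(page_source):
    """Return the app server named in the page source, or None if absent."""
    match = SERVED_BY.search(page_source)
    return match.group(1).lower() if match else None


def tally_failures(samples):
    """Count 0x0 results per server.

    samples: iterable of (page_source, (width, height)) pairs.
    Returns {server: (failures, total)}.
    """
    failures = Counter()
    totals = Counter()
    for source, dims in samples:
        host = served_by(source) or "unknown"
        totals[host] += 1
        if dims == (0, 0):
            failures[host] += 1
    return {host: (failures[host], totals[host]) for host in totals}
```

If the failures cluster on a few hosts, that would point at a per-server difference; if they are spread evenly, the server theories look less likely.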

(Sorry I don't really have time to investigate right now)

ah, affects only this file. strange...

Right now, https://commons.wikimedia.org/wiki/File:GA20891.pdf displays a thumbnail for me which is too narrow (for the 2013/09/12 version).
Not entirely sure how this looked before.

Which probably does not make this ops territory (anymore?) but PDF/thumbnail extractor stuff?

(In reply to comment #3)

Originally I thought it affected more than just that file, which would suggest ops territory, as in something being wrong with running pdfinfo. That assumption was incorrect, so now it's more that the cause still needs to be diagnosed.

Note that it's intermittently saying the thumbnail is 0x0. The intermittent nature suggests ops territory, but it's hard to say for certain.

Ok, going back to the original version of the file, it looks like rotation wasn't detected properly. That means either there is a bug in PdfHandler, or Wikimedia has the xpdf version of the pdfinfo command-line tool installed instead of the one from the poppler-utils package.
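
To illustrate why an undetected rotation yields a thumbnail with the wrong proportions, here is a minimal sketch. It is not PdfHandler's actual code. It assumes pdfinfo output lines of the form `Page size: W x H pts` and `Page rot: N`, and `effective_page_size` is a hypothetical name. When no rotation line is present, the rotation is taken to be 0, which is where the wrong dimensions would come from.

```python
import re


def effective_page_size(pdfinfo_output):
    """Return (width, height) in points after applying page rotation.

    Returns None when no page size line is found. The line formats
    matched here are assumptions about pdfinfo's text output.
    """
    size = re.search(
        r"^Page(?:\s+\d+)? size:\s+([\d.]+) x ([\d.]+) pts",
        pdfinfo_output,
        re.MULTILINE,
    )
    if size is None:
        return None
    width, height = float(size.group(1)), float(size.group(2))

    rot = re.search(r"^Page(?:\s+\d+)? rot:\s+(\d+)", pdfinfo_output, re.MULTILINE)
    rotation = int(rot.group(1)) if rot else 0  # no rotation line: assume 0

    # A quarter turn (90 or 270 degrees) swaps width and height.
    if rotation % 180 == 90:
        width, height = height, width
    return width, height
```

With identical size lines, output that lacks the rotation line gives the unrotated dimensions, while output that includes a rotation of 90 gives them swapped.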

(In reply to comment #5)

In addition, one would need at least version 0.20 of poppler-utils.

[14:29] <Reedy> poppler-utils
[14:29] <Reedy> pdfinfo version 0.18.4
[14:29] <Reedy> poppler-utils 0.18.4-1ubuntu3.1

So our version of pdfinfo is not high enough to recognize rotated pages.
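
The comparison above can be written down as a small check. This is a sketch under stated assumptions: the banner format is the `pdfinfo version 0.18.4` line quoted above, the 0.20 threshold is the one named in this thread, and `parse_pdfinfo_version` and `supports_rotation` are hypothetical names. It also assumes the binary comes from poppler-utils; an xpdf build uses a different version series and would need to be ruled out separately.

```python
import re

# Minimum poppler-utils version named in this thread (an assumption to adjust).
MIN_VERSION = (0, 20)


def parse_pdfinfo_version(banner):
    """Extract a version tuple from a 'pdfinfo version X.Y.Z' line."""
    match = re.search(r"pdfinfo version (\d+(?:\.\d+)*)", banner)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


def supports_rotation(banner, minimum=MIN_VERSION):
    """True if the reported version is at least the given minimum."""
    version = parse_pdfinfo_version(banner)
    return version is not None and version >= minimum
```

Comparing integer tuples rather than strings avoids ordering mistakes such as treating "0.9" as newer than "0.18".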

Also filed RT 6016 for upgrading poppler-utils.


I should note that this bug is a little confused, as there were two issues: the rotation issue, and the fact that the version of the file from 2013-09-12T07:12:35 was intermittently reporting 0x0 dimensions (which probably means pdfinfo could not be run at all on some servers). The pdfinfo upgrade is for the rotation issue. I have no idea what's up with the intermittent 0x0 thing, since the file allegedly had the same sha1 as the working versions.

Faidon's comment on RT:

"So, I gave this a try.

Backporting poppler 0.24 from saucy/trusty is almost impossible, due to a variety of complex build dependencies that would also need to be backported (at least Qt4 & Qt5 -- not fun at all).

Backporting poppler 0.20 from Quantal seems a lot easier; however, Quantal is only going to get security support until April 2014, i.e. trusty's release date, and it's unlikely we'll be able to move application servers from precise to trusty that soon, at exactly the release date.

poppler is a software package that often gets CVEs for vulnerabilities that are relatively easy to exploit (someone uploading a malicious PDF) and would be high impact (appservers). I feel very reluctant to maintain it on our own in general, even more so an older version that no one supports or some backport of 0.20/0.24. It's not impossible, but it's certainly unpleasant.

Have you identified the patch that fixes the issue at hand? Maybe we could backport this specifically to precise's 0.18 as a stopgap until we move to trusty, sometime next year?"

[copied from RT]

As far as I can tell, the commit in question is: a0db250bbde ( http://cgit.freedesktop.org/poppler/poppler/commit/utils/pdfinfo.cc?id=a0db250bbdefff6361551cf9db344bd5268fea11 ).

The bug itself in the poppler bug tracker says 0.20; however, while looking for the commit number, I noticed the NEWS file said 0.19.


[Not copied from RT]

At the end of the day, it is a very small number of files affected. If this turns out to be too complicated, it wouldn't be the worst thing in the world to wait a year. Although it would be nice to have it fixed.

fgiunchedi claimed this task.

See related: the migration to trusty is complete on the appservers.