Shell error from "pdftotext" and "pdfinfo" commands (e.g. "Syntax Error: Invalid XRef entry")
Closed, Declined | Public | Production Error

Description

There has always been a trickle of these errors (dozens per day), but recently we have seen some big spikes (thousands): "Syntax Error: Top-level pages object is wrong type (null)" and "Syntax Error: Invalid XRef entry".

Those are xpdf error messages, so probably PdfHandler related. The errors are HHVM fatals (so no URL/stack trace).

Event Timeline

The errors do not correlate with the train (or even with each other). One spike that I picked at random lasted about three hours.

The errors tend to come in blocks like this:

Syntax Error: Invalid XRef entry
Syntax Error: Top-level pages object is wrong type (null)
Syntax Error: Top-level pages object is wrong type (null)
Command Line Error: Wrong page range given: the first page (1) can not be after the last page (0).
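
As a rough illustration of how such a block could be summarized when looking at a spike, here is a minimal Python sketch that tallies the distinct messages in the block above. The function name `tally_messages` and the `SAMPLE` constant are made up for this example; nothing here is part of MediaWiki or PdfHandler.

```python
from collections import Counter

# The error block quoted above, copied verbatim.
SAMPLE = """\
Syntax Error: Invalid XRef entry
Syntax Error: Top-level pages object is wrong type (null)
Syntax Error: Top-level pages object is wrong type (null)
Command Line Error: Wrong page range given: the first page (1) can not be after the last page (0).
"""


def tally_messages(stderr_text):
    """Count how often each distinct non-empty line occurs (hypothetical helper)."""
    lines = (line.strip() for line in stderr_text.splitlines())
    return Counter(line for line in lines if line)


counts = tally_messages(SAMPLE)
for message, n in counts.most_common():
    print(n, message)
```

Grouping by message text like this makes it easier to see whether a spike is dominated by one message or by the whole block repeating.
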
ArielGlenn triaged this task as Medium priority. May 2 2017, 9:39 AM
ArielGlenn subscribed.

There were similar spikes for both of these errors on April 5 and not much of anything since then.

Per T157646#3695779, these errors (if they still exist) have been removed from the hhvm channel. Once https://gerrit.wikimedia.org/r/#/c/385946/ is merged they will show up in the exec channel, with URL and everything.

Still seen. 11,000 hits in Logstash in the past 30 days in the exec channel that mention "XRef".

Here's a recent sample with the new logging that @Tgr mentioned above.

reqId: W6FUeQrAEHAAAIfZswUAAAAG
wiki: commons.wikimedia.org
url: /w/api.php
http_method: POST
referer: https://commons.wikimedia.org/wiki/Special:UploadWizard
-------

Error running /bin/bash '/srv/mediawiki/php-1.32.0-wmf.20/includes/shell/limit.sh' ''\''/usr/bin/pdfinfo'\'' '\''-enc'\'' '\''UTF-8'\'' '\''-l'\'' '\''9999999'\'' '\''/tmp/r3zgWT'\''' 'MW_INCLUDE_STDERR=;MW_CPU_LIMIT=50; MW_CGROUP='\''/sys/fs/cgroup/memory/mediawiki/job'\''; MW_MEM_LIMIT=1048576; MW_FILE_SIZE_LIMIT=524288; MW_WALL_CLOCK_LIMIT=180; MW_USE_LOG_PIPE=yes':

Syntax Error: Couldn't find trailer dictionary
Syntax Error: Couldn't find trailer dictionary
Syntax Error: Invalid XRef entry
Syntax Error: Top-level pages object is wrong type (null)
Syntax Error: Top-level pages object is wrong type (null)
Command Line Error: Wrong page range given: the first page (1) can not be after the last page (0).
trace
#0 /srv/mediawiki/php-1.32.0-wmf.20/extensions/PdfHandler/includes/PdfImage.php(142): MediaWiki\Shell\Command->execute()
#1 /srv/mediawiki/php-1.32.0-wmf.20/extensions/PdfHandler/includes/PdfImage.php(57): PdfImage->retrieveMetaData()
#2 /srv/mediawiki/php-1.32.0-wmf.20/extensions/PdfHandler/includes/PdfHandler.php(298): PdfImage->getImageSize()
#3 /srv/mediawiki/php-1.32.0-wmf.20/includes/utils/MWFileProps.php(86): PdfHandler->getImageSize(FSFile, string, string)
#4 /srv/mediawiki/php-1.32.0-wmf.20/includes/upload/UploadStash.php(221): MWFileProps->getPropsFromPath(string, string)
#5 /srv/mediawiki/php-1.32.0-wmf.20/includes/upload/UploadBase.php(1124): UploadStash->stashFile(string, string)
#6 /srv/mediawiki/php-1.32.0-wmf.20/includes/upload/UploadFromChunks.php(124): UploadBase->doStashFile(User)
#7 /srv/mediawiki/php-1.32.0-wmf.20/includes/upload/UploadBase.php(1070): UploadFromChunks->doStashFile(User)
#8 /srv/mediawiki/php-1.32.0-wmf.20/includes/upload/UploadFromChunks.php(76): UploadBase->tryStashFile(User, boolean)
#9 /srv/mediawiki/php-1.32.0-wmf.20/includes/api/ApiUpload.php(315): UploadFromChunks->tryStashFile(User, boolean)
#10 /srv/mediawiki/php-1.32.0-wmf.20/includes/api/ApiUpload.php(212): ApiUpload->performStash(string)
#11 /srv/mediawiki/php-1.32.0-wmf.20/includes/api/ApiUpload.php(132): ApiUpload->getChunkResult(array)
#12 /srv/mediawiki/php-1.32.0-wmf.20/includes/api/ApiUpload.php(104): ApiUpload->getContextResult()
#13 /srv/mediawiki/php-1.32.0-wmf.20/includes/api/ApiMain.php(1587): ApiUpload->execute()
#14 /srv/mediawiki/php-1.32.0-wmf.20/includes/api/ApiMain.php(531): ApiMain->executeAction()
#15 /srv/mediawiki/php-1.32.0-wmf.20/includes/api/ApiMain.php(502): ApiMain->executeActionWithErrorHandling()
#16 /srv/mediawiki/php-1.32.0-wmf.20/api.php(87): ApiMain->execute()
Krinkle renamed this task from Spikes in "Syntax Error: Top-level pages object is wrong type (null)" and "Syntax Error: Invalid XRef entry" errors in production to Shell error from "pdftotext" and "pdfinfo" commands (e.g. "Syntax Error: Invalid XRef entry"). Sep 19 2018, 3:50 PM
mmodell changed the subtype of this task from "Task" to "Production Error". Aug 28 2019, 11:10 PM

It looks like this is expected when attempting to get info from a malformed PDF? I see that MediaWiki\Shell\Command::execute is logging stderr by default, but that's easy enough to switch off if this is just logspam.

Indeed. In general, callers should (and do) handle shell errors, so if this didn't lead to an exception, it was handled correctly, and this is just a diagnostic message for investigating other errors that occur before or after it.
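
To make the "callers handle shell errors" point concrete, here is a minimal Python sketch of the general pattern. It is not the PdfHandler code (which is PHP); `run_tool` and `page_count` are hypothetical names, and the 180-second timeout simply mirrors the MW_WALL_CLOCK_LIMIT value in the log sample above. A stand-in command replaces pdfinfo so the sketch runs without poppler/xpdf installed.

```python
import subprocess
import sys


def run_tool(argv, timeout=180):
    """Run a command and return (ok, stdout, stderr) instead of raising.

    Hypothetical helper: stderr is handed back to the caller as a
    diagnostic rather than being treated as a fatal error.
    """
    try:
        proc = subprocess.run(argv, capture_output=True, text=True, timeout=timeout)
    except (OSError, subprocess.TimeoutExpired) as exc:
        return False, "", str(exc)
    return proc.returncode == 0, proc.stdout, proc.stderr


def page_count(pdfinfo_output):
    """Return the integer after "Pages:" in pdfinfo-style output, or None."""
    for line in pdfinfo_output.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "Pages" and value.strip().isdigit():
            return int(value.strip())
    return None


# Stand-in for a pdfinfo run on a malformed file: writes to stderr, exits 1.
fake_failure = [
    sys.executable,
    "-c",
    "import sys; sys.stderr.write('Syntax Error: Invalid XRef entry\\n'); sys.exit(1)",
]

ok, out, err = run_tool(fake_failure)
if not ok:
    # The caller degrades gracefully: no metadata, stderr kept for diagnostics.
    print("metadata unavailable:", err.strip())
else:
    print("pages:", page_count(out))
```

With a real installation, the argument list would be something like `["pdfinfo", "-enc", "UTF-8", "-l", "9999999", path]`, using the flags that appear in the logged command. The design choice is the one described above: a non-zero exit is handled by the caller, and the stderr text is only useful as context when investigating some other failure.
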

This channel has not been included in production error monitoring since last year. Given that there have been no other bug reports stemming from this, I am closing the task.