
'Unknown error: "unknown"' when uploading a specific PDF file
Closed, Resolved · Public

Description

When uploading this PDF file (https://archive.org/details/TallapakaPadaSahityamFinalWithTitlePage), UploadWizard spends a long time on the "Assembling" step and eventually fails with 'Unknown error: "unknown"'.

The problematic API response is very unhelpful:

{"warnings":{"result":{"*":"This result was truncated because it would otherwise  be larger than the limit of 12,582,912 bytes"}}}
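A client could at least recognize this shape of response and report it as a truncation rather than an unknown error. The following is a minimal Python sketch, not UploadWizard's actual code. It assumes that a normal `action=upload` response carries a top-level `upload` key, and it treats a response with a `warnings.result` notice but no `upload` key as truncated. The helper name `describe_upload_response` is hypothetical.

```python
import json


def describe_upload_response(raw: str) -> str:
    """Summarize an upload API response (hypothetical helper for illustration)."""
    data = json.loads(raw)
    if "upload" in data:
        # Assumed normal shape: {"upload": {"result": "Success", ...}}
        return "upload result: " + str(data["upload"].get("result", "unknown"))
    # The truncation notice from this report lives under warnings.result["*"].
    notice = data.get("warnings", {}).get("result", {}).get("*")
    if notice:
        return "response truncated by the API: " + notice
    return "unrecognized response"


# The response quoted in this report, verbatim.
truncated = (
    '{"warnings":{"result":{"*":"This result was truncated because it would '
    'otherwise  be larger than the limit of 12,582,912 bytes"}}}'
)
print(describe_upload_response(truncated))
```

With a check like this, the user would see the API's own truncation notice instead of 'Unknown error: "unknown"'.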

Reported to me here: https://commons.wikimedia.org/wiki/User_talk:Matma_Rex#Again_facing_badparams_problem

Event Timeline

Restricted Application added projects: Multimedia, Internet-Archive. Aug 26 2016, 12:36 PM
Restricted Application added a subscriber: Aklapper.

Sounds related: T86611: API does not fail gracefully when data is too large.

This is probably caused by PdfHandler's extraction of all the text into file metadata (and ApiUpload's desire to include the metadata in the result). This 20 MB PDF is pretty much just text (14,318 pages of it), so that's a lot of metadata. Presumably more than the arbitrary 12 MB limit for API results.
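As a rough sanity check on that guess, the numbers from this report can be put side by side. The Python sketch below uses only the limit from the API warning and the page count above. The per-page figure is a back-of-the-envelope threshold, not a measurement of this PDF's extracted text.

```python
LIMIT_BYTES = 12_582_912  # limit quoted in the API warning
PAGES = 14_318            # page count of the PDF in this report

# The limit is exactly 12 MiB.
assert LIMIT_BYTES == 12 * 1024 * 1024

# Average extracted text per page at which the text alone would reach the limit.
# JSON escaping and the rest of the response are ignored; they only add to the size.
bytes_per_page = LIMIT_BYTES / PAGES
print(f"{bytes_per_page:.0f} bytes per page")
```

That works out to a little under 900 bytes per page, which a page of running text can easily exceed.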

Restricted Application added a subscriber: Matanya. Aug 26 2016, 1:16 PM
matmarex claimed this task. Aug 26 2016, 1:23 PM
matmarex triaged this task as Normal priority.
Anomie moved this task from Unsorted to Needs Code on the MediaWiki-API board. Aug 26 2016, 2:12 PM
Anomie added a subscriber: Anomie.

On the API side, the solution is probably T89971: ApiQueryImageInfo is crufty, needs rewrite. A mitigation might be to have UploadBase::getImageInfo() not include the metadata from ApiQueryImageInfo.
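One way to picture that kind of mitigation is a size guard that drops only the metadata when the result would not fit. The Python sketch below is an illustration under stated assumptions, not the MediaWiki code or the patch that was later merged: `fit_imageinfo` is a hypothetical name, the image info is modeled as a plain dict with a `metadata` key, and size is measured as the length of the JSON encoding. It is also conditional, whereas the suggestion above is to leave the metadata out altogether.

```python
import json

RESULT_LIMIT_BYTES = 12_582_912  # limit quoted in the API warning


def encoded_size(obj) -> int:
    """Size in bytes of the JSON encoding of obj."""
    return len(json.dumps(obj).encode("utf-8"))


def fit_imageinfo(imageinfo: dict, limit: int = RESULT_LIMIT_BYTES) -> dict:
    """Return imageinfo unchanged if it fits, else a copy without 'metadata'.

    Hypothetical helper: the 'metadata' key and the dict shape are assumptions.
    """
    if encoded_size(imageinfo) <= limit:
        return imageinfo
    # Keep every other field so the caller still gets a usable result.
    return {key: value for key, value in imageinfo.items() if key != "metadata"}


# Usage with a deliberately tiny limit so the effect is visible.
info = {"size": 20, "mime": "application/pdf", "metadata": "x" * 1000}
print(sorted(fit_imageinfo(info, limit=100)))
```

The point of the design is that the caller always receives a well-formed result, and only the oversized field is sacrificed.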

Change 306927 had a related patch set uploaded (by Bartosz Dziewoński):
ApiUpload: Better handle unreasonably large metadata in 'imageinfo'

https://gerrit.wikimedia.org/r/306927

Change 306927 merged by jenkins-bot:
ApiUpload: Better handle unreasonably large metadata in 'imageinfo'

https://gerrit.wikimedia.org/r/306927

Change 306951 had a related patch set uploaded (by Bartosz Dziewoński):
ApiUpload: Better handle unreasonably large metadata in 'imageinfo'

https://gerrit.wikimedia.org/r/306951

Change 306951 merged by jenkins-bot:
ApiUpload: Better handle unreasonably large metadata in 'imageinfo'

https://gerrit.wikimedia.org/r/306951

Mentioned in SAL [2016-08-29T13:59:32Z] <hashar@tin> Synchronized php-1.28.0-wmf.16/includes/api/ApiUpload.php: ApiUpload: Better handle unreasonably large metadata in 'imageinfo' T143993 (duration: 00m 46s)

matmarex closed this task as Resolved. Aug 29 2016, 2:02 PM
matmarex removed a project: Patch-For-Review.

Fixed and deployed.