'Unknown error: "unknown"' when uploading a specific PDF file
Closed, Resolved, Public

Description

When uploading this PDF file: https://archive.org/details/TallapakaPadaSahityamFinalWithTitlePage, UploadWizard spends a long time on the "Assembling" step and eventually fails with 'Unknown error: "unknown"'.

The problematic API response is very unhelpful:

{"warnings":{"result":{"*":"This result was truncated because it would otherwise  be larger than the limit of 12,582,912 bytes"}}}
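For illustration, a client could recognise this kind of response by checking for a truncation warning that arrives without any upload data. This is only a minimal sketch, not UploadWizard's actual logic: the function name `is_truncated_result` is hypothetical, and it assumes a successful response would carry a top-level `upload` key.

```python
import json


def is_truncated_result(response_text: str) -> bool:
    """Return True if the response holds only a truncation warning.

    Assumption: a usable upload response has a top-level "upload" key.
    """
    data = json.loads(response_text)
    # Walk warnings -> result -> "*" defensively; any level may be absent.
    warning = data.get("warnings", {}).get("result", {}).get("*", "")
    return "upload" not in data and "truncated" in warning


raw = (
    '{"warnings":{"result":{"*":"This result was truncated because it '
    'would otherwise  be larger than the limit of 12,582,912 bytes"}}}'
)
print(is_truncated_result(raw))
```

A check like this would let the caller show a more specific message than "unknown".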

Reported to me here: https://commons.wikimedia.org/wiki/User_talk:Matma_Rex#Again_facing_badparams_problem

Event Timeline

Restricted Application added a subscriber: Aklapper.

Sounds related: T86611: API does not fail gracefully when data is too large.

This is probably caused by PdfHandler's extraction of all the text into file metadata (and ApiUpload's desire to include the metadata in the result). This 20 MB PDF is pretty much just text (14,318 pages of it), so that's a lot of metadata. Presumably more than the arbitrary 12 MB limit for API results.
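A quick back-of-the-envelope check, using only the figures quoted in this task (the 12,582,912-byte limit and the 14,318 pages), shows how little text per page is needed to exceed the limit. The constant names are illustrative, and the calculation ignores any serialization overhead.

```python
LIMIT_BYTES = 12 * 1024 * 1024  # 12 MiB, the limit named in the API warning
PAGES = 14_318                  # page count of the problematic PDF

# The quoted limit is exactly 12 MiB.
assert LIMIT_BYTES == 12_582_912

# Average extracted text per page at which the metadata alone hits the limit.
bytes_per_page = LIMIT_BYTES / PAGES
print(f"about {bytes_per_page:.0f} bytes of text per page reaches the limit")
```

That is well under a kilobyte per page, so a text-heavy PDF of this length plausibly exceeds it.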

matmarex triaged this task as Medium priority.
Anomie subscribed.

On the API side, the solution is probably T89971: ApiQueryImageInfo is crufty, needs rewrite. A mitigation might be to have UploadBase::getImageInfo() not include the metadata from ApiQueryImageInfo.
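The mitigation idea can be sketched roughly as follows. This is Python pseudocode for the logic only, not MediaWiki's PHP and not necessarily what any patch on this task does; `build_upload_result` and the result layout are hypothetical, and the size check on the JSON encoding is an assumption about how the limit might be measured.

```python
import json

LIMIT_BYTES = 12_582_912  # limit quoted in the API warning


def build_upload_result(imageinfo: dict, limit: int = LIMIT_BYTES) -> dict:
    """Return an upload result, dropping 'metadata' if it would not fit."""
    result = {"upload": {"result": "Success", "imageinfo": imageinfo}}
    if len(json.dumps(result).encode("utf-8")) <= limit:
        return result
    # Too large: keep everything except the bulky extracted-text metadata.
    trimmed = {k: v for k, v in imageinfo.items() if k != "metadata"}
    return {"upload": {"result": "Success", "imageinfo": trimmed}}


info = {"size": 1, "metadata": "x" * 10}
print("metadata" in build_upload_result(info, limit=1000)["upload"]["imageinfo"])
print("metadata" in build_upload_result(info, limit=50)["upload"]["imageinfo"])
```

The point of the design is that the caller still gets a successful result with the remaining image information, rather than a response that was truncated down to a bare warning.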

Change 306927 had a related patch set uploaded (by Bartosz Dziewoński):
ApiUpload: Better handle unreasonably large metadata in 'imageinfo'

https://gerrit.wikimedia.org/r/306927

Change 306927 merged by jenkins-bot:
ApiUpload: Better handle unreasonably large metadata in 'imageinfo'

https://gerrit.wikimedia.org/r/306927

Change 306951 had a related patch set uploaded (by Bartosz Dziewoński):
ApiUpload: Better handle unreasonably large metadata in 'imageinfo'

https://gerrit.wikimedia.org/r/306951

Change 306951 merged by jenkins-bot:
ApiUpload: Better handle unreasonably large metadata in 'imageinfo'

https://gerrit.wikimedia.org/r/306951

Mentioned in SAL [2016-08-29T13:59:32Z] <hashar@tin> Synchronized php-1.28.0-wmf.16/includes/api/ApiUpload.php: ApiUpload: Better handle unreasonably large metadata in 'imageinfo' T143993 (duration: 00m 46s)

matmarex removed a project: Patch-For-Review.

Fixed and deployed.