
Allow upload of PDF files from an IA identifier with the IA-Upload tool.
Closed, Duplicate · Public · Feature

Description

What's the problem you want to solve?

Currently the tool provided by https://ia-upload.toolforge.org/ uses an Internet Archive identifier to grab or reconstruct a DjVu file from the raw scans provided.

In some cases, however, IA already has a PDF version of the scanned work for a given identifier, and for consistency it is sometimes better to upload that instead of an overly compressed DjVu, to avoid quality issues for projects like Wikisource. A direct upload of a PDF from IA is not easy, because Special:Upload at Commons does not reliably cope with files larger than 100 MB, which PDFs of entire books are likely to be. (In my personal experience, URL2Commons is also less than perfectly reliable with large files.)

What solutions have you tried?
"Chunked" upload, which requires that an uploader like me make a local copy of the file and use twice the bandwidth of a direct IA -> Wikimedia Commons transfer (once for the download from IA to local storage, and once for the upload to Commons via chunked upload). While some users are on fast connections, others may not be, so eliminating this unnecessary bandwidth and local storage use would be desirable.
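
To illustrate the alternative to a full local copy, here is a minimal sketch of a relay that forwards a stream one chunk at a time, so only a single chunk is ever held in memory. It is not how IA-Upload or Commons actually works: `relay`, `iter_chunks`, `send_chunk` and the chunk size are all hypothetical names and values chosen for illustration, and the source is any readable binary stream.

```python
import io
from typing import BinaryIO, Callable, Iterator, Tuple

# Hypothetical chunk size; not taken from IA-Upload or the Commons API.
CHUNK_SIZE = 5 * 1024 * 1024


def iter_chunks(source: BinaryIO, chunk_size: int = CHUNK_SIZE) -> Iterator[Tuple[int, bytes]]:
    """Yield (offset, data) pairs read sequentially from a binary stream."""
    offset = 0
    while True:
        data = source.read(chunk_size)
        if not data:  # empty read means end of stream
            break
        yield offset, data
        offset += len(data)  # a read may return fewer bytes than requested


def relay(source: BinaryIO,
          send_chunk: Callable[[int, bytes], None],
          chunk_size: int = CHUNK_SIZE) -> int:
    """Forward every chunk to send_chunk (a stand-in for an upload call).

    Returns the total number of bytes forwarded.
    """
    total = 0
    for offset, data in iter_chunks(source, chunk_size):
        send_chunk(offset, data)
        total += len(data)
    return total


if __name__ == "__main__":
    received = []
    sent = relay(io.BytesIO(b"abcdefghij"),
                 lambda offset, data: received.append((offset, data)),
                 chunk_size=4)
    print(sent, received)
```

Run on a server between IA and Commons, a loop of this shape would move each byte over the uploader's own connection zero times, rather than twice.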

What is the functionality you would like?

An option in the IA-Upload tool to use the given identifier to upload the PDF version instead of a DjVu (existing, or reconstituted from individual JPEGs), and for that option to handle a large (>100 MB) file using an appropriate approach. (I am not sure whether it is currently possible to have such transfers chunked when requesting the file from IA in the first place. FTP-style transfers could at one time resume partial or "interrupted" transfers, but I am not sure what IA's API actually supports.)
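
The HTTP counterpart of a resumable FTP transfer is the `Range` request header. Assuming (unverified here) that the server holding the PDF honours range requests, a large file could be fetched in pieces as in the sketch below. `byte_ranges` and `range_header` are hypothetical helper names, and the code only computes the header values without contacting any server.

```python
from typing import Iterator, Tuple


def byte_ranges(total_size: int, chunk_size: int) -> Iterator[Tuple[int, int]]:
    """Yield inclusive (start, end) byte offsets that cover total_size bytes."""
    if total_size < 0 or chunk_size <= 0:
        raise ValueError("total_size must be >= 0 and chunk_size must be > 0")
    start = 0
    while start < total_size:
        # HTTP byte ranges are inclusive, so the last byte is start + size - 1.
        end = min(start + chunk_size, total_size) - 1
        yield start, end
        start = end + 1


def range_header(start: int, end: int) -> str:
    """Format an inclusive byte range as an HTTP Range header value."""
    return f"bytes={start}-{end}"


if __name__ == "__main__":
    # Example: a 250 MB file split into 100 MB pieces.
    mb = 1024 * 1024
    for start, end in byte_ranges(250 * mb, 100 * mb):
        print(range_header(start, end))
```

An interrupted transfer could then be resumed by re-requesting only the range that failed, instead of restarting the whole file.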

(Leaving the upstream tag, as this may involve functionality in the IA API or on that organisation's servers.)