Just as many other wikisource users I appreciate a lot Internet Archive digitalization service, and I use it as deeply as I can (djvu files being only one from many uses of the rich file set that can be downloaded: collection of high-resolution jp2 images and abbyy xml being really extremely interesting).
I'd like that mediawiki should implement a similar digitalizing environment, but with a wiki approach and a wikisource-oriented philosophy, to share the best possible applications to pre-OCR jobs of book page images (splitting, rotating, cropping, dewrapping... in brief, "scantailoring" images), saving excellent lossless images from pre-OCR work; then the best possible OCR should be done, with ABBYY OCR engine or similar software if any, saving both text and full-detail OCR xml; then excellent images and best possible OCR text should be used to produce excellent seachable pdf and djvu files; finally - and this step would be really "wiki" - embedded text should be fixed by usual user revision work done into wikisource.
This is a bold dream; a less bold idea is, to get full access to best, heavy IA files (jp2.zip and abbyy xml) and to build tools for use them as thoroughly as possible.
--Alex brollo (talk) 07:08, 11 November 2015 (UTC)
This card tracks a proposal from the 2015 Community Wishlist Survey: https://meta.wikimedia.org/wiki/2015_Community_Wishlist_Survey
This proposal received 19 support votes, and was ranked #44 out of 107 proposals. https://meta.wikimedia.org/wiki/2015_Community_Wishlist_Survey/Wikisource#To_implement_a_Internet_Archive-like_digitalization_service