Apparently, the Internet Archive is no longer generating DjVu files, so the folks on Wikisource no longer have a web interface for generating them. We should investigate what would be involved in setting up a simple interface on Tool Labs for generating DjVu files from PDFs or a set of JPGs.
Change 330481 had a related patch set uploaded (by Pppery):
Canonicalize title before creating new newsletter
The IA Upload tool can be modified to do this, by incorporating the process that's defined in @Alex_brollo's jp2tojdvu.py.
See https://github.com/Tpt/ia-upload/issues/14
It uses a few external commands from DjVuLibre, which is already available on Tool Labs.
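As a rough illustration of what calling DjVuLibre from a tool could look like, here is a minimal Python sketch that only builds the command lines for the "set of JPGs" case. It assumes `c44` (encodes a JPEG or PNM image as a single-page DjVu) and `djvm -c` (bundles pages into one document) are the commands of interest; the thread does not say which commands the script actually runs, and the function name and page-file naming below are hypothetical.

```python
from pathlib import Path


def build_djvu_commands(jpeg_paths, output_djvu, work_dir):
    """Return the argv lists that would encode each JPEG and bundle the pages.

    Hypothetical helper: nothing is executed here, the caller decides how
    to run the commands (e.g. with subprocess.run(cmd, check=True)).
    """
    work = Path(work_dir)
    commands = []
    pages = []
    for index, jpeg in enumerate(jpeg_paths, start=1):
        page = work / f"page{index:04d}.djvu"
        # c44 encodes one JPEG (or PNM) image as a single-page DjVu file.
        commands.append(["c44", str(jpeg), str(page)])
        pages.append(str(page))
    # djvm -c creates one bundled multi-page document from the page files.
    commands.append(["djvm", "-c", str(output_djvu), *pages])
    return commands
```

Keeping command construction separate from execution makes it easy to log or inspect what would be run before anything touches the file system.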
This feature is ready for testing at https://tools.wmflabs.org/ia-upload/test/
The patch is at https://github.com/wikisource/ia-upload/pull/18 (not quite ready for review yet).
An issue was identified where the JP2 files had filenames other than what was expected (e.g. containing characters matching [^a-zA-Z0-9]); this is now fixed. It seems that the page names in DjVu XML are somewhat constrained.
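One way to sidestep this kind of filename problem is to derive page names that contain only ASCII letters and digits. The sketch below is an assumption about how that could be done, not the fix that went into the patch; `safe_page_name` is a hypothetical helper, and the exact constraints on DjVu XML page names are not spelled out in this thread.

```python
import re


def safe_page_name(filename, index):
    """Build a page name containing only ASCII letters and digits.

    Hypothetical helper: the zero-padded index is appended so that two
    filenames that collapse to the same cleaned stem stay distinct.
    """
    # Drop the extension, if any.
    stem = filename.rsplit(".", 1)[0]
    # Remove every character outside a-z, A-Z and 0-9.
    cleaned = re.sub(r"[^a-zA-Z0-9]", "", stem)
    # Fall back to a generic prefix when nothing survives the cleanup.
    return f"{cleaned or 'page'}{index:04d}"
```

For example, `safe_page_name("book (scan)_0001.jp2", 1)` gives `bookscan00010001`.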
I think the patch is ready to merge, but I'll wait for some more testing first: https://tools.wmflabs.org/ia-upload/test/
@Samwilson: What did you find out about storing user OAuth credentials? Is this kosher or not? @bd808 might be a good person to ask about it.
I asked on labs-l; they suggested using a database instead, but said that a 0600 JSON file is still okay. The file is created like this:
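// Create the file with a restrictive umask so it is never world-readable.
$oldUmask = umask( 0177 );
touch( $jobFile );
umask( $oldUmask );
// Enforce 0600 explicitly in case the file already existed.
chmod( $jobFile, 0600 );
file_put_contents( $jobFile, \GuzzleHttp\json_encode( $jobInfo ) );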
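For comparison, a similar effect can be had without touching the process umask by passing the mode when the file is opened. This is a minimal Python sketch of that idea, not what the tool does; `write_private_json` is a hypothetical name, and it assumes a POSIX file system.

```python
import json
import os


def write_private_json(path, data):
    """Write data as JSON to a file readable and writable by the owner only."""
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC
    # The mode applies only when the file is newly created; the umask can
    # clear bits from it but never add any, so it cannot end up wider than 0600.
    fd = os.open(path, flags, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(data, handle)
    # Enforce 0600 explicitly in case the file already existed with other bits.
    os.chmod(path, 0o600)
```

Setting the mode at open time means there is no window in which the file exists with wider permissions, and the umask of the rest of the process is left alone.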
Fair enough :)
So I guess the answer to the investigation question is... Yes!
I'll create a follow-up task for actually finishing the merge/roll-out/documentation.