
[ia-upload] Don't allow conversion from JP2 if there isn't a jp2.zip
Closed, Resolved · Public · 3 Story Points

Description

For example, this item has DjVu and PDF files but no JP2: https://archive.org/download/youthfulchristi00unkngoog

This is recorded in the logs as LOG.CRITICAL: Zip file not found.

We should check for the existence of the zip file before allowing people to save the job to the queue. Or, better: we could look for other images such as _tif.zip and extract accordingly.
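A minimal sketch of the check proposed above, in Python for illustration only. It assumes the item's file list is already available as plain file names. The function names, the suffix tuple, and the preference order are all hypothetical and are not taken from the ia-upload code.

```python
from typing import Iterable, Optional

# Suffixes to try, in order of preference. "_jp2.zip" and "_tif.zip" are the
# names mentioned in this task; preferring JP2 over TIF is an assumption.
IMAGE_ARCHIVE_SUFFIXES = ("_jp2.zip", "_tif.zip")


def find_image_archive(file_names: Iterable[str]) -> Optional[str]:
    """Return the first file name ending in a known image-archive suffix, or None."""
    names = list(file_names)
    for suffix in IMAGE_ARCHIVE_SUFFIXES:
        for name in names:
            if name.lower().endswith(suffix):
                return name
    return None


def can_queue_conversion(file_names: Iterable[str]) -> bool:
    """Allow the job to be saved to the queue only if an image archive exists."""
    return find_image_archive(file_names) is not None
```

With a check like this, an item that only has DjVu and PDF files would be rejected before the job is saved, instead of failing later with "Zip file not found".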

This probably only affects older IA items, because I think the modern system is to create a standard JP2 zip for everything.

Event Timeline

Restricted Application added a project: Internet-Archive. Feb 17 2017, 6:04 AM
Restricted Application added a subscriber: Aklapper.
Samwilson moved this task from Backlog to IA Upload on the Wikisource board. Feb 17 2017, 6:04 AM
Samwilson renamed this task from [ia-upload] Don't allow conversion from JP2 if there isn't a jp2.zip #28 to [ia-upload] Don't allow conversion from JP2 if there isn't a jp2.zip. Feb 20 2017, 6:53 AM
Samwilson triaged this task as Normal priority.
Samwilson updated the task description.
Samwilson edited projects, added IA Upload; removed Wikisource. May 25 2017, 11:28 AM
kaldari lowered the priority of this task from Normal to Low. Jun 13 2017, 11:54 PM
kaldari set the point value for this task to 3.

The error message also appears when a _jp2.zip file exists but its prefix differs from the IA ID.

Can the log message be changed to state exactly which _jp2.zip was not found, i.e. to print the full name of the _jp2.zip file that was searched for?

Alex

Yes, definitely can throw a better error, good idea.

Do you have an example of one where it's failing but does have a _jp2.zip?
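For illustration, a more precise error could look something like the sketch below (Python, not the tool's own code). It assumes the expected archive name is built as the IA ID followed by "_jp2.zip", which is what the prefix mismatch described in this thread suggests. The exception class and function names are hypothetical.

```python
from typing import Iterable


class ZipNotFoundError(Exception):
    """Hypothetical exception raised when the expected archive is missing."""


def expected_jp2_zip_name(item_id: str) -> str:
    # Assumption: the archive is looked up as "<IA ID>_jp2.zip".
    return f"{item_id}_jp2.zip"


def require_jp2_zip(item_id: str, file_names: Iterable[str]) -> str:
    """Return the expected archive name, or raise an error that names it in full."""
    expected = expected_jp2_zip_name(item_id)
    if expected not in set(file_names):
        raise ZipNotFoundError(
            f"Zip file not found: {expected} (IA item {item_id})"
        )
    return expected
```

Including the full file name in the message makes a prefix mismatch visible straight from the log.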

Alex_brollo added a comment. Edited Nov 3 2017, 7:51 AM

The large majority of them have a _jp2.zip. I found only one item (an old IA upload) that fails because it has a _tiff.zip but no _jp2.zip. I presume that in that case the problem could be solved by the IA uploader or an IA sysop simply deleting the _tiff.zip file from the item and deriving the item again.

Here is the first IA item that logs LOG.CRITICAL: Zip file not found despite having a _jp2.zip file: 026BarettiLaSceltaDelleLettereFamiliariSi032

As you can see from the IA file list, the IA ID is

026BarettiLaSceltaDelleLettereFamiliariSi032

while the _jp2.zip prefix is

026_Baretti_La_scelta_delle_lettere_familiari_si032
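
One possible way to tolerate a mismatch like this is sketched below in Python. This is only an assumption about how a lookup could work, not a description of the fix that was eventually merged. The function name is hypothetical, and the file list is assumed to be a plain list of names.

```python
from typing import List, Optional

JP2_SUFFIX = "_jp2.zip"


def find_jp2_zip(item_id: str, file_names: List[str]) -> Optional[str]:
    """Prefer "<IA ID>_jp2.zip"; otherwise accept a single unambiguous *_jp2.zip."""
    exact = item_id + JP2_SUFFIX
    if exact in file_names:
        return exact
    # Fall back to matching on the suffix alone, ignoring the prefix.
    candidates = [name for name in file_names if name.endswith(JP2_SUFFIX)]
    # Only use the fallback when exactly one candidate exists.
    return candidates[0] if len(candidates) == 1 else None


# The ID and prefix reported in this comment:
item_id = "026BarettiLaSceltaDelleLettereFamiliariSi032"
files = ["026_Baretti_La_scelta_delle_lettere_familiari_si032_jp2.zip"]
found = find_jp2_zip(item_id, files)
```

Here the exact name is absent, so the single file ending in _jp2.zip is returned. If several such files existed, the sketch returns None rather than guessing.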
Samwilson claimed this task.
Samwilson closed this task as Resolved. Nov 29 2017, 2:12 AM

Merged and tested.