
Wikisource: Internet Archive Upload Fail
Open, Needs Triage · Public

Description

Background: Hi fellows. Can you help me with this? I received this error message when I tried to upload a DjVu file using the option that removes the first (Google) page.
An error occurred: Command not found: "djvm -d "/data/project/ia-upload/ia-upload/jobqueue/worksrobertging21ingegoog/worksrobertging21ingegoog.djvu" 1 2>&1"

An error occurred: Command not found: "djvm -d "/data/project/ia-upload/ia-upload/jobqueue/worksrobertging02unkngoog/worksrobertging02unkngoog.djvu" 1 2>&1"

Resources:

Event Timeline

ifried renamed this task from Upload fail to Internet Archive Upload Fail. Oct 15 2020, 10:10 PM
ifried updated the task description.
ifried updated the task description.
ifried renamed this task from Internet Archive Upload Fail to Wikisource: Internet Archive Upload Fail. Oct 15 2020, 10:49 PM
ifried subscribed.

@Ixocactus Thanks for reporting this issue! Can you share which file you were trying to download, so we can perhaps try to reproduce the issue? Thanks!

Thank you, @Ixocactus & @AntiCompositeNumber! This information was helpful.

The Community Tech team is currently focusing on Wikisource wishes that are directly related to the 2020 wishes, which doesn't include the IA Upload tool. However, we hope that a volunteer developer can look into this issue. Thank you!

Hmm. Based on this and a few other recent failures, I'm starting to wonder if php-exec-command (the Command::exec() wrapper that ia-upload uses to execute binaries) is broken and returning "Command not found" for any non-zero exit status.

It makes no sense that the "djvm" command should randomly be missing for some files. Looking at other failures, though, it is entirely expected that a command line invocation consisting of the command name followed by 1k+ long file names with fully qualified paths should hit the command line length limit and return a non-zero status. However, this too is currently being reported as "Command not found".
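To illustrate the distinction being drawn here, a minimal shell sketch: by POSIX shell convention, exit status 127 means the command was not found and 126 means it was found but could not be executed, while any other non-zero status comes from a command that did run and then failed. The helper name `classify_status` and the bogus command name are made up for illustration; none of this is taken from ia-upload or php-exec-command.

```shell
#!/bin/sh
# Hypothetical helper: run a command line through `sh -c` and classify
# its exit status instead of lumping every failure together.
classify_status() {
    sh -c "$1" >/dev/null 2>&1
    status=$?
    case "$status" in
        0)   echo "ok" ;;
        126) echo "not executable (status 126)" ;;
        127) echo "command not found (status 127)" ;;
        *)   echo "command failed (status $status)" ;;
    esac
}

classify_status "true"                               # succeeds
classify_status "exit 3"                             # runs, then fails
classify_status "definitely-not-a-real-command-xyz"  # genuinely missing
```

A wrapper that reports "Command not found" for every non-zero status would make the second and third cases look the same, which is the behaviour suspected above.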

Iff this guess is correct, then it is impossible to debug this (and a few related Phabs) until that problem is found and fixed: right now the logs and error messages are lying to us.

Would it be possible to test out executing the relevant commands with just plain PHP exec() and logging the raw status instead? The relevant code is in IaClient:removeFirstPage() and isn't itself using any fancy features (not sure what using php-exec-command wins us here to begin with, but…).
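The same kind of check could also be run by hand from a shell on the tool host, outside PHP entirely. A rough sketch, assuming shell access and that `djvm` is on the PATH there; `run_and_log` is a hypothetical helper name, and the path in the usage comment is a placeholder, not the real job directory.

```shell
#!/bin/sh
# Hypothetical helper: run a command, capture stdout and stderr together,
# and log the raw exit status rather than an interpreted message.
run_and_log() {
    output=$("$@" 2>&1)
    status=$?
    printf 'cmd=%s status=%d output=%s\n' "$*" "$status" "$output"
    return "$status"
}

# Example invocation (placeholder path, mirrors the command in the report):
#   run_and_log djvm -d "/path/to/jobqueue/item/item.djvu" 1
```

If the logged status is 127, the binary really is missing from the environment; anything else points at djvm itself failing on the file, or at the wrapper misreporting.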

The file from the original report seems to reliably reproduce this problem, so for anyone with access to a functioning instance of ia-upload (test or prod, for this it's not that critical) it should be within a reasonable time budget to try out (a permanent fix is another matter of course).