Page MenuHomePhabricator

toolforge-jobs: Allow specifying arguments to commands
Closed, ResolvedPublicBUG REPORT

Description

  • It looks like it’s not possible to pass arguments to the command being executed; I assume we’re expected to put any nontrivial commands into a shell script and use the shell script as the command?

You can add it in quotes, like --command "./my-command.py foo bar".

I tried that and it didn’t work:

starting container process caused: exec: "echo hi": executable file not found in $PATH: unknown'.

I think we need to modify this line to not pass everything as one argument.. I'm not sure if just splitting with a space would break some use cases, another option would be to launch a shell to parse them.

Event Timeline

I think just splitting on spaces would do more harm than good. A shell would probably be a viable option.

How about changing the command line interface? Use positional arguments instead of --command:

toolforge-jobs run --image tf-python37 --name mycommand ./my-command.py foo bar --baz

Any positional arguments after the first non-option argument would be part of the command (even if they start with a hyphen); --name could default to the first positional argument (the command name).

Trying to do shell escaping and arg splitting in hand built python code has a number of problems. shlex.split is one option to avoid hand rolling, but I do like some variation on what @LucasWerkmeister said in T286107#7195621.

The easiest way to get argparse to treat the rest of the line as positional args to collect in a list is to type a literal -- in the argument stream to tell the parser that the rest of the line should be treated as positional arguments (https://docs.python.org/3/library/argparse.html#arguments-containing). The use of a -- argument as a context splitter like this is also a feature of the posix shell spec, but it feels like a thing that many folks who are fairly experienced unix shell users don't know (I've been asked questions that this was the correct answer to in job interviews).

There are other things that could be tested to see how well they work in practice. One would be using ArgumentParser.parse_known_args to extract the known named and positional args and collect the rest of the cli information into a 'remaining args' list (https://docs.python.org/3/library/argparse.html#partial-parsing).

For the record, I identified the problem early in the development of the framework. It was a conscious decision to leave it out in the first iteration because I wanted to collect clear use cases and expectations.
I was hoping that the wrapper approach that is suggested in wikitech was enough to cover basically all cases.

Allowing arguments is opening the door for the framework to play string games (something to avoid in my experience). What about arguments with multi-space strings, different quotes, variables, and other kind of shell expansion, etc... It can get very complex quickly. Isn't it best to just ask users to create a simple wrapper to hide arguments?

Another way (though maybe too clumsy) would be allowing passing the value in json format (or yaml or any other serialization). That would look like:

toolforge-jobs run --image tf-python37 --name mycommand --command '["./my-command", "foo", "bar", "--baz", "with spaces"]'

Not very nice, but useful and solves the quoting issues (no shell expansion though).

I think that the cleanest though is as @bd808 said the "--" trick, and though it's not so well known, as long as it's mentioned in the help and/or the docs should be easy to discover. As a cherry on top, might be nice to throw a message like "did you miss the -- when specifying the command?" if the options are not understood.

For the record, I identified the problem early in the development of the framework. It was a conscious decision to leave it out in the first iteration because I wanted to collect clear use cases and expectations.
I was hoping that the wrapper approach that is suggested in wikitech was enough to cover basically all cases.

I suspected as much, which is why I phrased my question in T285944#7194769 the way I did. But I still think it would be valuable to support commands with arguments – a wrapper script seems more cumbersome for quick experimentation (either you enter and leave your editor all the time, or you have to set up two shells with tmux or something else).

Change 704135 had a related patch set uploaded (by Arturo Borrero Gonzalez; author: Arturo Borrero Gonzalez):

[cloud/toolforge/jobs-framework-api@main] jobs: include option to log job stdout/stderr to the each user home directory

https://gerrit.wikimedia.org/r/704135

Change 704135 merged by Arturo Borrero Gonzalez:

[cloud/toolforge/jobs-framework-api@main] jobs: include option to log job stdout/stderr to the each user home directory

https://gerrit.wikimedia.org/r/704135

Change 704776 had a related patch set uploaded (by Arturo Borrero Gonzalez; author: Arturo Borrero Gonzalez):

[cloud/toolforge/jobs-framework-cli@master] toolforge-jobs: document command arguments

https://gerrit.wikimedia.org/r/704776

Change 704776 merged by Arturo Borrero Gonzalez:

[cloud/toolforge/jobs-framework-cli@master] toolforge-jobs: document command arguments

https://gerrit.wikimedia.org/r/704776

aborrero claimed this task.