toolforge-jobs should have a method to trigger a one-off cronjob run
Closed, Resolved | Public | Feature

Description

I suggest overloading the run subcommand to allow triggering a one-off cronjob run. That is, if foo is a registered cronjob, then toolforge-jobs run foo should run the job.

This is useful for testing purposes, invoking manual runs from the command line, invoking from another job (along with T315729), etc.

Event Timeline

JJMC89 changed the subtype of this task from "Task" to "Feature Request". Oct 30 2022, 3:56 PM

needed!

Perhaps it's better to use "restart" for this use case. The manpage says "Only continuous and cron jobs are supported", but it doesn't say what restarting an idle cronjob means. Apparently it does nothing.

@aborrero

I'm confused, there is apparently some code to support this https://gerrit.wikimedia.org/r/plugins/gitiles/cloud/toolforge/jobs-framework-api/+/refs/heads/main/tjf/ops.py#111 so if this doesn't work already then perhaps this is a bug!

@aborrero, @taavi - is there a workaround? Can I use kubectl or something to start an already-configured but not running timed (cron) job? Thanks.

Change 907434 had a related patch set uploaded (by Arturo Borrero Gonzalez; author: Arturo Borrero Gonzalez):

[cloud/toolforge/jobs-framework-api@main] ops/restart: allow triggering a non-running cronjob

https://gerrit.wikimedia.org/r/907434

Change 907435 had a related patch set uploaded (by Arturo Borrero Gonzalez; author: Arturo Borrero Gonzalez):

[cloud/toolforge/jobs-framework-api@main] ops: when restarting a job, don't delete previous one if no quota for next

https://gerrit.wikimedia.org/r/907435

@aborrero, @taavi - is there a workaround? Can I use kubectl or something to start an already-configured but not running timed (cron) job? Thanks.

I don't think there is an easy way using kubectl either.

I see now. Thanks.

Change 907435 abandoned by Arturo Borrero Gonzalez:

[cloud/toolforge/jobs-framework-api@main] ops: when restarting a job, don't delete previous one if no quota for next

Reason:

see last -1 comment

https://gerrit.wikimedia.org/r/907435

Change 907434 merged by Arturo Borrero Gonzalez:

[cloud/toolforge/jobs-framework-api@main] ops/restart: allow triggering a non-running cronjob

https://gerrit.wikimedia.org/r/907434
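
For readers unfamiliar with how a one-off run can be derived from a cron definition, here is a minimal sketch of the general Kubernetes technique: copy the CronJob's jobTemplate into a standalone Job manifest. Whether jobs-framework-api does exactly this is an assumption; the function name job_from_cronjob, the name suffix, and the sample manifest are hypothetical. The "instantiate: manual" annotation is, to my knowledge, the one kubectl sets for manually created jobs.

```python
import copy
import time
from typing import Any, Dict, Optional


def job_from_cronjob(cronjob: Dict[str, Any], suffix: Optional[str] = None) -> Dict[str, Any]:
    """Build a one-off batch/v1 Job manifest from a CronJob manifest.

    Hypothetical helper for illustration; not taken from jobs-framework-api.
    """
    template = cronjob["spec"]["jobTemplate"]
    template_meta = template.get("metadata", {})
    # A unique suffix avoids clashing with Jobs the scheduler created itself.
    suffix = suffix or str(int(time.time()))

    annotations = dict(template_meta.get("annotations", {}))
    # Marks the Job as manually triggered rather than scheduled.
    annotations["cronjob.kubernetes.io/instantiate"] = "manual"

    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {
            "name": f"{cronjob['metadata']['name']}-{suffix}",
            "namespace": cronjob["metadata"].get("namespace", "default"),
            "labels": dict(template_meta.get("labels", {})),
            "annotations": annotations,
        },
        # Deep copy so later edits to the Job do not mutate the CronJob dict.
        "spec": copy.deepcopy(template["spec"]),
    }


# Made-up CronJob manifest, reduced to the fields the helper reads.
sample_cronjob = {
    "metadata": {"name": "monthly", "namespace": "tool-example"},
    "spec": {
        "schedule": "0 0 1 * *",
        "jobTemplate": {
            "metadata": {"labels": {"app": "example"}},
            "spec": {"template": {"spec": {"restartPolicy": "Never"}}},
        },
    },
}

manual_job = job_from_cronjob(sample_cronjob, suffix="manual1")
```

The resulting dict would still have to be submitted to the cluster by something with permission to create Jobs in the tool's namespace, which is the part the API takes care of.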

Change 907817 had a related patch set uploaded (by Arturo Borrero Gonzalez; author: Arturo Borrero Gonzalez):

[cloud/toolforge/jobs-framework-api@main] restart: report exceptions

https://gerrit.wikimedia.org/r/907817

Change 907817 merged by Arturo Borrero Gonzalez:

[cloud/toolforge/jobs-framework-api@main] restart: report exceptions

https://gerrit.wikimedia.org/r/907817

aborrero claimed this task.

Changes added! This should be enabled now. Please give it a try and reopen this ticket if it doesn't work.

Kotz reopened this task as In Progress. Apr 12 2023, 3:58 PM

@aborrero

Not working for me (on tools-sgebastion-10). Do we need to wait for deploy?

BTW I added you as a maintainer to my bot. Try "become bothasava toolforge-jobs restart monthly". It takes an awful lot of time, so you can kill it when done.

BTW2 I have "watch toolforge-jobs list" running and I see this intermittently. Not sure if it is related or not.

ERROR: An internal error occured while executing this command.
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/tjf_cli/api.py", line 167, in _make_request
    response.raise_for_status()
  File "/usr/lib/python3/dist-packages/requests/models.py", line 940, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 502 Server Error: Bad Gateway for url: https://api.svc.tools.eqiad1.wikimedia.cloud:30003/jobs/api/v1/list/

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/tjf_cli/api.py", line 60, in _make_http_error
    json = original.response.json()
  File "/usr/lib/python3/dist-packages/requests/models.py", line 897, in json
    return complexjson.loads(self.text, **kwargs)
  File "/usr/lib/python3/dist-packages/simplejson/__init__.py", line 518, in loads
    return _default_decoder.decode(s)
  File "/usr/lib/python3/dist-packages/simplejson/decoder.py", line 370, in decode
    obj, end = self.raw_decode(s)
  File "/usr/lib/python3/dist-packages/simplejson/decoder.py", line 400, in raw_decode
    return self.scan_once(s, idx=_w(s, idx).end())
simplejson.errors.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/tjf_cli/cli.py", line 645, in main
    run_subcommand(args=args, api=api)
  File "/usr/lib/python3/dist-packages/tjf_cli/cli.py", line 601, in run_subcommand
    op_list(api, output_format)
  File "/usr/lib/python3/dist-packages/tjf_cli/cli.py", line 310, in op_list
    list = _list_jobs(api)
  File "/usr/lib/python3/dist-packages/tjf_cli/cli.py", line 305, in _list_jobs
    list = api.get("/list/").json()
  File "/usr/lib/python3/dist-packages/tjf_cli/api.py", line 175, in get
    return self._make_request("GET", url_path, **kwargs)
  File "/usr/lib/python3/dist-packages/tjf_cli/api.py", line 170, in _make_request
    new_error = _make_http_error(e)
  File "/usr/lib/python3/dist-packages/tjf_cli/api.py", line 72, in _make_http_error
    except requests.exceptions.InvalidJSONError:
AttributeError: module 'requests.exceptions' has no attribute 'InvalidJSONError'
ERROR: Please report this issue to the Toolforge admins: https://w.wiki/6Zuu
aborrero triaged this task as Medium priority. Apr 13 2023, 8:18 AM

Change 908589 had a related patch set uploaded (by Arturo Borrero Gonzalez; author: Arturo Borrero Gonzalez):

[cloud/toolforge/jobs-framework-api@main] ops_status: populate status from manually restarted cronjob

https://gerrit.wikimedia.org/r/908589

@aborrero

Not working for me (on tools-sgebastion-10). Do we need to wait for deploy?

Yes, we were expecting it to be deployed and working. I think you found an actual bug!

Patch https://gerrit.wikimedia.org/r/908589 should be the fix I hope.

BTW2 I have "watch toolforge-jobs list" running and I see this intermittently. Not sure if it is related or not.

Thanks. This seems unrelated. I've left that same command running here for hours and can't reproduce the backtrace, though.
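
For reference, the last AttributeError in the backtrace is raised because the installed requests package does not define requests.exceptions.InvalidJSONError, so the except clause itself fails while handling the 502. A minimal sketch of a version-independent handler follows, relying on the fact that json, simplejson and requests all raise ValueError subclasses for undecodable bodies. The names describe_http_error and FakeResponse, and the "message" key, are hypothetical and not part of tjf_cli.

```python
import json
from typing import Any


def describe_http_error(response: Any) -> str:
    """Return a readable message for a failed HTTP response.

    Hypothetical helper. `response` is any object exposing `status_code`
    and `json()`, such as a requests.Response.
    """
    try:
        payload = response.json()
    except ValueError:
        # Decode failures from json, simplejson and requests are all
        # ValueError subclasses, so no requests-specific name is needed.
        return f"HTTP {response.status_code}: non-JSON response body"
    # Assumption: the API reports errors under a "message" key.
    if isinstance(payload, dict) and "message" in payload:
        return f"HTTP {response.status_code}: {payload['message']}"
    return f"HTTP {response.status_code}: {payload!r}"


class FakeResponse:
    """Stand-in for a 502 whose body is an HTML error page from a proxy."""

    status_code = 502
    text = "<html>Bad Gateway</html>"

    def json(self) -> Any:
        return json.loads(self.text)


summary = describe_http_error(FakeResponse())
```

Catching ValueError keeps the handler working on both old and new releases of requests, at the cost of being slightly less specific than naming the exception class.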

Change 908589 merged by Arturo Borrero Gonzalez:

[cloud/toolforge/jobs-framework-api@main] ops_status: populate status from manually restarted cronjob

https://gerrit.wikimedia.org/r/908589

This should be working now. Please reopen if required.