toolforge jobs load flushes out all jobs
Open, Medium, Public, BUG REPORT

Description

Steps to replicate the issue (include links if applicable):

  • Load a jobs file
  • Restart some jobs:
tools.multichill@tools-bastion-12:~$ toolforge jobs restart quality-image-add
tools.multichill@tools-bastion-12:~$ toolforge jobs restart sdoc-cc-by-4.0
tools.multichill@tools-bastion-12:~$ toolforge jobs restart sdoc-cc-zero
  • Check the jobs
tools.multichill@tools-bastion-12:~$ toolforge jobs list
+-------------------+------------------+------------------------------------------+
|     Job name:     |    Job type:     |                 Status:                  |
+-------------------+------------------+------------------------------------------+
| quality-image-add | schedule: @daily |             Running for 35s              |
|  sdoc-cc-by-4.0   | schedule: @daily |             Running for 20s              |
| sdoc-cc-by-sa-4.0 | schedule: @daily |           Running for 4h52m9s            |
|   sdoc-cc-zero    | schedule: @daily |              Running for 3s              |
| wikidata-uploader | schedule: @daily | Last schedule time: 2024-05-03T18:52:00Z |
+-------------------+------------------+------------------------------------------+
  • Load the same jobs file
toolforge jobs load jobs.yml
  • Check the jobs, all flushed:
tools.multichill@tools-bastion-12:~$ toolforge jobs list
+-------------------+------------------+----------------------------+
|     Job name:     |    Job type:     |          Status:           |
+-------------------+------------------+----------------------------+
| quality-image-add | schedule: @daily | Waiting for scheduled time |
|  sdoc-cc-by-4.0   | schedule: @daily | Waiting for scheduled time |
| sdoc-cc-by-sa-4.0 | schedule: @daily | Waiting for scheduled time |
|   sdoc-cc-zero    | schedule: @daily | Waiting for scheduled time |
| wikidata-uploader | schedule: @daily | Waiting for scheduled time |
+-------------------+------------------+----------------------------+

What happens?:
All jobs get flushed: every job, including the ones that were running, goes back to "Waiting for scheduled time".

What should have happened instead?:
Only the jobs that were changed in the yml file should have been replaced.
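
The expected behaviour can be sketched as a comparison between the jobs that already exist and the jobs defined in the file. This is only an illustration, not the actual toolforge implementation: `plan_load`, the dict layout, and the example commands are all hypothetical, and whether jobs missing from the file should be deleted is an assumption.

```python
def plan_load(existing, desired):
    """Sort job names into create / replace / delete / keep buckets.

    Both arguments map a job name to its definition (a plain dict).
    Hypothetical helper, not part of the toolforge CLI.
    """
    plan = {"create": [], "replace": [], "delete": [], "keep": []}
    for name, spec in desired.items():
        if name not in existing:
            plan["create"].append(name)
        elif existing[name] != spec:
            plan["replace"].append(name)
        else:
            # Definition is identical, so a running job is left alone.
            plan["keep"].append(name)
    # Assumption: jobs that are absent from the file get removed.
    plan["delete"] = [name for name in existing if name not in desired]
    return plan


# Made-up job definitions for illustration only.
existing = {
    "quality-image-add": {"command": "./quality.sh", "schedule": "@daily"},
    "sdoc-cc-zero": {"command": "./zero.sh", "schedule": "@daily"},
}
desired = {
    "quality-image-add": {"command": "./quality.sh", "schedule": "@daily"},
    "sdoc-cc-zero": {"command": "./zero.sh --verbose", "schedule": "@daily"},
}

plan = plan_load(existing, desired)
print(plan)
```

With a plan like this, only the names under `replace` would need to be stopped and recreated; the ones under `keep` would continue running.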

Software version (on Special:Version page; skip for WMF-hosted wikis like Wikipedia):

Other information (browser name/version, screenshots, etc.):

The jobs.yml is in /data/project/multichill on Toolforge

Event Timeline

dcaro triaged this task as Medium priority. May 6 2024, 8:17 AM
dcaro moved this task from Backlog to Ready to be worked on on the Toolforge board.
dcaro subscribed.

@Raymond_Ndibe what is the status? I see you merged something. I just tested and the problem still exists.

Yes, we are aware that the problem is still there. Several code execution paths lead to the same problem, depending on the arguments defined in the jobs YAML being loaded. We might have to move the load feature entirely to the backend to solve this completely. I am working on that.

@Multichill this issue has been fixed, closing now. You can re-open it if you notice something similar again.

I just did a toolforge jobs list and had 3 jobs running.

tools.multichill@tools-bastion-12:~$ toolforge jobs load jobs.yml
.....
INFO: 35 job(s) loaded successfully

That seems to have killed these three running jobs.

@Multichill, please share your jobs.yaml file so I can try to reproduce this and see exactly what is happening.

I investigated this a bit. I think the problem comes from the replica field. I forgot to account for it in load, since it was added after the load logic was refactored. Also, this should have been caught by our functional tests.
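
One way a field that the comparison does not account for could produce this symptom is sketched below. This is a guess at the failure mode, not the actual code: the key name `replicas`, its default of 1, and the helpers `normalize` and `is_changed` are all assumptions made for illustration.

```python
# Assumption: the backend fills in a default for a field the file omits.
ASSUMED_DEFAULTS = {"replicas": 1}


def normalize(spec):
    """Fill in assumed defaults so both sides carry the same keys."""
    return {**ASSUMED_DEFAULTS, **spec}


def is_changed(from_api, from_file):
    """Compare two job definitions after applying the same defaults."""
    return normalize(from_api) != normalize(from_file)


# The file omits the field, while the stored job reports the default value.
from_file = {"command": "./zero.sh", "schedule": "@daily"}
from_api = {"command": "./zero.sh", "schedule": "@daily", "replicas": 1}

# A raw comparison sees a difference, so the job would be recreated.
naive_changed = from_api != from_file
# Comparing normalized definitions reports no change.
normalized_changed = is_changed(from_api, from_file)

print(naive_changed, normalized_changed)
```

If every stored job carries such a field and no entry in the file does, a raw comparison would mark all of them as changed, which would match all jobs being flushed on load.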