It seems they are not being cleaned up properly, and they are piling up.
The EXECUTION_SWEEP job should be cleaning those up, but it seems to be failing:
root@tools-harbor-1:~# docker logs -f harbor-jobservice | grep EXECUTION
2024-01-29T07:10:23Z [ERROR] [/jobservice/runner/redis.go:123]: Job 'EXECUTION_SWEEP:cca525ce26b4964cb78176f8' exit with error: run error: {"errors":[{"code":"UNKNOWN","message":"failed to delete executions: ERROR: update or delete on table \"execution\" violates foreign key constraint \"task_execution_id_fkey\" on table \"task\" (SQLSTATE 23503)"}]}
2024-01-29T07:10:23Z [INFO] [/jobservice/worker/cworker/c_worker.go:77]: Job incoming: {"name":"EXECUTION_SWEEP","id":"4cd4e18d8459011f54bc31ea","t":1706385630,"args":null}
2024-01-29T07:10:23Z [INFO] [/jobservice/runner/redis.go:197]: Retrying job EXECUTION_SWEEP:4cd4e18d8459011f54bc31ea, revision: 1706512223
2024-01-29T07:11:48Z [INFO] [/pkg/task/sweep_job.go:150]: [EXECUTION_SWEEP] start to sweep, retain latest 10 executions
2024-01-29T07:11:48Z [INFO] [/pkg/task/sweep_job.go:160]: [EXECUTION_SWEEP] listed 2 candidate executions for sweep
2024-01-29T07:11:48Z [INFO] [/pkg/task/sweep_job.go:180]: [EXECUTION_SWEEP] end to sweep, 2 executions were deleted in total, elapsed time: 46.454234ms
2024-01-29T09:40:06Z [ERROR] [/jobservice/runner/redis.go:123]: Job 'EXECUTION_SWEEP:4cd4e18d8459011f54bc31ea' exit with error: run error: {"errors":[{"code":"UNKNOWN","message":"failed to delete executions: ERROR: update or delete on table \"execution\" violates foreign key constraint \"task_execution_id_fkey\" on table \"task\" (SQLSTATE 23503)"}]}
2024-01-29T09:40:06Z [INFO] [/jobservice/worker/cworker/c_worker.go:77]: Job incoming: {"name":"EXECUTION_SWEEP","id":"9cd1eb3276592fab4cb9344a","t":1706389233,"args":null}
2024-01-29T09:40:06Z [INFO] [/jobservice/runner/redis.go:197]: Retrying job EXECUTION_SWEEP:9cd1eb3276592fab4cb9344a, revision: 1706521206
And the tables (in tools) are getting bigger and bigger:
harbor=> select count(*) from task;
 count
--------
 265942
(1 row)

harbor=> select count(*) from execution;
  count
----------
 13025405
(1 row)
Note that the job_log table is empty.
This might include cleaning up non-deleted retention policies from deleted projects:
harbor=> select count(*) from project as p join project_metadata as pm on p.project_id=pm.project_id where p.deleted='t' and pm.name='retention_id';
 count
-------
  3189
(1 row)
Info for the scheduled jobs - retention policies - schedules:
Tables involved:
- retention_policy - inside the data json column, you find .scope.level=="project" and .scope.ref that is the project id
- project - here you can check the deleted==t column to see if it's deleted
- project_metadata - this has column project_id being project.project_id and column name=="retention_id" with column value being the retention_policy.id
- schedule - this has column vendor_type="RETENTION" and json column callback_func_param with .PolicyID being the retention_policy.id
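The relationships listed above can be sketched as a small lookup that walks from deleted projects to their retention policies and on to the schedules that still point at them. This is only an illustration, not something that was run against Harbor: it uses an in-memory SQLite database as a stand-in for the PostgreSQL one, the schema contains only the columns named above, `deleted` is stored as 0/1 instead of 't'/'f', and the function names `build_retention_demo` and `leftover_retention_schedules` are made up for the example.

```python
import json
import sqlite3


def build_retention_demo():
    """Create an in-memory stand-in holding only the columns named above."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE project (project_id INTEGER PRIMARY KEY, deleted INTEGER);
        CREATE TABLE project_metadata (project_id INTEGER, name TEXT, value TEXT);
        CREATE TABLE retention_policy (id INTEGER PRIMARY KEY, data TEXT);
        CREATE TABLE schedule (id INTEGER PRIMARY KEY, vendor_type TEXT,
                               callback_func_param TEXT);
        """
    )
    # Project 1 is live, project 2 is deleted (assumed sample data).
    conn.executemany("INSERT INTO project VALUES (?, ?)", [(1, 0), (2, 1)])
    conn.executemany(
        "INSERT INTO project_metadata VALUES (?, ?, ?)",
        [(1, "retention_id", "10"), (2, "retention_id", "20")],
    )
    conn.executemany(
        "INSERT INTO retention_policy VALUES (?, ?)",
        [
            (10, json.dumps({"scope": {"level": "project", "ref": 1}})),
            (20, json.dumps({"scope": {"level": "project", "ref": 2}})),
        ],
    )
    conn.executemany(
        "INSERT INTO schedule VALUES (?, ?, ?)",
        [
            (100, "RETENTION", json.dumps({"PolicyID": 10})),
            (200, "RETENTION", json.dumps({"PolicyID": 20})),
            (300, "GARBAGE_COLLECTION", json.dumps({})),
        ],
    )
    conn.commit()
    return conn


def leftover_retention_schedules(conn):
    """Return sorted (project_id, policy_id, schedule_id) for deleted projects."""
    # project -> project_metadata: 'retention_id' rows carry the policy id in `value`.
    rows = conn.execute(
        "SELECT p.project_id, pm.value FROM project AS p "
        "JOIN project_metadata AS pm ON p.project_id = pm.project_id "
        "WHERE p.deleted = 1 AND pm.name = 'retention_id'"
    ).fetchall()
    project_by_policy = {int(value): project_id for project_id, value in rows}

    # retention_policy: keep policies whose JSON scope points back at that project.
    confirmed = set()
    for policy_id, data in conn.execute("SELECT id, data FROM retention_policy"):
        scope = json.loads(data).get("scope", {})
        if (
            policy_id in project_by_policy
            and scope.get("level") == "project"
            and scope.get("ref") == project_by_policy[policy_id]
        ):
            confirmed.add(policy_id)

    # schedule -> retention_policy via callback_func_param.PolicyID.
    found = []
    for schedule_id, param in conn.execute(
        "SELECT id, callback_func_param FROM schedule WHERE vendor_type = 'RETENTION'"
    ):
        policy_id = json.loads(param).get("PolicyID")
        if policy_id in confirmed:
            found.append((project_by_policy[policy_id], policy_id, schedule_id))
    return sorted(found)


if __name__ == "__main__":
    print(leftover_retention_schedules(build_retention_demo()))
```

With the sample rows, only the schedule attached to the deleted project's policy is reported; the live project's schedule and the non-retention schedule are left out.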
See {
}
Info for the execution + task:
- task - with execution_id being the execution.id and vendor_type=="RETENTION"
- execution - its id is referenced by the foreign key task.execution_id (constraint task_execution_id_fkey on table task)
DELETE from execution where execution.id not in (select execution_id from task);
#### Changed to a more performant:
## Ran this several times in batches of 1000000
delete from execution where id in (select execution.id from execution left join task on task.execution_id = execution.id where task.execution_id is NULL limit 1000000);
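Running the batched delete "several times" can also be expressed as a loop that repeats until no orphaned executions are left. The sketch below is a minimal illustration of that idea, not what was actually run: it uses an in-memory SQLite database in place of PostgreSQL, a reduced two-table schema, and hypothetical helper names (`build_execution_demo`, `delete_orphan_executions`). Foreign keys are switched on so that the `task` to `execution` constraint behaves like `task_execution_id_fkey`.

```python
import sqlite3


def build_execution_demo():
    """In-memory stand-in: 10 executions, of which only 3 still have a task."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # enforce task -> execution references
    conn.executescript(
        """
        CREATE TABLE execution (id INTEGER PRIMARY KEY);
        CREATE TABLE task (id INTEGER PRIMARY KEY,
                           execution_id INTEGER REFERENCES execution(id));
        """
    )
    conn.executemany("INSERT INTO execution VALUES (?)", [(i,) for i in range(1, 11)])
    conn.executemany("INSERT INTO task VALUES (?, ?)", [(1, 1), (2, 2), (3, 3)])
    conn.commit()
    return conn


def delete_orphan_executions(conn, batch_size):
    """Delete executions that have no task, batch by batch; return rows deleted."""
    total = 0
    while True:
        cur = conn.execute(
            "DELETE FROM execution WHERE id IN ("
            " SELECT execution.id FROM execution"
            " LEFT JOIN task ON task.execution_id = execution.id"
            " WHERE task.execution_id IS NULL LIMIT ?)",
            (batch_size,),
        )
        conn.commit()  # commit per batch so each transaction stays small
        if cur.rowcount == 0:
            break  # nothing left without a task
        total += cur.rowcount
    return total


if __name__ == "__main__":
    demo = build_execution_demo()
    print(delete_orphan_executions(demo, 3))
    print(demo.execute("SELECT count(*) FROM execution").fetchone()[0])
```

Because only executions without any task row are selected, the delete never touches a row that the foreign key still protects, which is the case the sweep job trips over in the log above.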