
Design the Jobs service in k8s
Closed, Resolved, Public

Description

We've found that specific settings, and possibly even alpha features (ttlSecondsAfterFinished, specifically), are needed to make jobs/cronjobs work well.
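
For illustration, here is a minimal sketch of where ttlSecondsAfterFinished sits in a Job manifest. It is not the configuration adopted for this service: the helper name, job name, image, command, TTL value and backoffLimit are all hypothetical placeholders. On Kubernetes versions where the field is still alpha, the TTLAfterFinished feature gate also has to be enabled on the cluster for the field to take effect.

```python
import json


def build_job_manifest(name, image, command, ttl_seconds=3600):
    """Return a batch/v1 Job manifest as a plain dict.

    All argument values are illustrative; nothing here is taken from
    the actual Toolforge setup.
    """
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": name},
        "spec": {
            # Ask Kubernetes to delete the Job (and its pods) this many
            # seconds after it finishes, whether it succeeded or failed.
            "ttlSecondsAfterFinished": ttl_seconds,
            # Assumption for the sketch: do not retry failed pods.
            "backoffLimit": 0,
            "template": {
                "spec": {
                    # Job pods must use Never or OnFailure.
                    "restartPolicy": "Never",
                    "containers": [
                        {
                            "name": name,
                            "image": image,
                            "command": list(command),
                        }
                    ],
                }
            },
        },
    }


if __name__ == "__main__":
    manifest = build_job_manifest(
        name="example-job",                      # hypothetical
        image="example.invalid/image:latest",    # hypothetical
        command=["echo", "hello"],
        ttl_seconds=600,
    )
    # JSON is valid YAML, so this output can be fed to `kubectl apply -f -`.
    print(json.dumps(manifest, indent=2))
```

Without a TTL, finished Job objects and their pods stay around until something deletes them, which is why the setting matters for one-off jobs.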

We likely also need some tooling (a new backend for jsub or a new command).

Related Objects

Status | Subtype | Assigned
Resolved | - | JHedden
Resolved | - | aborrero
Resolved | - | aborrero
Resolved | - | aborrero
Resolved | - | Bstorm
Resolved | - | aborrero
Resolved | - | aborrero
Resolved | - | JJMC89
Resolved | - | aborrero
Resolved | BUG REPORT | aborrero
Resolved | BUG REPORT | aborrero
Resolved | BUG REPORT | aborrero
Resolved | Feature | aborrero
Resolved | BUG REPORT | aborrero
Resolved | Feature | aborrero
Resolved | - | aborrero
Resolved | - | aborrero
Resolved | - | Bstorm
Invalid | BUG REPORT | None
Resolved | Feature | aborrero
Resolved | - | taavi
Resolved | BUG REPORT | So9q
Resolved | - | aborrero
Resolved | Feature | Raymond_Ndibe
Resolved | Feature | Raymond_Ndibe
Duplicate | - | None
Duplicate | Feature | Raymond_Ndibe
Resolved | BUG REPORT | JJMC89
Resolved | BUG REPORT | Raymond_Ndibe
Resolved | BUG REPORT | aborrero
Resolved | BUG REPORT | taavi
Resolved | Feature | taavi
Resolved | Feature | taavi
Duplicate | BUG REPORT | None
Duplicate | Feature | None
Resolved | - | taavi
Resolved | - | Raymond_Ndibe
Resolved | - | aborrero

Event Timeline

Bstorm removed JHedden as the assignee of this task.
Bstorm triaged this task as High priority.
Bstorm removed a project: Tools.
Bstorm edited subscribers, added: JHedden; removed: AntiCompositeNumber.

The conversation in T251027: "signatures" tool has failed job pods on Kubernetes cluster is a good touchpoint for this ticket.

Probably the simplest note here is that batch objects don't appear to be quota-able directly (only the pods they schedule are), which may or may not merit a controller to do just that.

Bstorm added a subscriber: aborrero.

I think you are working on this @aborrero, so assigning to you to prevent duplication. Feel free to delete if you have another one.

Change 681424 had a related patch set uploaded (by Arturo Borrero Gonzalez; author: Arturo Borrero Gonzalez):

[cloud/toolforge/jobs-framework-api@main] tjf: include deployment configuration for toolsbeta kubernetes

https://gerrit.wikimedia.org/r/681424

Just a note, that will have significant changes with Kubernetes 1.18/1.19.

Change 681725 had a related patch set uploaded (by Arturo Borrero Gonzalez; author: Arturo Borrero Gonzalez):

[operations/puppet@production] toolforge: nginx-ingress-jobs: specify ingress-class

https://gerrit.wikimedia.org/r/681725

Change 692633 had a related patch set uploaded (by Arturo Borrero Gonzalez; author: Arturo Borrero Gonzalez):

[cloud/toolforge/jobs-framework-api@main] jobs-framework-api: introduce initial docker image

https://gerrit.wikimedia.org/r/692633

Change 692633 merged by Arturo Borrero Gonzalez:

[cloud/toolforge/jobs-framework-api@main] jobs-framework-api: introduce initial docker image

https://gerrit.wikimedia.org/r/692633

Change 681424 merged by Arturo Borrero Gonzalez:

[cloud/toolforge/jobs-framework-api@main] jobs-framework-api: include deployment configuration for toolsbeta kubernetes

https://gerrit.wikimedia.org/r/681424

Closing this task as the design phase is mostly complete. Now moving on to the development phase.