Special consideration needed for toolforge-jobs when performing kubernetes cluster upgrades?
Closed, Invalid · Public

Assigned To

None

Authored By

	taavi
	Oct 11 2021, 11:43 AM

Description

Previously, most workloads on the Kubernetes cluster have been web services and other continuous jobs, for which a restart and a move to another node did not matter. This assumption no longer holds once the jobs framework introduces cron jobs. This task is to:

- check that running jobs do not misbehave when they are restarted
  - TODO: should jobs be rescheduled if they don't complete? This is especially relevant for one-off jobs.
- consider adding some delay to let running jobs complete when nodes are being drained
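
The "delay before drain" idea could be sketched roughly as below. This is only an illustration, not anything the jobs framework or the upgrade tooling is known to implement. The helper names `running_job_pods` and `wait_for_jobs` are hypothetical, and `list_pods` is an assumed callable that returns the pods on the node as plain dicts shaped like the Kubernetes Pod JSON (`metadata.ownerReferences`, `status.phase`).

```python
import time
from typing import Callable, Iterable, List


def running_job_pods(pods: Iterable[dict]) -> List[str]:
    """Return the names of pods that are owned by a Job and still running."""
    names = []
    for pod in pods:
        owners = pod.get("metadata", {}).get("ownerReferences", [])
        owned_by_job = any(owner.get("kind") == "Job" for owner in owners)
        if owned_by_job and pod.get("status", {}).get("phase") == "Running":
            names.append(pod["metadata"]["name"])
    return names


def wait_for_jobs(
    list_pods: Callable[[], Iterable[dict]],  # hypothetical: lists pods on the node
    timeout_s: float,
    poll_s: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll until no Job-owned pod is running, or until the timeout expires.

    Returns True if the node is free of running jobs, False on timeout.
    """
    deadline = clock() + timeout_s
    while True:
        if not running_job_pods(list_pods()):
            return True
        if clock() >= deadline:
            return False
        sleep(poll_s)
```

An upgrade script could call `wait_for_jobs` after cordoning a node and before draining it, and fall back to a normal drain if it returns False. Whether jobs interrupted that way should be rescheduled is the open TODO above.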

Related Objects

Mentioned Here: T133598: Decide on upgrade policy for Kubernetes

Event Timeline

taavi created this task. Oct 11 2021, 11:43 AM

Restricted Application added a subscriber: Aklapper. Oct 11 2021, 11:43 AM

aborrero triaged this task as Medium priority. Oct 11 2021, 11:45 AM

taavi updated the task description. Oct 11 2021, 11:45 AM

taavi updated the task description. Oct 11 2021, 6:52 PM

bd808 renamed this task from "toolforge-jobs and kubernetes cluster upgrades" to "Special consideration needed for toolforge-jobs when performing kubernetes cluster upgrades?". May 31 2022, 8:12 PM

bd808 edited projects, added Toolforge Jobs framework; removed Toolforge.

fnegri edited projects, added cloud-services-team; removed cloud-services-team (Kanban). Jan 18 2023, 6:45 PM

fnegri moved this task from Kanban to Inbox on the cloud-services-team board.

aborrero removed a parent task: T285944: Toolforge: beta phase for the new jobs framework. Jan 24 2023, 4:37 PM

We have already been running jobs on k8s for a while without issues, so I think this no longer applies.

Please reopen if it's not the case.
