
[k8s] Add node anti-affinity topologySpreadConstraints to infrastructure components where relevant
Open, In Progress, Medium, Public

Description

T355883 (Create a pool of NFS-less Toolforge Kubernetes workers) introduced a new type of worker. Because there are relatively few of these workers, and most of our infrastructure components do not have NFS access, the risk of all of the pods in a given deployment ending up on the same node is higher than I'd like. For that reason(*) we should tell the Kubernetes scheduler to spread them across different nodes if possible.

Since Kubernetes 1.19 the best way to do this is with topologySpreadConstraints, using the kubernetes.io/hostname node label as the topology key. See https://kubernetes.io/docs/concepts/scheduling-eviction/topology-spread-constraints/

(*): Spreading the pods has always been a good idea, but now the risk of co-located pods causing issues is much higher.
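
As a rough illustration of the approach described above, a Deployment could carry a constraint like the one below. This is only a sketch: the component name, labels, image, and replica count are hypothetical placeholders and are not taken from any Toolforge repository. `whenUnsatisfiable: ScheduleAnyway` is assumed here because the goal is to spread pods "if possible" rather than to block scheduling when spreading cannot be satisfied.

```yaml
# Hypothetical Deployment; names, labels and image are placeholders.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-component
spec:
  replicas: 2
  selector:
    matchLabels:
      app.kubernetes.io/name: example-component
  template:
    metadata:
      labels:
        app.kubernetes.io/name: example-component
    spec:
      topologySpreadConstraints:
        # Treat each node (identified by its hostname label) as one domain.
        - topologyKey: kubernetes.io/hostname
          # Allow at most one pod of difference between any two nodes.
          maxSkew: 1
          # Soft constraint: prefer spreading, but still schedule if it
          # cannot be satisfied (for example, only one eligible node).
          whenUnsatisfiable: ScheduleAnyway
          # Count only pods belonging to this deployment.
          labelSelector:
            matchLabels:
              app.kubernetes.io/name: example-component
      containers:
        - name: example-component
          image: example.invalid/example-component:latest
```

Using `DoNotSchedule` instead would turn this into a hard requirement, which may leave pods pending when few eligible workers are available.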

Details

| Title | Reference | Author | Source Branch | Dest Branch |
| --- | --- | --- | --- | --- |
| [builds-api] add topologySpreadConstraints to deployment | repos/cloud/toolforge/builds-api!82 | raymond-ndibe | add_topology_spread_constraints | main |

Event Timeline

taavi triaged this task as Medium priority. (Feb 22 2024, 11:19 AM)
taavi created this task.
dcaro renamed this task from "Add node anti-affinity topologySpreadConstraints to infrastructure components where relevant" to "[k8s] Add node anti-affinity topologySpreadConstraints to infrastructure components where relevant". (Mar 5 2024, 5:13 PM)
Slst2020 changed the task status from Open to In Progress. (Apr 4 2024, 6:51 AM)
Slst2020 moved this task from Next Up to In Progress on the Toolforge (Toolforge iteration 08) board.