
production k3s cluster is approaching pod limit
Closed, Resolved · Public · 3 Estimated Story Points

Description

Incident

On a recent deploy of patchdemo, the deployment stalled while trying to schedule the new pods. We saw the following error:

Warning  FailedScheduling  41m   default-scheduler  0/2 nodes are available: 1 Too many pods. preemption: 0/2 nodes are available: 2 No preemption victims found for incoming pod..

We were able to resolve this issue by deleting all of the patchdemo-staging environments, as well as any catalyst environments that were more than two weeks old and named test-. (Very sorry if you were using those 😅)

Configured (default) limits

Our current pod limit for the main k3s node is set to the Kubernetes default (and recommended maximum) of 110:

kubectl get node k3s -ojsonpath='{.status.capacity.pods}'
110
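
The same check can be run across every node at once with a custom-columns query (a sketch using the same kubectl output options as elsewhere in this task; capacities for nodes other than k3s are not stated here):

# List each node's configured pod capacity
kubectl get nodes -o custom-columns=NAME:.metadata.name,PODS:.status.capacity.pods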

Implications

When we reach the 110 limit (again), new pods will not be scheduled until some pods are removed. This means that:

  • patchdemo or catalyst cannot be upgraded/deployed
  • catalyst will be unable to create new demos
  • scheduled jobs (like the repo-pool updater and the expiry checker) will not run
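
One way to notice this condition before a deploy fails is to look for pods stuck in Pending (a standard kubectl field selector, not something taken from this incident):

# Pending pods across all namespaces suggest the scheduler cannot place them
kubectl get pods -A --field-selector=status.phase=Pending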

Possible remediation

While this limit can be increased, the Kubernetes documentation recommends against doing so and suggests adding a new node to the cluster instead.
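
For completeness, on a stock k3s install the limit could be raised by passing a kubelet argument. This is a sketch only; the value 250 is an arbitrary example, and it is the approach the documentation advises against:

# Raise the per-node pod limit; on installed systems this usually goes in
# /etc/rancher/k3s/config.yaml as a kubelet-arg entry rather than on the CLI
k3s server --kubelet-arg=max-pods=250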

Event Timeline

SDunlap triaged this task as High priority. Aug 7 2025, 8:22 PM

We can see the number of pods on each node by running:

kubectl get pods -A -o=custom-columns=NODE:.spec.nodeName | sort | uniq -c | sort -n
      1 NODE
      2 k3s-envdb
     84 k3s
jnuche subscribed.

We currently have enough resources to add a new worker node. I was going to wait until completion of T396897 before creating it, but it seems this is the right time to do it.

@jnuche and I talked about deploying a node manually (because the OpenTofu work is still a ways out). Jaime has deployed another node to our Horizon project, but it is currently cordoned because we still need to get the repo pool populated on that node (we are using the local-path storage provisioner). This means the repo-pool jobs will need to run on each node.
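
Once the repo pool has been populated there, the node can be returned to service with a standard uncordon (the node name comes from the comment below; until then, kubectl get nodes will show it as Ready,SchedulingDisabled):

# Allow the scheduler to place pods on the new node again
kubectl uncordon k3s-worker01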

jnuche closed this task as Resolved. Aug 13 2025, 12:56 PM

New node k3s-worker01 is now operational.

I was able to successfully deploy a wiki environment on it:

wiki-189e7ae39e-1676-mediawiki-6d5f947f4c-p5qpt   2/2     Running    0             106s    10.42.2.9     k3s-worker01   <none>           <none>
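
For reference, output like the line above can be reproduced with a wide pod listing filtered to the new node; the exact command used is not recorded in the task, but something along these lines works:

# Show pods (with node placement) scheduled on k3s-worker01
kubectl get pods -A -o wide --field-selector spec.nodeName=k3s-worker01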
jnuche set the point value for this task to 3.