
Create a pool of NFS-less Toolforge Kubernetes workers
Closed, Resolved | Public

Description

As more and more tools move to Build Service based images, we should provide some Kubernetes workers without NFS volumes mounted.

Details

Title | Reference | Author | Source Branch | Dest Branch
jobs-api: bump to 0.0.263-20240222104806-5ddd710f | repos/cloud/toolforge/toolforge-deploy!206 | project_1317_bot_df3177307bed93c3f34e421e26c86e38 | bump_jobs-api | main
deployment: Pin jobs-api pod to NFS-enabled workers | repos/cloud/toolforge/jobs-api!62 | taavi | main-I0b1d43e2a173b39f145bbcbf5142ed169b9ea259 | main

Event Timeline

taavi triaged this task as Medium priority. Jan 25 2024, 12:57 PM

Change 992925 had a related patch set uploaded (by Majavah; author: Majavah):

[cloud/wmcs-cookbooks@main] Add worker-nfs Toolforge Kubernetes role/prefix

https://gerrit.wikimedia.org/r/992925

Change 992925 merged by jenkins-bot:

[cloud/wmcs-cookbooks@main] Add worker-nfs Toolforge Kubernetes role/prefix

https://gerrit.wikimedia.org/r/992925

project_1317_bot_df3177307bed93c3f34e421e26c86e38 opened https://gitlab.wikimedia.org/repos/cloud/toolforge/toolforge-deploy/-/merge_requests/206

jobs-api: bump to 0.0.263-20240222104806-5ddd710f

taavi moved this task from In Progress to Done on the Toolforge (Toolforge iteration 06) board.

I added three non-NFS workers, tools-k8s-worker-102 to tools-k8s-worker-104. So far they're being used by various infrastructure components, buildservice image-build pods, and a few tools with buildservice images. That's roughly what I'd expect, especially since there were only a few evictions from the NFS nodes this morning.

There was an issue with jobs-api this morning: it did not specify a nodeSelector. In addition, I filed T358203: [k8s] Add node anti-affinity topologySpreadConstraints to infrastructure components where relevant. Otherwise I'm pretty happy with how this turned out. We should revisit the size of this pool after the next time we have to restart all the NFS workers.
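
For illustration, here is a rough sketch of the two scheduling settings mentioned above, written as plain Python operating on a Deployment manifest dict. It is not the actual jobs-api change: the label key `example.org/nfs`, the `app` pod label, the image name, and the helper function names are all hypothetical, since this task does not say which labels the NFS workers carry.

```python
import copy

# Hypothetical node label; the real label on NFS-enabled workers is not stated in this task.
NFS_LABEL_KEY = "example.org/nfs"
NFS_LABEL_VALUE = "true"


def pin_to_nfs_workers(deployment: dict) -> dict:
    """Return a copy of the manifest whose pods only schedule on nodes with the NFS label."""
    patched = copy.deepcopy(deployment)
    pod_spec = patched["spec"]["template"]["spec"]
    pod_spec.setdefault("nodeSelector", {})[NFS_LABEL_KEY] = NFS_LABEL_VALUE
    return patched


def spread_across_nodes(deployment: dict, app_label: str) -> dict:
    """Return a copy of the manifest with a soft per-node topology spread constraint."""
    patched = copy.deepcopy(deployment)
    pod_spec = patched["spec"]["template"]["spec"]
    pod_spec.setdefault("topologySpreadConstraints", []).append(
        {
            "maxSkew": 1,
            "topologyKey": "kubernetes.io/hostname",  # spread replicas per node
            "whenUnsatisfiable": "ScheduleAnyway",  # prefer spreading, do not block scheduling
            "labelSelector": {"matchLabels": {"app": app_label}},  # assumed pod label
        }
    )
    return patched


# Minimal stand-in manifest; the image name is a placeholder.
jobs_api = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "metadata": {"name": "jobs-api"},
    "spec": {
        "template": {
            "spec": {"containers": [{"name": "jobs-api", "image": "example/jobs-api"}]}
        }
    },
}

patched = spread_across_nodes(pin_to_nfs_workers(jobs_api), "jobs-api")
```

Both helpers copy the manifest rather than mutating it, so the original dict can still be compared against the patched one. `ScheduleAnyway` keeps the spread constraint a preference; `DoNotSchedule` would make it a hard requirement.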