Allow Kubernetes workers to be deployed on Bookworm
Open, Needs Triage, Public, 3 Estimated Story Points

Description

Hi folks!

After a long investigation by Janis, Tobias and me into https://github.com/ROCm/k8s-device-plugin/issues/65, we found that the root cause is the runc version available for Bullseye. A syscall like access, used by PyTorch and other tools when initializing a GPU, returns EPERM when it shouldn't, which makes the Python code error out (so we are not able to use the GPU).
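To make the symptom easier to reproduce, here is a minimal sketch of a probe that calls libc's access() directly and reports the errno it gets back. It assumes a Linux host with a standard libc; the helper name probe_access is made up for illustration, and libc decides which underlying syscall is actually issued.

```python
import ctypes
import errno
import os

# Load the C library of the running process (assumption: Linux with a standard libc).
_libc = ctypes.CDLL(None, use_errno=True)
_libc.access.argtypes = [ctypes.c_char_p, ctypes.c_int]
_libc.access.restype = ctypes.c_int


def probe_access(path, mode=os.F_OK):
    """Call libc access() on `path` and return (return_code, errno_name).

    errno_name is "OK" on success, otherwise a symbolic name such as
    "ENOENT" or "EPERM".
    """
    ctypes.set_errno(0)
    rc = _libc.access(os.fsencode(path), mode)
    if rc == 0:
        return rc, "OK"
    err = ctypes.get_errno()
    return rc, errno.errorcode.get(err, str(err))
```

Run inside an affected container against a GPU device node (which path to probe is an assumption, the task does not name one), a result of (-1, "EPERM") for a file that exists and is readable would match the behaviour described above.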

I opened https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1071269 to ask whether Debian upstream could release a new version for Bullseye, but migrating to Bookworm would also allow ML to make progress on T363191, so I started checking what is needed to allow Bookworm on Kubernetes workers.

Packages

The following are not available for bookworm-wikimedia:

  • kubernetes-node
  • calico-cni
  • istio-cni
  • calico
  • calicoctl
  • dragonfly-dfdaemon
  • dragonfly-dfget

Meanwhile rsyslog-kubernetes is now shipped by Debian upstream, so we are good on that side. Everything is written in Go and should be statically built, but I checked with ldd to verify anyway.

  • The kubelet binary links against libc
  • kube-proxy, Calico binaries, Istio binaries are marked as not a dynamic executable
  • The dragonfly binaries have a longer list of dynamic libraries:
	linux-vdso.so.1 (0x00007ffd5c99d000)
	libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007ff1f87d4000)
	libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007ff1f85f3000)
	/lib64/ld-linux-x86-64.so.2 (0x00007ff1f87e0000)
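The ldd check above can be scripted if it needs to be repeated for more binaries. A minimal sketch follows, assuming the output text has already been captured; the function name parse_ldd_output is hypothetical, and it only parses text, it does not run ldd itself.

```python
# Markers that ldd prints for binaries without dynamic dependencies.
STATIC_MARKERS = ("not a dynamic executable", "statically linked")


def parse_ldd_output(output):
    """Return the shared objects named in `ldd` output.

    An empty list means the binary was reported as static.
    """
    if any(marker in output for marker in STATIC_MARKERS):
        return []
    libs = []
    for line in output.splitlines():
        line = line.strip()
        if not line:
            continue
        # Lines look like "name => /path (0xaddr)" or "name (0xaddr)".
        name = line.split("=>")[0].split("(")[0].strip()
        libs.append(name)
    return libs
```

Any non-empty result that includes libc.so.6 marks a binary worth retesting, or rebuilding, against the Bookworm libc.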

If I am not missing any package, I could simply reprepro copy all packages to bookworm-wikimedia and test on ml-staging2001, rebuilding the kubernetes-node or dragonfly packages if needed (for example because of a libc incompatibility).
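As a rough sketch of the copy step, the snippet below only builds the reprepro command lines so they can be reviewed before anyone runs them. It assumes the plain `reprepro copy <destination> <source> <package>` form; any component restriction or base directory options that the apt repository host needs are deliberately left out, and the helper name is hypothetical.

```python
import shlex


def reprepro_copy_commands(dest, source, packages):
    """Build one `reprepro copy` command line per package (nothing is executed)."""
    return [
        shlex.join(["reprepro", "copy", dest, source, pkg])
        for pkg in packages
    ]


# Example with a subset of the packages listed in this task.
commands = reprepro_copy_commands(
    "bookworm-wikimedia",
    "bullseye-wikimedia",
    ["kubernetes-node", "calico", "istio-cni"],
)
for command in commands:
    print(command)
```

Printing instead of executing keeps this a dry run; the real invocation should follow whatever procedure is documented for the apt repository.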

Docker would move to a different version of course:

20.10.5+dfsg1-1+deb11u2 to 20.10.24+dfsg1-1+b3

And runc as well:

1.0.0~rc93+ds1-5+deb11u3 to 1.1.5+ds1-1+deb12u1 (this version contains the fix that ML needs)

Puppet

I checked profile::kubernetes::node and related classes like k8s::kubelet and, at a quick glance, I don't see any Bullseye-specific bits, so in theory no changes are needed.

Kernel/OS/Misc

The kernel would go from 5.x to 6.x. I can't think of anything specific that would cause trouble, except of course that containers may behave differently on 6.x, but we can't do much about that in advance.

Anything else that I am missing? Does the plan look ok?

Event Timeline

For T362408: Migration to containerd and away from docker we're planning to backport containerd from bookworm to bullseye. Maybe it would be feasible to backport runc as well (although this won't help you with T363191: Test if we can avoid ROCm debian packages on k8s nodes ofc.)?

ML would be very happy to test the 6.x kernel since the GPU drivers are shipped directly with it, so we'd get a nice bump to those as well. I forgot about containerd, right. I'll wait for Alex's approval before doing anything.

@akosiaris o/ anything against the overall plan (namely copying the packages to bookworm-wikimedia) and/or concerns about containerd?

calbon updated Other Assignee, added: klausman.
calbon set the point value for this task to 3.
calbon moved this task from Unsorted to Ready To Go on the Machine-Learning-Team board.

Change #1034521 had a related patch set uploaded (by Elukey; author: Elukey):

[operations/puppet@production] aptrepo: add k8s components to Bookworm Wikimedia

https://gerrit.wikimedia.org/r/1034521

Change #1034521 merged by Elukey:

[operations/puppet@production] aptrepo: add k8s components to Bookworm Wikimedia

https://gerrit.wikimedia.org/r/1034521

Mentioned in SAL (#wikimedia-operations) [2024-05-22T14:28:32Z] <elukey> copy calico, istio-cni, kubernetes-node packages from bullseye-wikimedia to bookworm-wikimedia - T365253

Mentioned in SAL (#wikimedia-operations) [2024-05-22T15:44:36Z] <elukey> upload to bookworm-wikimedia dragonfly-{dfdaemon,dfget}, calicoctl, calico-cni - T365253

I had to copy more packages (I updated the task's description), but everything worked fine on ml-staging2001. The ML team is unblocked and can now test pods with GPUs (for LLMs etc.), and they are also unblocked to test a new staging node with newer GPUs that will arrive soon.

We can discuss this use case during the next SIG and decide what to do (rebuild or not etc..). Moreover, this use case is interesting in the context of the containerd migration, so better to discuss it before reaching production.

Dragonfly is an internally built golang package, so it would be better if we properly rebuilt it on bookworm with current Go; otherwise we'd be unable to rebuild the deb if we need to fix something urgently. Also, this allows us to re-enable the support for sysusers.d, which was once added but had to be disabled for older distros.

Definitely, I'll make sure to do it. I copied the packages yesterday to unblock the reimage and test, but it was just a stopgap measure. Thanks for the feedback :)

I checked the dragonfly repo and I have a question about building for bookworm (I didn't find it in https://wikitech.wikimedia.org/wiki/Dragonfly): since we are going to have two OS versions, how should we manage the master branch's debian changelog? Namely, should I create a new branch from master for bookworm, or do you prefer another road?

I'd keep it simple and just move master to bookworm. The legacy packages won't be updated any further, and the dragonfly* super nodes will also need to be moved off buster soon.