Per T269684 we need to move away from Docker. In February 2024, the serviceops team announced the results of its evaluation of the candidate replacement engines. The results and criteria are documented in Kubernetes/CRE. The chosen container runtime engine is containerd. This task describes the plan for the migration and tracks the migration process itself.
Plan
containerd upgrade
- Package and build containerd from bookworm for bullseye. The reason is that the version in bookworm supports various configuration directives that the upstream kubernetes docs reference. See https://kubernetes.io/docs/setup/production-environment/container-runtimes/#containerd. It also makes the migration to kubernetes 1.25 (T341984) a tad easier.
- Upload the package to apt.wikimedia.org and upgrade the staging nodes.
- Upgrade all clusters to the newer containerd.
- Create puppetization for the configuration required by kubernetes
We'll probably need a new profile, profile::containerd or similar.
Note: The upgrade process above doesn't require any kind of feature gating or flagging in puppet but rather just a deb-deploy manifest.
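As a rough illustration of the configuration puppet would need to manage, here is a minimal sketch of the systemd cgroup driver setting described in the kubernetes docs linked above. It assumes we want the systemd cgroup driver, which this task does not state. CONFIG_PATH is a hypothetical variable that defaults to a temporary file; on a real node the file would be /etc/containerd/config.toml and would be rendered by puppet, not by a script.

```shell
#!/bin/sh
# Sketch only: write the CRI plugin settings from the kubernetes
# container-runtimes docs to a scratch file for inspection.
# CONFIG_PATH is an assumption made for this example.
CONFIG_PATH="${CONFIG_PATH:-$(mktemp)}"

cat > "$CONFIG_PATH" <<'EOF'
version = 2

[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]
  runtime_type = "io.containerd.runc.v2"

[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
  # Assumption: the kubelet is configured with the systemd cgroup driver too
  SystemdCgroup = true
EOF

echo "wrote $CONFIG_PATH"
```

Whatever the final content is, the kubelet and containerd have to agree on the cgroup driver, so this setting should be decided together with the kubelet configuration.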
nerdctl
Docker has a relatively user-friendly CLI; containerd doesn't. The ctr tool it ships with is a lower-level, albeit useful, tool. nerdctl is a CLI released by the containerd project that is compatible with the docker CLI.
- Package nerdctl, probably utilizing our Upstream binaries policy to avoid the onus of having to build every single dependency.
- Use puppet to install the package and populate the nerdctl configuration file, /etc/nerdctl/nerdctl.toml, so that it defaults to the k8s.io namespace.
- Test and approve.
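The configuration file mentioned above can be very small. A minimal sketch, assuming the default namespace is the only setting we need; NERDCTL_TOML is a hypothetical variable that defaults to a temporary file so the example can be tried without touching /etc/nerdctl/nerdctl.toml:

```shell
#!/bin/sh
# Sketch only: the content puppet would place in /etc/nerdctl/nerdctl.toml.
# NERDCTL_TOML is an assumption made for this example.
NERDCTL_TOML="${NERDCTL_TOML:-$(mktemp)}"

cat > "$NERDCTL_TOML" <<'EOF'
# Default to the containerd namespace used by the kubelet
namespace = "k8s.io"
EOF

cat "$NERDCTL_TOML"
```

With this in place, nerdctl ps should list the containers started by the kubelet without having to pass --namespace k8s.io on every invocation.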
Kubelet (the above are prerequisites)
- Amend puppet to put the following 2 parameters behind a feature flag:
--container-runtime-endpoint=unix:///run/containerd/containerd.sock --container-runtime=remote
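The feature flag logic itself belongs in puppet; the shell sketch below only illustrates the intended behavior. The function name and the true/false convention are hypothetical. With the flag off, no runtime parameters are emitted and the kubelet keeps its current behavior.

```shell
#!/bin/sh
# Sketch only: emit the extra kubelet parameters when the (hypothetical)
# feature flag passed as $1 is "true", and nothing otherwise.
kubelet_runtime_args() {
  if [ "$1" = "true" ]; then
    printf '%s\n' "--container-runtime-endpoint=unix:///run/containerd/containerd.sock --container-runtime=remote"
  fi
}

# Example: a node that has already been flipped
kubelet_runtime_args true
```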
Perform the migration in lockstep
Roughly, for every batch of nodes:
- Drain the nodes using kubectl drain --ignore-daemonsets=true --delete-emptydir-data=true
- Flip the feature flag in puppet for this batch
- Run puppet
- Rinse, repeat
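The per-batch steps above could be scripted roughly as follows. This is a sketch under a few assumptions: the function names and the DRY_RUN variable are made up for this example, the puppet steps are left to the operator because they depend on our own tooling, and the uncordon step is not listed in the plan but is included because kubectl drain leaves nodes cordoned.

```shell
#!/bin/sh
# Sketch only: helpers for draining and uncordoning one batch of nodes.
# DRY_RUN=1 (the default here) prints the commands instead of running them.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Drain every node passed as an argument, using the flags from the plan
drain_batch() {
  for node in "$@"; do
    run kubectl drain "$node" --ignore-daemonsets=true --delete-emptydir-data=true
  done
}

# Assumption: once puppet has run, the nodes are made schedulable again
uncordon_batch() {
  for node in "$@"; do
    run kubectl uncordon "$node"
  done
}
```

Usage would be something like: drain_batch node1 node2, then flip the feature flag and run puppet on those nodes, then uncordon_batch node1 node2. Afterwards, the CONTAINER-RUNTIME column of kubectl get nodes -o wide should show containerd:// for the migrated nodes.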