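We're currently running k8s nodes with cgroup v2 disabled (systemd.unified_cgroup_hierarchy=0) because kubelet 1.16 only supports cgroup v1. We're also running Docker with the cgroupfs cgroup driver, which, per the Kubernetes documentation linked below, can make nodes unstable under resource pressure when systemd manages the rest of the system's cgroups.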
With the k8s 1.23 update we should:
- Enable cgroup v2: systemd.unified_cgroup_hierarchy=1
- Docker supports cgroup v2 as of Docker 20.10. Running Docker on cgroup v2 also requires:
  - containerd: v1.4 or later
  - runc: v1.0.0-rc91 or later
  - Kernel: v4.15 or later (v5.2 or later is recommended)
- Switch the Docker cgroup driver to systemd: native.cgroupdriver=systemd (https://gerrit.wikimedia.org/r/c/operations/puppet/+/524186)
- Ensure kubelet runs with cgroup driver systemd as well: --cgroup-driver=systemd
https://v1-23.docs.kubernetes.io/docs/setup/production-environment/container-runtimes/#cgroup-drivers
https://v1-23.docs.kubernetes.io/docs/tasks/administer-cluster/reserve-compute-resources/#configuring-a-cgroup-driver
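
The checklist above could be verified on a node with something like the following. This is only a rough sketch: the helper names are hypothetical, the minimum versions are copied from the list above, and it assumes the Docker driver is set through "exec-opts" in daemon.json (the puppet change may configure it differently). The functions take plain strings so the caller decides where the values come from (e.g. `uname -r`, /proc/cmdline, /etc/docker/daemon.json).

```python
import json
import re

# Minimum versions as quoted in the task description above.
MINIMUMS = {
    "containerd": "1.4",
    "runc": "1.0.0-rc91",
    "kernel": "4.15",
}

_VERSION_RE = re.compile(r"^v?(\d+)\.(\d+)(?:\.(\d+))?(?:-rc(\d+))?")


def parse_version(text):
    """Return (major, minor, patch, rc); a final release sorts after any rc."""
    match = _VERSION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognised version string: {text!r}")
    major, minor, patch, rc = match.groups()
    # Suffixes such as "-21-amd64" are ignored; only "-rcN" marks a pre-release.
    rc_rank = int(rc) if rc is not None else float("inf")
    return (int(major), int(minor), int(patch or 0), rc_rank)


def meets_minimum(component, installed):
    """True if the installed version is at least the listed minimum."""
    return parse_version(installed) >= parse_version(MINIMUMS[component])


def unified_hierarchy_enabled(cmdline):
    """True if the kernel command line explicitly sets the cgroup v2 flag.

    Assumption: only the explicit flag is checked; a systemd build that
    defaults to the unified hierarchy without the flag is not detected.
    """
    return "systemd.unified_cgroup_hierarchy=1" in cmdline.split()


def docker_uses_systemd_driver(daemon_json_text):
    """True if daemon.json sets native.cgroupdriver=systemd via exec-opts."""
    config = json.loads(daemon_json_text)
    return "native.cgroupdriver=systemd" in config.get("exec-opts", [])
```

For example, `meets_minimum("kernel", "5.10.0-21-amd64")` and `docker_uses_systemd_driver('{"exec-opts": ["native.cgroupdriver=systemd"]}')` should both hold on a node that is ready for the switch. The kubelet side (--cgroup-driver=systemd) is not covered here and would need a separate check.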
See also: T277876: Reserve resources for system daemons on kubernetes nodes