
[infra,k8s] remove deprecated kubelet flags before 1.28 upgrade (we might be able to remove all custom ones)
Closed, ResolvedPublic

Description

IIRC, some of these flags are about to be deprecated:

Jul 11 12:51:20 toolsbeta-test-k8s-worker-nfs-1 kubelet[3781084]: Flag --container-runtime has been deprecated, will be removed in 1.27 as the only valid value is 'remote'
Jul 11 12:51:20 toolsbeta-test-k8s-worker-nfs-1 kubelet[3781084]: Flag --pod-infra-container-image has been deprecated, will be removed in 1.27. Image garbage collector will get sandbox image information from CRI.
root     3781084  3.3  1.5 2086436 123288 ?      Ssl  Jul11 387:27 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/var/lib/kubelet/config.yaml --container-runtime=remote --container-runtime-endpoint=unix:///run/containerd/containerd.sock --pod-infra-container-image=registry.k8s.io/pause:3.6 --node-labels=toolforge.org/nfs-mounted=true,kubernetes.wmcloud.org/nfs-mounted=true

Event Timeline

dcaro triaged this task as High priority.Jul 18 2024, 4:57 PM
dcaro moved this task from Backlog to Ready to be worked on on the Toolforge board.
Slst2020 changed the task status from Open to In Progress.Jul 26 2024, 3:31 PM
Slst2020 claimed this task.
Slst2020 moved this task from Next Up to In Progress on the Toolforge (Toolforge iteration 13) board.

Both --container-runtime and --pod-infra-container-image are set in /var/lib/kubelet/kubeadm-flags.env. This file is not managed by puppet, but is used by kubeadm (init and join) to dynamically set kubelet flags without directly modifying the main kubelet systemd service file.

root@toolsbeta-test-k8s-worker-nfs-1:~# cat /var/lib/kubelet/kubeadm-flags.env
KUBELET_KUBEADM_ARGS="--container-runtime=remote --container-runtime-endpoint=unix:///run/containerd/containerd.sock --pod-infra-container-image=registry.k8s.io/pause:3.6"

Then, the kubelet systemd service file includes a line that sources this file:

root@toolsbeta-test-k8s-worker-nfs-1:~# systemctl cat kubelet
# /lib/systemd/system/kubelet.service
...
[Service]
Environment="KUBELET_KUBECONFIG_ARGS=--bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf"
Environment="KUBELET_CONFIG_ARGS=--config=/var/lib/kubelet/config.yaml"
# This is a file that "kubeadm init" and "kubeadm join" generates at runtime, populating the KUBELET_KUBEADM_ARGS variable dynamically
EnvironmentFile=-/var/lib/kubelet/kubeadm-flags.env
...

The three flags set in /var/lib/kubelet/kubeadm-flags.env are all deprecated (see the kubelet docs for details):

--container-runtime has only one possible value (remote) and can be removed
--pod-infra-container-image should be set in /etc/containerd/config.toml
--container-runtime-endpoint should be set via the config file specified by the kubelet's --config flag
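The config-file equivalents of the last two flags can be sketched as follows. This writes to temp files for illustration; on a real node the targets would be /var/lib/kubelet/config.yaml and /etc/containerd/config.toml, and the exact fragments should be treated as a sketch, not the final config:

```shell
#!/bin/sh
# Sketch: what the flag-to-config migration looks like on disk.
set -e
tmpdir=$(mktemp -d)

# --container-runtime-endpoint moves into the kubelet config file
# (the field requires kubelet >= 1.27; older kubelets reject it):
cat <<'EOF' >"$tmpdir/config.yaml"
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
containerRuntimeEndpoint: unix:///run/containerd/containerd.sock
EOF

# --pod-infra-container-image moves into the containerd CRI config:
cat <<'EOF' >"$tmpdir/config.toml"
[plugins."io.containerd.grpc.v1.cri"]
  sandbox_image = "registry.k8s.io/pause:3.6"
EOF

grep -q 'containerRuntimeEndpoint' "$tmpdir/config.yaml" && echo "kubelet config ok"
grep -q 'sandbox_image' "$tmpdir/config.toml" && echo "containerd config ok"
```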

With that, /var/lib/kubelet/kubeadm-flags.env could (probably) be removed altogether.

kubeadm explicitly does not support automated ways of reconfiguring components that were deployed on managed nodes (see https://kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-reconfigure/). https://kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-reconfigure/#applying-kubelet-configuration-changes describes how to apply kubelet config changes.

tl;dr

  1. Edit the kubelet-config ConfigMap
  2. Log in to a kubeadm node
  3. Run kubeadm upgrade node phase kubelet-config to download the latest kubelet-config ConfigMap contents into the local file /var/lib/kubelet/config.yaml
  4. Edit the file /var/lib/kubelet/kubeadm-flags.env to apply additional configuration with flags
  5. Restart the kubelet service with systemctl restart kubelet

This needs to be performed on each node.
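The per-node part of that procedure (steps 2-5) boils down to two commands; a dry-run sketch (the DRY_RUN guard just echoes instead of executing, since the real commands need a kubeadm node; unset it to actually run):

```shell
#!/bin/sh
# Sketch of the per-node kubelet reconfiguration (steps 2-5 above).
DRY_RUN=1

run() {
  if [ -n "$DRY_RUN" ]; then
    echo "would run: $*"
  else
    "$@"
  fi
}

# Pull the latest kubelet-config ConfigMap into /var/lib/kubelet/config.yaml:
run kubeadm upgrade node phase kubelet-config
# (edit /var/lib/kubelet/kubeadm-flags.env by hand here if extra flags are still needed)
run systemctl restart kubelet
```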

To recap, what needs to happen for this change is:

  1. Edit the kubelet-config ConfigMap to include containerRuntimeEndpoint and roll out this change to all nodes as described in the comment above. This will update /var/lib/kubelet/config.yaml on each node. Strangely, though, this file is missing from some nodes: e.g. toolsbeta-test-k8s-control-8 has it, but toolsbeta-test-k8s-control-7 doesn't.
  2. Remove /var/lib/kubelet/kubeadm-flags.env from all the nodes

/etc/containerd/config.toml already has the pause image configured, although we might want to eventually update it to a newer version:

root@toolsbeta-test-k8s-control-8:~# cat /etc/containerd/config.toml | grep sandbox
    sandbox_image = "docker-registry.tools.wmflabs.org/pause:3.1"
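If/when we do bump the pause image, the change is a one-line edit to that file followed by a containerd restart. A sketch against a temp copy (the target tag below is just an example, not a decided version):

```shell
#!/bin/sh
# Sketch: bump sandbox_image in a copy of /etc/containerd/config.toml.
set -e
f=$(mktemp)
echo '    sandbox_image = "docker-registry.tools.wmflabs.org/pause:3.1"' >"$f"

# Replace whatever image is configured with a newer one (example value):
sed -i 's|sandbox_image = ".*"|sandbox_image = "registry.k8s.io/pause:3.9"|' "$f"
grep sandbox_image "$f"
# on a real node, follow with: systemctl restart containerd
```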
Slst2020 renamed this task from [infra,k8s] review kubelet flags before 1.26 upgrade to [infra,k8s] remove deprecated kubelet flags before 1.27 upgrade.Jul 31 2024, 6:52 AM

@dcaro not sure where to go from here. It seems we currently don't have an automated way to roll out cluster-wide config changes.


There are two options:

  • Using puppet -> the current cluster-wide config management; gives continuous compliance and a git-versioned history
  • Using cookbooks/cumin -> suited to one-off changes; only fixes things when you run it manually, and leaves only a SAL log trace

I propose:

  • Edit the kubelet-config ConfigMap manually
  • Run kubeadm upgrade node phase kubelet-config with cumin (start with one manually, then run on all with cumin)
  • Remove the kubeadm-flags.env file with puppet, adding a note there on why we are removing it (probably linking to this task)
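The cumin step of that proposal could look roughly like this; printing only, and the host queries are illustrative stand-ins, not the exact aliases used in this project:

```shell
#!/bin/sh
# Illustrative rollout plan: one node first, then the whole fleet.
cmd='kubeadm upgrade node phase kubelet-config && systemctl restart kubelet'
for target in \
    'toolsbeta-test-k8s-worker-10.toolsbeta.eqiad1.wikimedia.cloud' \
    'O{project:toolsbeta name:toolsbeta-test-k8s-.*}'; do
  echo "sudo cumin '$target' '$cmd'"
done
```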

Trying that on toolsbeta first to make sure everything works.

I tried this manually on my kubeadm test cluster (1.28) and it worked fine. However, in Toolsbeta we are getting this error:

root@toolsbeta-test-k8s-worker-10:~# kubeadm upgrade node phase kubelet-config
[upgrade] Reading configuration from the cluster...
[upgrade] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
W0731 08:20:51.677885 1090906 configset.go:177] error unmarshaling configuration schema.GroupVersionKind{Group:"kubelet.config.k8s.io", Version:"v1beta1", Kind:"KubeletConfiguration"}: strict decoding error: unknown field "containerRuntimeEndpoint"
...

Both my test cluster and toolsbeta use kubelet.config.k8s.io/v1beta1, where containerRuntimeEndpoint is defined. However, I found this in the 1.27 release notes:

kubelet: migrated --container-runtime-endpoint and --image-service-endpoint to kubelet config (#112136, @pacoxu)

So maybe this isn't possible before 1.27, in which case we need to keep kubeadm-flags.env, but delete the two other deprecated flags from it.

Another option would be to move --container-runtime-endpoint to /etc/default/kubelet for now, which would let us delete the kubeadm-flags.env file.
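That interim option would rely on the KUBELET_EXTRA_ARGS variable, which the stock kubeadm systemd drop-in sources from /etc/default/kubelet on Debian. A sketch against a temp file standing in for the real path:

```shell
#!/bin/sh
# Sketch: stage --container-runtime-endpoint in /etc/default/kubelet
# via KUBELET_EXTRA_ARGS (writing a temp file instead of the real path).
set -e
f=$(mktemp)  # stands in for /etc/default/kubelet
cat <<'EOF' >"$f"
# Interim home for this flag until the cluster runs kubelet >= 1.27,
# at which point it can move into /var/lib/kubelet/config.yaml.
KUBELET_EXTRA_ARGS="--container-runtime-endpoint=unix:///run/containerd/containerd.sock"
EOF
grep -q 'container-runtime-endpoint' "$f" && echo "flag staged"
# on a real node: systemctl restart kubelet
```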

Slst2020 changed the task status from In Progress to Open.Aug 28 2024, 8:47 AM
Slst2020 moved this task from Toolforge iteration 14 to Ready to be worked on on the Toolforge board.
Slst2020 edited projects, added Toolforge; removed Toolforge (Toolforge iteration 14).

Summary of today's deep dive:

dcaro renamed this task from [infra,k8s] remove deprecated kubelet flags before 1.27 upgrade to [infra,k8s] remove deprecated kubelet flags before 1.28 upgrade (we might be able to remove all custom ones).Sep 30 2024, 1:52 PM

Mentioned in SAL (#wikimedia-cloud-feed) [2025-01-21T14:50:09Z] <raymond-ndibe@cloudcumin1001> START - Cookbook wmcs.toolforge.add_k8s_node for a worker-nfs role in the toolsbeta cluster (T370245)

Mentioned in SAL (#wikimedia-cloud-feed) [2025-01-21T14:50:34Z] <raymond-ndibe@cloudcumin1001> END (FAIL) - Cookbook wmcs.toolforge.add_k8s_node (exit_code=99) for a worker-nfs role in the toolsbeta cluster (T370245)

Mentioned in SAL (#wikimedia-cloud-feed) [2025-01-21T14:51:13Z] <raymond-ndibe@cloudcumin1001> START - Cookbook wmcs.toolforge.add_k8s_node for a worker role in the toolsbeta cluster (T370245)

Mentioned in SAL (#wikimedia-cloud-feed) [2025-01-21T14:51:38Z] <raymond-ndibe@cloudcumin1001> END (FAIL) - Cookbook wmcs.toolforge.add_k8s_node (exit_code=99) for a worker role in the toolsbeta cluster (T370245)

Mentioned in SAL (#wikimedia-cloud-feed) [2025-01-21T14:56:25Z] <raymond-ndibe@cloudcumin1001> START - Cookbook wmcs.toolforge.add_k8s_node for a worker role in the toolsbeta cluster (T370245)

Mentioned in SAL (#wikimedia-cloud-feed) [2025-01-21T14:56:51Z] <raymond-ndibe@cloudcumin1001> END (FAIL) - Cookbook wmcs.toolforge.add_k8s_node (exit_code=99) for a worker role in the toolsbeta cluster (T370245)

Mentioned in SAL (#wikimedia-cloud-feed) [2025-01-21T14:58:05Z] <raymond-ndibe@cloudcumin1001> START - Cookbook wmcs.toolforge.add_k8s_node for a worker role in the toolsbeta cluster (T370245)

Mentioned in SAL (#wikimedia-cloud-feed) [2025-01-21T15:05:39Z] <raymond-ndibe@cloudcumin1001> START - Cookbook wmcs.toolforge.add_k8s_node for a worker role in the toolsbeta cluster (T370245)

Mentioned in SAL (#wikimedia-cloud-feed) [2025-01-21T15:06:01Z] <raymond-ndibe@cloudcumin1001> END (FAIL) - Cookbook wmcs.toolforge.add_k8s_node (exit_code=99) for a worker role in the toolsbeta cluster (T370245)

Mentioned in SAL (#wikimedia-cloud-feed) [2025-01-21T16:11:05Z] <raymond-ndibe@cloudcumin1001> START - Cookbook wmcs.toolforge.add_k8s_node for a worker role in the toolsbeta cluster (T370245)

Mentioned in SAL (#wikimedia-cloud-feed) [2025-01-21T16:21:01Z] <raymond-ndibe@cloudcumin1001> START - Cookbook wmcs.toolforge.add_k8s_node for a worker role in the toolsbeta cluster (T370245)

Change #1113194 had a related patch set uploaded (by Raymond Ndibe; author: Raymond Ndibe):

[operations/puppet@production] [wmcs::kubeadm::core] remove kubeadm-flags.env

https://gerrit.wikimedia.org/r/1113194

Mentioned in SAL (#wikimedia-cloud-feed) [2025-01-23T20:10:43Z] <raymond-ndibe@cloudcumin1001> START - Cookbook wmcs.toolforge.k8s.reboot for all NFS workers (T370245)

Mentioned in SAL (#wikimedia-cloud-feed) [2025-01-23T20:43:23Z] <raymond-ndibe@cloudcumin1001> END (PASS) - Cookbook wmcs.toolforge.k8s.reboot (exit_code=0) for all nodes (T370245)

Raymond_Ndibe changed the task status from Open to In Progress.Jan 24 2025, 1:37 PM

Change #1113194 merged by FNegri:

[operations/puppet@production] [wmcs::kubeadm::core] remove kubeadm-flags.env

https://gerrit.wikimedia.org/r/1113194

I merged the patch above, and Puppet removed the file /var/lib/kubelet/kubeadm-flags.env across the cluster:

fnegri@tools-k8s-worker-nfs-43:~$ sudo journalctl -u puppet-agent-timer -g flags.env
Feb 19 15:20:22 tools-k8s-worker-nfs-43 puppet-agent[2393188]: (/Stage[main]/Profile::Wmcs::Kubeadm::Core/File[/var/lib/kubelet/kubeadm-flags.env]/ensure) removed

On a worker that was restarted after the change (tools-k8s-worker-nfs-55), I can see that kubelet is now running without the deprecated options that were coming from that file:

fnegri@tools-k8s-worker-nfs-55:~$ ps ax  |grep /usr/bin/kubelet
    864 ?        Ssl   95:14 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/var/lib/kubelet/config.yaml --node-labels=toolforge.org/nfs-mounted=true,kubernetes.wmcloud.org/nfs-mounted=true

For comparison, on a different worker that hasn't been restarted yet (tools-k8s-worker-nfs-43) the deprecated options are still present:

fnegri@tools-k8s-worker-nfs-43:~$  ps ax  |grep /usr/bin/kubelet
    859 ?        Ssl  2620:32 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/var/lib/kubelet/config.yaml --container-runtime-endpoint=unix:///run/containerd/containerd.sock --pod-infra-container-image=registry.k8s.io/pause:3.6 --node-labels=toolforge.org/nfs-mounted=true,kubernetes.wmcloud.org/nfs-mounted=true

We're going to restart all kubelets soon as part of T362868: [infra,k8s] Upgrade Toolforge Kubernetes to version 1.29.
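A quick way to audit which nodes still carry the old flags is to grep the running kubelet command line. A sketch operating on a captured ps line (on real nodes you'd loop over hosts with cumin/ssh and feed in actual `ps ax` output):

```shell
#!/bin/sh
# Sketch: detect a kubelet still running with deprecated flags.
line='/usr/bin/kubelet --kubeconfig=/etc/kubernetes/kubelet.conf --pod-infra-container-image=registry.k8s.io/pause:3.6'
case "$line" in
  *--pod-infra-container-image*|*--container-runtime=*)
    echo "needs restart" ;;
  *)
    echo "clean" ;;
esac
```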

There is an issue here. I just checked, and containerRuntimeEndpoint: unix:///run/containerd/containerd.sock was not added to the kubelet-config ConfigMap for tools before this was merged; we only did that for toolsbeta (I compared the ConfigMaps on tools and toolsbeta). This would have broken things if any nodes had been restarted afterwards; hopefully none were. Let me add it before someone gets the idea to restart the nodes.


Done

Thanks @Raymond_Ndibe for checking and fixing this!

A couple of nodes were rebooted after this was merged (tools-k8s-worker-nfs-55 and tools-k8s-worker-nfs-37), and they seemed to work fine even without that setting; most likely because recent kubelet versions default --container-runtime-endpoint to unix:///run/containerd/containerd.sock on Linux anyway.