
toolforge k8s: some static pods need manual restart
Closed, Resolved (Public)

Description

A common scenario is a newly created Kubernetes control node: a small race condition in the initial startup of the static pods (kube-apiserver, kube-controller-manager, kube-scheduler) means we need to manually restart the last two.

Another scenario is a kubeadm cert renewal: to make sure some of the static pods really pick up the newer certs, we need to manually restart them.

Restarting a static pod is a matter of:

  • move the corresponding /etc/kubernetes/manifests/<file> away. Renaming it with a leading . (dot) is enough.
  • wait for kubelet to detect the missing file and kill the pod
  • put the file back in place
  • wait for kubelet to detect the file and create the static pod again
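
The steps above could be sketched roughly as follows. This is only an illustration, not the code from the wmcs-cookbooks patches: the function name restart_static_pod, the settle callback, and the 30-second pause are hypothetical, and it assumes kubeadm's default manifest directory and a manifest named <pod>.yaml. A fixed pause is a crude stand-in for actually checking that kubelet removed and recreated the pod.

```python
import os
import time
from pathlib import Path
from typing import Callable

# Assumption: kubeadm's default static pod manifest directory.
MANIFEST_DIR = Path("/etc/kubernetes/manifests")


def restart_static_pod(
    name: str,
    manifest_dir: Path = MANIFEST_DIR,
    # Assumption: a fixed 30 s pause; a real tool would poll the pod state.
    settle: Callable[[], None] = lambda: time.sleep(30),
) -> None:
    """Restart a static pod by hiding its manifest and then restoring it."""
    manifest = manifest_dir / f"{name}.yaml"
    hidden = manifest_dir / f".{name}.yaml"  # leading dot hides it from kubelet
    if not manifest.is_file():
        raise FileNotFoundError(manifest)

    # Step 1: move the manifest away (rename with a leading dot).
    os.rename(manifest, hidden)
    try:
        # Step 2: give kubelet time to notice the missing file and kill the pod.
        settle()
    finally:
        # Step 3: put the file back, even if the wait was interrupted.
        os.rename(hidden, manifest)
    # Step 4: give kubelet time to notice the file and recreate the pod.
    settle()
```

For the new control node scenario, this would be called once for kube-controller-manager and once for kube-scheduler, as root on the affected node.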

Event Timeline

aborrero triaged this task as Medium priority. Feb 26 2024, 11:47 AM

Change 1006529 had a related patch set uploaded (by Arturo Borrero Gonzalez; author: Arturo Borrero Gonzalez):

[cloud/wmcs-cookbooks@main] kubernetes: refactor static pod restart logic

https://gerrit.wikimedia.org/r/1006529

Change 1007604 had a related patch set uploaded (by Arturo Borrero Gonzalez; author: Arturo Borrero Gonzalez):

[cloud/wmcs-cookbooks@main] toolforge: add restart-static-pods cookbook

https://gerrit.wikimedia.org/r/1007604

Change 1006529 merged by jenkins-bot:

[cloud/wmcs-cookbooks@main] kubernetes: refactor static pod restart logic

https://gerrit.wikimedia.org/r/1006529

Change 1007604 merged by jenkins-bot:

[cloud/wmcs-cookbooks@main] toolforge: add restart-static-pods cookbook

https://gerrit.wikimedia.org/r/1007604

aborrero claimed this task.