Page MenuHomePhabricator

Reimaging a kubernetes control-plane invalidates service-account tokens issued by it
Closed, ResolvedPublic

Description

T379968 revealed a big flaw in the system we use to distribute the public keys used to validate service-account tokens of "other" control planes (T329826: Kubernetes v1.23 use PKI for service-account signing (instead of cergen)).

The way this currently works is:

  • A service-account (sa) cert and key is issued via PKI (puppet) on a control-plane (wikikube-ctrl1001.eqiad.wmnet)
  • Puppet triggers the kube-publish-sa-cert systemd service which stores the public key of the sa certificate in etcd (as /kube-apiserver-sa-certs/wikikube-ctrl1001.eqiad.wmnet)
  • All control-planes watch for changes at /kube-apiserver-sa-certs/ and, on change, dump all public keys which don't match their FQDN into /etc/kubernetes/pki/kube-apiserver-sa-certs.pem, reloading the kub-apiserver afterwards via systemctl restart kube-apiserver-safe-restart.service

Reimaging a control-plane leads to it's service-account cert and key being created from scratch (new filesystem, new key) and that in turn overrides the sa public key stored in etcd with the new one, practically rendering all service-account tokens that have been issued with the old key (pre reimage) unusable as they can no longer be validated.

A (not very glamorous) idea to circumvent this would be to no longer use the FQDN as key in etcd but the fingerprint. That would ensure the old public key does stick around (as it will not be overridden). I don't think filtering out the "own" public key is strictly required.
In addition we would have to make sure that old certificates are cleaned up from etcd. This could be done by extending kube-publish-sa-cert to iterate over all certificates and remove those that are expired:

export ETCDCTL_API=3 
etcdctl --endpoints "https://$(hostname -f):2379" get --prefix /kube-apiserver-sa-certs/ --keys-only | sed '/^$/d' | \
  while read key; do
    etcdctl --endpoints "https://$(hostname -f):2379" get "$key" | \
      openssl x509 -in /dev/stdin -checkend 1 -noout >/dev/null || etcdctl --endpoints "https://$(hostname -f):2379" del "$key";
  done

Event Timeline

jijiki triaged this task as High priority.Nov 18 2024, 10:30 AM
jijiki moved this task from Incoming 🐫 to this.quarter 🍕 on the serviceops board.

+1 to fingerprint as the key of keys

+1 to blocking k8s control plane reimages until we re-key by fingerprint

Change #1092803 had a related patch set uploaded (by JMeybohm; author: JMeybohm):

[operations/puppet@production] kubernetes::master: Don't override sa certificates on reimage

https://gerrit.wikimedia.org/r/1092803

In T380142, @JMeybohm wrote:

I don't think filtering out the "own" public key is strictly required.

kube-apiserver does not complain about the own public key or duplicate keys listed in the service-account-key-file, so I think we're good to go.

Mentioned in SAL (#wikimedia-operations) [2024-11-20T08:51:18Z] <jayme> disabling puppet on all k8s controll planes for rollout of T380142

Change #1092803 merged by JMeybohm:

[operations/puppet@production] kubernetes::master: Don't override sa certificates on reimage

https://gerrit.wikimedia.org/r/1092803

Change #1093280 had a related patch set uploaded (by JMeybohm; author: JMeybohm):

[operations/puppet@production] Fix permissions and notify of kube-publish-sa-cert

https://gerrit.wikimedia.org/r/1093280

Change #1093280 merged by JMeybohm:

[operations/puppet@production] Fix permissions and notify of kube-publish-sa-cert

https://gerrit.wikimedia.org/r/1093280

Mentioned in SAL (#wikimedia-operations) [2024-11-20T10:33:21Z] <jayme> re-enabled puppet on all k8s controll planes for rollout of T380142

JMeybohm claimed this task.

The change has been rolled out to all control-planes. I'll let the old keys (by FQDN) stay in etcd until they get cleaned up automatically to avoid unnecessary apiserver restarts.