During the Patch Demo incident on 2026-01-27 we had a hard time restarting k3s for two reasons:
- /var/lib/rancher/k3s/server/cred/passwd newer than datastore and could cause a cluster outage. Remove the file(s) from disk and restart to be recreated from datastore.
  - Workaround: rm /var/lib/rancher/k3s/server/cred/passwd
- http: TLS handshake error from 127.0.0.1:56198: remote error: tls: bad certificate
  - Caused by expired certs in /var/lib/rancher/k3s/server/tls
  - Attempted workaround: rm -rf /var/lib/rancher/k3s/server/tls
  - This didn't work, but maybe ... could: https://docs.k3s.io/cli/certificate#rotating-client-and-server-certificates
We should make it so that restarting k3s doesn't require these workarounds.
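
One possible starting point is a pre-restart check that flags certificates close to expiry, so the TLS problem is caught before a restart rather than during one. The following is a minimal sketch, not something we ran during the incident. The function name list_expiring_certs and the 30-day default are made up for illustration, and it assumes openssl is installed on the node. It only looks at top-level .crt files in the given directory, not subdirectories.

```shell
#!/bin/sh
# Hypothetical helper (not part of k3s): print every *.crt file in DIR whose
# certificate expires within DAYS days (default 30).
# Usage: list_expiring_certs DIR [DAYS]
list_expiring_certs() {
    dir=$1
    days=${2:-30}
    seconds=$((days * 86400))

    for crt in "$dir"/*.crt; do
        # Skip the unexpanded glob when the directory has no .crt files.
        [ -f "$crt" ] || continue

        # `openssl x509 -checkend N` exits non-zero if the certificate will
        # have expired N seconds from now. A file openssl cannot parse also
        # exits non-zero, so it gets listed too. For files holding several
        # certificates, only the first one is checked.
        if ! openssl x509 -in "$crt" -noout -checkend "$seconds" >/dev/null 2>&1; then
            printf '%s\n' "$crt"
        fi
    done
}
```

For example, list_expiring_certs /var/lib/rancher/k3s/server/tls 30 would print the certs in that directory that expire within the next 30 days. If any are listed, the page linked above describes stopping k3s, running k3s certificate rotate, and starting k3s again. We have not yet verified that this fixes the failure seen in this incident.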