Similar to T316866: Decision Request - Openstack Upgrade Cadence, it would be useful to have a dependable schedule for planning upgrades and keeping the Toolforge Kubernetes cluster up to date. This is not a decision request, as I am not personally proposing a cadence yet.
Some notes to consider for this discussion:
- Kubernetes currently releases on a 4-month cycle (https://kubernetes.io/releases/release/). This means 3 releases a year, generally around May, September and December. Release branches for the most recent three minor releases are maintained, providing ~12-14 months of patch support.
- The latest version of k8s is 1.26, released on 2022-12-09
- Toolforge k8s is running on 1.21 (EOL on 2022-06-28) and is almost ready for 1.22 (EOL on 2022-12-08) (thanks @taavi !)
- We cannot skip versions when upgrading. Per @taavi, Kubernetes upstream does not support skipping versions when upgrading an existing cluster, and we cannot currently redeploy a cluster, so we must upgrade in place.
- According to the CNCF user group, most major hosting services upgrade twice a year and skip a version each time. They are trying to be a bit slower. Most are running 1.22/1.23 at this time. The production k8s cluster is on v1.16.15, with an upgrade to 1.23 ongoing.
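To make the "no skipping" constraint concrete, here is a minimal sketch that lists the minor versions an in-place upgrade would have to step through. The helper name `upgrade_path` is hypothetical and only for illustration; the version numbers come from the notes above.

```python
def upgrade_path(current: str, target: str) -> list[str]:
    """Return every minor version to step through, assuming no version can be skipped."""
    cur_major, cur_minor = (int(part) for part in current.split("."))
    tgt_major, tgt_minor = (int(part) for part in target.split("."))
    if cur_major != tgt_major:
        # Assumption: this sketch only covers upgrades within one major version.
        raise ValueError("only upgrades within the same major version are handled")
    if tgt_minor < cur_minor:
        raise ValueError("target version is older than the current version")
    return [f"{cur_major}.{minor}" for minor in range(cur_minor + 1, tgt_minor + 1)]


# Toolforge is on 1.21 and the latest upstream release is 1.26.
path = upgrade_path("1.21", "1.26")
print(path)       # ['1.22', '1.23', '1.24', '1.25', '1.26']
print(len(path))  # 5
```

Under these assumptions, reaching the latest release from 1.21 takes five separate in-place upgrades.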
Given these notes, and assuming we want to run a supported version of Kubernetes, we would need to upgrade at least once per year (that is, more often than the ~12-14 month support window expires). However, if we cannot or won't skip versions, it's likely we'll need to upgrade on an even faster cadence to stay on a supported release.
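As a rough illustration of the cadence arithmetic, the sketch below computes how often we would need to upgrade just to keep pace with upstream. It is a simplified steady-state model, not a proposal: it assumes exactly 3 upstream releases per year (from the ~4 month cycle noted above) and ignores the time needed to catch up from the current version. The function names are hypothetical.

```python
RELEASES_PER_YEAR = 3  # assumption: taken from the ~4 month upstream release cycle


def upgrades_per_year(versions_per_upgrade: int,
                      releases_per_year: int = RELEASES_PER_YEAR) -> float:
    """Upgrades per year needed so the cluster does not fall further behind upstream."""
    if versions_per_upgrade < 1:
        raise ValueError("each upgrade must advance at least one version")
    return releases_per_year / versions_per_upgrade


def months_between_upgrades(versions_per_upgrade: int,
                            releases_per_year: int = RELEASES_PER_YEAR) -> float:
    """Average gap between upgrades, in months, at the pace computed above."""
    return 12 / upgrades_per_year(versions_per_upgrade, releases_per_year)


# One version per upgrade (no skipping) versus two versions per upgrade.
print(upgrades_per_year(1), months_between_upgrades(1))  # 3.0 4.0
print(upgrades_per_year(2), months_between_upgrades(2))  # 1.5 8.0
```

In this model, not skipping versions means keeping pace requires an upgrade roughly every 4 months, while advancing two versions at a time would stretch that to roughly every 8 months.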
Lastly, note that upgrading has historically been an issue for k8s cluster operators. See https://github.com/kubernetes/enhancements/blob/master/keps/sig-release/1498-kubernetes-yearly-support-period/README.md: "The survey conducted in early 2019 by the WG LTS showed that a significant subset of Kubernetes end-users fail to upgrade within the 9-month support period...This, and other responses from the survey, suggest that this 30% of users would better be able to keep their deployments on supported versions if the patch support period were extended to 12-14 months. This appears to be true regardless of whether the users are on DIY build or commercially vendored distributions. An extension would thus lead to more than 80% of users being on supported versions, instead of the 50-60% we have now."