Umbrella task to track the work required towards upgrading our Kubernetes clusters to Kubernetes >1.25 (1.24 is EOL on 2023-07-28).
We're currently running 1.23, which went EOL on 2023-02-08, and there are some bigger requirements to be dealt with before moving to a newer version:
- We need to migrate away from docker as container runtime: T269684
- We need to migrate away from PodSecurityPolicies: T273507
Together with the Kubernetes update, we need to update the following other components:
- Calico
- Istio
- cert-manager
- kserve
- knative-serving
- coredns
- helm
Preparation for the Kubernetes update
- Ensure all our charts are compatible with the new Kubernetes version (currently validating against 1.27)
- Read the Kubernetes changelogs at https://relnotes.k8s.io (yellow/red flags are linked below each version; tick the box once all action-required items have been addressed, use ✅ for single items)
- v1.24
- Action Required
- Note
- v1.25
- Action Required
- Note
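As a rough aid for the chart compatibility check above, here is a minimal sketch that scans rendered manifests (for example the output of `helm template`) for API versions removed in Kubernetes 1.25. The function name `find_removed_apis` and the lookup table are illustrative and not part of our tooling; the table only covers the 1.25 removals listed in the upstream deprecated API migration guide and would need extending for later versions such as 1.27.

```python
import re

# (apiVersion, kind) pairs that Kubernetes 1.25 no longer serves, mapped to
# their replacement apiVersion. None means there is no replacement API.
REMOVED_IN_1_25 = {
    ("batch/v1beta1", "CronJob"): "batch/v1",
    ("discovery.k8s.io/v1beta1", "EndpointSlice"): "discovery.k8s.io/v1",
    ("events.k8s.io/v1beta1", "Event"): "events.k8s.io/v1",
    ("autoscaling/v2beta1", "HorizontalPodAutoscaler"): "autoscaling/v2",
    ("policy/v1beta1", "PodDisruptionBudget"): "policy/v1",
    ("policy/v1beta1", "PodSecurityPolicy"): None,
    ("node.k8s.io/v1beta1", "RuntimeClass"): "node.k8s.io/v1",
}

# Only match top-level (unindented) apiVersion/kind fields.
_FIELD = re.compile(r"^(apiVersion|kind):\s*['\"]?([^'\"\s#]+)", re.MULTILINE)
_DOC_SEPARATOR = re.compile(r"^---[ \t]*$", re.MULTILINE)


def find_removed_apis(rendered):
    """Return (apiVersion, kind, replacement) for each affected YAML document.

    Hypothetical helper: `rendered` is a multi-document YAML string. It is
    inspected with regular expressions rather than a YAML parser, so unusual
    formatting (e.g. flow-style mappings) is not handled.
    """
    findings = []
    for doc in _DOC_SEPARATOR.split(rendered):
        fields = {}
        for name, value in _FIELD.findall(doc):
            fields.setdefault(name, value)  # keep the first occurrence
        key = (fields.get("apiVersion"), fields.get("kind"))
        if key in REMOVED_IN_1_25:
            findings.append((key[0], key[1], REMOVED_IN_1_25[key]))
    return findings
```

This does not replace validating the charts against the target version's API schema; it only flags the most common blockers early, such as the PodSecurityPolicy usage tracked in T273507.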
Upgrade process
- Package Kubernetes
- Package Calico / update helm chart
- Re-initialize wikikube-staging-codfw
- Re-initialize wikikube-staging-eqiad
- Update grafana dashboards and alerts (to find dashboards using a specific metric, see https://wikitech.wikimedia.org/wiki/Grafana#Search/audit_metrics_usage_across_dashboards)
- Re-initialize wikikube-codfw
- Re-initialize wikikube-eqiad