Upgrade ML clusters to Kubernetes 1.31
Open, Needs Triage, Public

Description

The clusters are currently running Kubernetes 1.23, which went EOL on 2023-02-08. Kubernetes 1.31 has been running in production (wikikube) since July 2025.
It's time to upgrade, since we already plan to move to the next (>1.31) Kubernetes version and we aim to support no more than two versions at the same time.

https://wikitech.wikimedia.org/wiki/Kubernetes/Clusters/Upgrade/1.31

Event Timeline

@DPogorzelski-WMF https://os-reports.wikimedia.org/bullseye.html reports that the etcd staging and prod VMs for ML k8s are running Bullseye; we'd need to reimage them to at least Bookworm (ideally Trixie) before the end of Q4. It is probably a good thing to couple with the K8s upgrade: it will take a little longer, but it's way less risky. Let me know!