The PAWS Kubernetes cluster needs to be upgraded because it is running a version well outside the supported releases.
This task is being co-opted to cover a full redesign of the PAWS Kubernetes layer so that it is built similarly to the Toolforge Kubernetes cluster.
Following that model, the resulting cluster will be:
- Highly available
- Running on Debian Buster
- Much more secure
- Hopefully using a normal k8s ingress, like Toolforge does
- Understood and supportable by WMCS
- Largely puppetized (T188912)
- Controlled by kubeadm
Scope notes
- This will likely include moving PAWS to its own project. However, that will require paws-public to be built out as a separate webservice in k8s rather than a tool, which should be fairly simple.
- A modified mechanism for user management is needed to distribute k8s credentials to admins and renew their certs. Something like a new argument to maintain-kubeusers that restricts it to admin users would do it.
- Either way, this will be tied to the Toolforge k8s upgrade cycle because of package management and WMCS support capabilities.
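
As a rough illustration of the admin-credential note above, the renewal side could be reduced to a check like the one below. This is only a sketch under assumptions: the 30-day window, the function names, and the shape of the inputs are hypothetical and are not taken from maintain-kubeusers.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical renewal window; the task does not specify one.
RENEWAL_WINDOW = timedelta(days=30)


def needs_renewal(not_after, now, window=RENEWAL_WINDOW):
    """Return True when the cert expires within `window` of `now`, or already has."""
    return not_after - now <= window


def admins_to_renew(cert_expiry, admins, now, window=RENEWAL_WINDOW):
    """Pick admin users whose cert is missing or close to expiry.

    cert_expiry: dict mapping username -> cert notAfter (timezone-aware datetime)
    admins: iterable of usernames that should hold admin credentials
    """
    due = []
    for user in sorted(admins):
        expiry = cert_expiry.get(user)
        # A missing entry means the admin has never been issued a cert.
        if expiry is None or needs_renewal(expiry, now, window):
            due.append(user)
    return due


if __name__ == "__main__":
    now = datetime(2020, 6, 1, tzinfo=timezone.utc)
    expiry = {
        "alice": datetime(2020, 6, 10, tzinfo=timezone.utc),  # inside the window
        "bob": datetime(2021, 1, 1, tzinfo=timezone.utc),     # still valid
    }
    print(admins_to_renew(expiry, {"alice", "bob", "carol"}, now))
```

Treating a missing cert the same as an expiring one means a newly added admin and a stale admin go through the same issuance path, which keeps an admins-only mode small.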
Steps
- Refactor kubeadm base modules from toolforge::kubeadm to just kubeadm T251297
- T246122: Upgrade the Toolforge Kubernetes cluster to v1.16. Otherwise, this will need rework right after.
- Sort out naming issues around wmflabs.org, wmcloud.org etc. T251295
- Build a cluster in the paws project
- Add the puppet modules for the new servers (T188912)
- T253241: Add helm3 to the component repo
- etcd servers and haproxy
- T251298: Design the resource limits, RBAC and PSP needed for the PAWS Kubernetes cluster
- Sort out helm3 issues
- Finish the NFS setup (T160113: Move PAWS nfs onto its own share)
- Certs (LetsEncrypt, etc.) T218157 -- went with acme-chief!
- Ingress build (T195217: Simplify ingress methods for PAWS)
- Deploy a user-manager service just for admin accounts (likely maintain-kubeusers with the new admins feature) T246059
- Get the paws-public material working on the new cluster T255997
- QA the new cluster before full migration
- Improve the documentation, which is scattershot and difficult to follow T253761
- Switch over DNS to the new ingress or change what the proxy points to, depending on the final setup.
- Document the administration of the cluster at https://wikitech.wikimedia.org/wiki/PAWS/Admin
- Clean up the old paws cluster in the tools project
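
For the "etcd servers and haproxy" step, the high-availability goal depends on how many etcd members there are. The task does not say how many servers are planned, so the sketch below only shows the general majority-quorum arithmetic that etcd follows. The function names are illustrative.

```python
def quorum(members):
    """Smallest majority of an etcd cluster with `members` voting members."""
    if members < 1:
        raise ValueError("a cluster needs at least one member")
    return members // 2 + 1


def failure_tolerance(members):
    """How many members can be lost while the cluster still has quorum."""
    return members - quorum(members)


if __name__ == "__main__":
    # Print cluster size, quorum size, and tolerated failures side by side.
    for n in (1, 2, 3, 4, 5):
        print(n, quorum(n), failure_tolerance(n))
```

An even-sized cluster tolerates no more failures than the odd size below it, which is why odd member counts are the usual choice.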