As we move more services through the Deployment Pipeline to production, Beta Cluster is beginning to suffer.
There are several proposed solutions; let's see whether using an existing hosted k8s solution is viable.
== Problems ==
1. The Deployment Pipeline is currently unable to perform system tests that incorporate both a change to a service and an existing MediaWiki installation; it is limited to e2e testing of the service itself.
2. A k8s cluster that integrates with Beta Cluster (that is, one with secure network ingress/egress between deployed pods/services and existing deployment-prep instances) would allow the Deployment Pipeline to perform this kind of testing. However, at this time neither SRE nor RelEng can commit to maintaining an in-house k8s cluster for this purpose.
== Proposal ==
Experiment with [third party k8s provider] to evaluate its potential as a third-party hosted k8s cluster that can:
1. Serve as a target for the Deployment Pipeline as part of its graduated deployment/testing strategy.
2. Securely integrate with Beta Cluster at a network level.
3. Run e2e helm tests that exercise service changes and existing MediaWiki deployments in Beta Cluster together.
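As a rough illustration of point 3, the deploy-and-test cycle the pipeline would run against the hosted cluster could look something like the sketch below. It only assembles and (optionally) runs standard `helm upgrade --install` and `helm test` invocations; the release name, chart path, kube context, namespace and values file are all hypothetical placeholders, and the real pipeline may well drive helm differently.

```python
import shlex
import subprocess


def helm_commands(release, chart, kube_context, namespace, values_file=None):
    """Build the helm invocations for one deploy-and-test cycle."""
    # Flags shared by both commands: which cluster and namespace to target.
    common = ["--kube-context", kube_context, "--namespace", namespace]
    # Install or upgrade the release and wait for its resources to become ready.
    deploy = ["helm", "upgrade", "--install", release, chart, "--wait"] + common
    if values_file:
        # Per-cluster overrides (hypothetical file name supplied by the caller).
        deploy += ["--values", values_file]
    # Run the chart's test hooks against the deployed release.
    test = ["helm", "test", release] + common
    return [deploy, test]


def run_cycle(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(shlex.join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # All names below are assumptions made up for this example.
    run_cycle(
        helm_commands(
            release="mathoid-e2e",
            chart="./charts/mathoid",
            kube_context="hosted-k8s",
            namespace="ci",
            values_file="values-hosted.yaml",
        )
    )
```

With `dry_run=True` this just prints the two commands, which is enough to review what would be executed. Whether the test hooks can actually reach MediaWiki in Beta Cluster depends on the network integration in point 2, which this sketch does not address.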
== Evaluation ==
A very basic test for teasing out the viability of any third party's k8s offering will be:
1. Can our existing Mathoid helm chart be used to deploy there?
2. If not, how much refactoring would the chart(s) need? More precisely, can we make them work with both [third party k8s] and WMF k8s without too much divergence?
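One possible way to put a number on "divergence" is to compare the per-cluster values overrides the chart would need and count the keys that differ. The sketch below assumes the overrides have already been loaded into plain dicts; the keys and values shown are invented for illustration and are not taken from the actual Mathoid chart.

```python
def flatten(values, prefix=""):
    """Flatten nested dicts into {"a.b.c": leaf} form."""
    flat = {}
    for key, val in values.items():
        path = f"{prefix}.{key}" if prefix else str(key)
        if isinstance(val, dict) and val:
            flat.update(flatten(val, path))
        else:
            flat[path] = val
    return flat


def divergence(wmf_values, hosted_values):
    """Return the sorted dotted keys whose values differ between the two clusters."""
    a, b = flatten(wmf_values), flatten(hosted_values)
    missing = object()  # sentinel so a key absent on one side counts as a difference
    return sorted(
        k for k in a.keys() | b.keys() if a.get(k, missing) != b.get(k, missing)
    )


if __name__ == "__main__":
    # Hypothetical overrides, made up for this example.
    wmf = {
        "service": {"type": "NodePort", "port": 8080},
        "resources": {"limits": {"memory": "1Gi"}},
    }
    hosted = {
        "service": {"type": "LoadBalancer", "port": 8080},
        "resources": {"limits": {"memory": "1Gi"}},
        "ingress": {"enabled": True},
    }
    for key in divergence(wmf, hosted):
        print(key)
```

A short list of differing keys would suggest the chart can stay shared with small per-cluster overrides. This only captures differences in values; changes needed in the chart templates themselves would have to be assessed separately.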