This is a proposal to migrate our Superset services to the DSE kubernetes cluster.
Two instances of Superset are within scope: superset and superset-next.
Currently:
- superset is hosted on bare-metal on a single host: an-tool1010.eqiad.wmnet
- superset-next is hosted on a VM on a single host: an-tool1005.eqiad.wmnet
They each use a separate database on the analytics_meta MariaDB instance on an-coord1001 for storing state.
They each have an instance of memcached local to the host, which is used for various metadata caching, but not for query results caching.
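To illustrate the caching split described above, here is a minimal sketch of how it might look in a `superset_config.py` fragment. `CACHE_CONFIG` and `DATA_CACHE_CONFIG` are standard Superset settings that take Flask-Caching dictionaries. The memcached address, key prefix, and timeout are assumptions for illustration only, and the helper `memcached_cache` is hypothetical. The settings actually used on the current hosts may differ.

```python
# Sketch of a superset_config.py fragment. All values below are assumptions.
MEMCACHED_SERVERS = ["127.0.0.1:11211"]  # assumed address of the host-local memcached


def memcached_cache(key_prefix, timeout_seconds=300):
    """Build a Flask-Caching config dict backed by memcached (hypothetical helper)."""
    return {
        "CACHE_TYPE": "MemcachedCache",
        "CACHE_MEMCACHED_SERVERS": MEMCACHED_SERVERS,
        "CACHE_KEY_PREFIX": key_prefix,
        "CACHE_DEFAULT_TIMEOUT": timeout_seconds,
    }


# Metadata caching goes to memcached.
CACHE_CONFIG = memcached_cache("superset_metadata_")

# Query results are not cached, so the data cache is left as a no-op.
DATA_CACHE_CONFIG = {"CACHE_TYPE": "NullCache"}
```

On Kubernetes, the memcached address would no longer be local to a host, so the server list is the part of this fragment most likely to change.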
The purpose of this ticket is to try to achieve consensus on the benefits, costs, and potential risks of moving Superset to the DSE Kubernetes cluster.
At this stage, we believe that the following steps will be required:
- Write a lightweight design document describing how the Superset services are intended to work on Kubernetes T349396
- Create a Superset container image using GitLab-CI and the Blubber/Kokkuri framework. T352165
- Apply our patches not yet merged upstream to the Superset codebase in our Docker image - T356477
- Create a helm chart for Superset T352166
  - Evaluate the upstream chart and, if appropriate, use our policy review to decide whether or not we should use it.
  or:
  - Create a new helm chart for Superset using https://gitlab.wikimedia.org/repos/sre/sextant
- Create two namespaces for superset and superset-next: https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/983718
- Add kubeadm config files for the two new namespaces: https://gerrit.wikimedia.org/r/c/operations/puppet/+/983720
- Create two helmfile deployments for superset and superset-next T353790
- Ensure that we are running an up-to-date version of Superset, facilitating the migration T335356
- Ensure necessary firewall rules are open between the DSE worker nodes and external services - T356623
- Create a keytab for each Superset deployment and make this available to the pods
- Make configuration secrets available to helmfile - T356480
- Configure ingress internal DNS records - T356481
- Add entries to the puppet service catalog - T356483
- Update public domain DNS records to make them point to the DSE Kubernetes ingress - T356482
- Configure OIDC authentication for superset on dse-k8s - T353794
- Write a migration plan for Superset to K8S - including what to do about the legacy instances.
- Monitor the availability of the superset deployments - T356484
- Create saved views for the superset deployment logs - T356485
- Update the wikitech page with our production readiness checklist - T356486
- Find a solution for the requestctl-generator html page - T356490
- Serve Superset static assets from an optimised container - T357890
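As an illustration of the firewall step in the list above (T356623), the following sketch shows one way to confirm from a worker node or pod that an external service is reachable over TCP. The endpoint list is hypothetical: the fully qualified name and port for the MariaDB host are assumptions, and the real list of external services would come from the design document.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False


# Hypothetical endpoints; the hostname suffix and port are assumptions.
ENDPOINTS = [("an-coord1001.eqiad.wmnet", 3306)]


def report(endpoints):
    """Map each 'host:port' string to whether it is reachable."""
    return {f"{host}:{port}": can_connect(host, port) for host, port in endpoints}
```

Calling `report(ENDPOINTS)` returns a dictionary keyed by `host:port`. A `False` value only shows that the TCP connection failed, not whether a firewall rule, DNS, or the service itself is the cause.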
N.B. At present, we are not planning to move the metadata database (which is MariaDB in our case) to Kubernetes.
The upstream helm charts declare a dependency on PostgreSQL, which they tend to run with persistent volume claims, but for now we are not planning to use this.
We do have an option to migrate to PostgreSQL running on an-db100[1-2] (which is what Airflow uses) but we are not necessarily planning to take this option either. We have decided to stick with MariaDB for now.
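For context, the choice of metadata database shows up in Superset mainly through the `SQLALCHEMY_DATABASE_URI` setting. The sketch below builds that URI for either backend. The driver names (`pymysql`, `psycopg2`), host, port, user, and database name are all assumptions for illustration, and `metadata_db_uri` is a hypothetical helper. In practice the credentials would come from the configuration secrets rather than being written inline.

```python
from urllib.parse import quote_plus

# SQLAlchemy dialect+driver prefixes; the drivers chosen here are assumptions.
DIALECTS = {
    "mariadb": "mysql+pymysql",
    "postgresql": "postgresql+psycopg2",
}


def metadata_db_uri(backend, user, password, host, port, database):
    """Build a SQLAlchemy URI, URL-encoding the user and password."""
    scheme = DIALECTS[backend]
    return (
        f"{scheme}://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}:{port}/{database}"
    )


# Placeholder values only; none of these are taken from the real deployment.
SQLALCHEMY_DATABASE_URI = metadata_db_uri(
    "mariadb", "superset", "change-me", "an-coord1001.eqiad.wmnet", 3306, "superset_db"
)
```

Staying on MariaDB means that only the host-side details of this URI need to be reachable from the pods. A later move to PostgreSQL would change the scheme and host, and would also require a data migration that is outside the scope of this sketch.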