== Problem ==
We have a number of things that deploy into kubernetes, for example:
* toolforge custom admission controllers (currently 3 and growing)
* toolforge maintain_kubeusers
* toolforge jobs framework components (currently 2, api and emailer)
* toolforge components deployed from operations/puppet.git such as the ingress setup and other pieces
* paws stuff
(and potentially more that I'm overlooking at the moment).
NOTE: we have a number of kubernetes clusters maintained by WMCS: tools, toolsbeta, paws, and potentially more in the future. This request covers all software components for k8s clusters maintained by WMCS.
Each of the items listed above has a different deployment code pattern. For example:
* a `deploy.sh` script with some logic inside it
* a `kustomize`-based setup
* a `helm`-based setup
* a raw `kubectl apply` call
* some combination of all of the above
For a number of reasons, there is no written agreement on which deployment code pattern to use for a given repository.
=== Constraints and risks ===
Some additional notes.
==== certificates ====
Some components need x509 certificate generation and/or other credential management. Ideally, the option we choose can handle the required certificate/credential management.
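For illustration only, the kind of flow involved might look like this, using the kubernetes CSR API (the component names below are hypothetical, and each option would wrap this logic differently):
```lang=shell-session
user@machine:~$ openssl req -new -newkey rsa:2048 -nodes \
    -keyout component.key -out component.csr \
    -subj "/CN=component-user/O=toolforge"
[..]
user@machine:~$ kubectl apply -f component-csr.yaml  # a CertificateSigningRequest object embedding the base64-encoded component.csr
[..]
user@machine:~$ kubectl certificate approve component-user
[..]
user@machine:~$ kubectl get csr component-user -o jsonpath='{.status.certificate}' | base64 -d > component.crt
```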
==== deployment mechanism ====
We should perhaps consider the 'deployment code pattern' as something different from the 'deployment mechanism'.
Let 'deployment mechanism' be the way in which we trigger a deployment. At the moment the options are:
* 100% manual. A human runs a command on a server.
* somewhat automated: by means of a spicerack cookbook, a puppet agent run, some other script, or whatever.
* fully automated: a CI/CD pipeline, as is the case for PAWS (currently based on GitHub Actions, I believe).
Please note that the 'deployment code pattern' concept is independent of the 'deployment mechanism'.
We could automate helm, kustomize, or whatever else, once we decide which one to use.
Deciding on 'deployment mechanism' (or automation level/mode) is out of scope of this request.
Note, however, that deciding on this request will greatly benefit us later on when we start automating things.
==== new standard, who makes the changes? ====
If we introduce a new standard, we will need updates to several code repositories. That could be a lot of work.
The author of this request volunteers to do the work once the standard has been decided.
== Decision record ==
In progress
https://wikitech.wikimedia.org/wiki/Wikimedia_Cloud_Services_team/EnhancementProposals/Decision_record_T302593_How_do_we_make_decisions
== Options ==
=== Option 1 ===
Use helm https://helm.sh/
The proposed standard file layout is as follows:
```
topdir/
topdir/helmchart/
topdir/helmchart/values.yaml <--- base file
topdir/helmchart/values-toolsbeta.yaml <--- toolsbeta-specific overrides
topdir/helmchart/values-tools.yaml <--- toolforge-specific overrides
topdir/helmchart/values-paws.yaml <--- paws-specific overrides
topdir/helmchart/values-devel.yaml <--- additional, arbitrary overrides are allowed
```
(yes, some components deploy into all 3 environments; `maintain-kubeusers` is a good example)
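As an illustration, the per-environment files would typically hold only small overrides on top of the base `values.yaml` (the contents below are hypothetical):
```lang=shell-session
user@machine:~$ cat helmchart/values.yaml
image:
  tag: latest
replicas: 1
user@machine:~$ cat helmchart/values-toolsbeta.yaml
image:
  tag: beta
```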
Example of patch introducing this layout to one of our custom components:
https://gerrit.wikimedia.org/r/c/cloud/toolforge/jobs-framework-emailer/+/747107
Example of manual operations using helm:
```lang=shell-session
user@machine:~$ helm install --debug --dry-run app-name ./helmchart -f helmchart/values-toolsbeta.yaml
[..]
user@machine:~$ helm install --debug --dry-run app-name ./helmchart -f helmchart/values-tools.yaml
[..]
user@machine:~$ helm install app-name ./helmchart -f helmchart/values-toolsbeta.yaml
[..]
```
(but again, the deployment mechanism is not covered in this request)
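For subsequent deployments of an already-installed release, the natural counterpart would be `helm upgrade` (or `helm upgrade --install` to cover both cases):
```lang=shell-session
user@machine:~$ helm upgrade --install app-name ./helmchart -f helmchart/values-toolsbeta.yaml
[..]
```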
Pros:
* Industry standard to deploy stuff in k8s.
* Standard within other SRE teams @ WMF.
* Has a concrete specification on how to layout a given directory.
Cons:
* Is a package manager, mostly aimed at "apps". Many of our components are not "apps", but simple pieces of code that do something.
* not integrated into kubernetes by default (kubectl etc).
* somewhat noisier code than with kustomize.
* the concrete specification on how to lay out a given directory could be a handicap in some cases.
* some unknowns for x509 certificate generation & management.
=== Option 2 ===
Use kustomize https://kubernetes.io/docs/tasks/manage-kubernetes-objects/kustomization/
The proposed directory tree layout is as follows:
```
topdir/
topdir/deployment/
topdir/deployment/base/ <--- the base yaml
topdir/deployment/tools/ <-- the toolforge-specific overrides
topdir/deployment/toolsbeta/ <-- the toolsbeta-specific overrides
topdir/deployment/paws/ <-- the paws-specific overrides
topdir/deployment/devel-whatever/ <-- additional overrides are allowed
```
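As an illustration, each directory would carry a `kustomization.yaml`; the overlays reference the base and add their environment-specific patches (the contents below are hypothetical):
```lang=shell-session
user@machine:~$ cat deployment/base/kustomization.yaml
resources:
  - deployment.yaml
  - service.yaml
user@machine:~$ cat deployment/toolsbeta/kustomization.yaml
resources:
  - ../base
patches:
  - path: replicas.yaml  # whatever toolsbeta-specific tweaks are needed
```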
Example of patch introducing this layout to one of our custom components:
https://gerrit.wikimedia.org/r/c/cloud/toolforge/jobs-framework-emailer/+/769694
Example of manual operations using kustomize:
```lang=shell-session
user@machine:~$ kubectl get -k deployment/toolsbeta
[..]
user@machine:~$ kubectl apply -k deployment/toolsbeta
[..]
user@machine:~$ kubectl diff -k deployment/toolsbeta
[..]
```
(but again, the deployment mechanism is not covered in this request)
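It is also possible to render the final manifests without touching the cluster, which could be handy for review:
```lang=shell-session
user@machine:~$ kubectl kustomize deployment/toolsbeta
[..]
```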
Pros:
* Industry standard to deploy stuff in k8s.
* Integrated by default in kubernetes (via kubectl).
* simple and to the point.
Cons:
* kustomize is not a full-fledged, standardized ecosystem (e.g. there is no repository format as such), so it would rely on us introducing an explicit layout.
* apparently less sugar & magic than helm (but do we need that?)
* some unknowns for x509 certificate generation & management.
=== Option 3 ===
Use whatever works, but have a common entry point `./deploy.sh`.
This option assumes that each component has its particularities, and that we want to retain flexibility above all.
To achieve this, each k8s component will have an executable `./deploy.sh` file at the top-level directory which does all the magic. The magic can be helm, kustomize, or anything else; we don't care as long as it works. This script receives no input arguments (or should work out of the box with no input arguments).
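A minimal sketch of what such a script could look like (purely illustrative; the actual logic is up to each component):
```lang=bash
#!/bin/bash
# deploy.sh -- hypothetical sketch; each component does whatever fits it
set -euo pipefail

# pick the right deployment flavor based on the current kubectl context
context="$(kubectl config current-context)"

case "${context}" in
    toolsbeta) kubectl apply -k deployment/toolsbeta ;;
    tools)     kubectl apply -k deployment/tools ;;
    paws)      helm upgrade --install app-name ./helmchart -f helmchart/values-paws.yaml ;;
    *)         echo "E: unknown kubectl context: ${context}" >&2 ; exit 1 ;;
esac
```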
Pros:
* Simple, efficient, flexible.
* Perhaps the most sensible way to handle x509 certificate generation (since we can have arbitrary logic here).
Cons:
* Less elegant maybe?
* Perhaps assuming we need the additional flexibility is overly defensive and this will bite us in the future.