deployment charts: automate testing on staging
Open, Low, Public

Description

Deploying service charts using helmfile involves a step where the service is deployed to the staging cluster and tested. Instead of testing manually by running curl commands, services should provide a way to run a suite of end-to-end tests that ensure the service is functioning as intended and can be deployed to production.

Ideally, services would share a test framework and expose the test suite in a uniform way. Also, it should be easy to run the tests locally in a development environment (e.g. on minikube).

Proposal:

  • Create a framework based on python's unittest package, aka pyunit. This is available on deployment hosts.
  • Use make as a runner to provide a uniform way of invoking all kinds of tests and for additional setup steps.
  • Place a makefile in a tests folder in the service directory and define a check target that invokes the tests.
  • Allow the makefile to function in at least two environments, "staging" (the default) and "minikube" (for local testing).
  • Define "minikube" as an environment in helmfile.yaml as well, with a single deployment called "local".
  • Place any test files in the tests directory. Normally these would be pyunit tests.
  • Place reusable make snippets in a makefiles directory at the top level of the repo.
  • Place reusable python packages in a python directory at the top level of the repo.
  • Create a python module that can read helm value files, so tests can be written to check the behavior based on what is in the value files.
  • It should be possible to define local overrides in value files ending in .local.yaml.
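
To make the last two points more concrete, here is a rough sketch of what the value-file reader could look like. It is only an illustration, not the module from the patch: the names merge_values, value_file_candidates and load_values are made up, the rule that values.yaml is overridden by values.local.yaml is one possible reading of "ending in .local.yaml", and the merge only approximates how helm layers value files (nested mappings are merged, anything else is replaced). The parser is passed in by the caller; yaml.safe_load from PyYAML would be the natural choice, assuming it is installed.

```python
import os
import unittest


def merge_values(base, override):
    """Merge two value mappings: nested dicts are merged key by key,
    any other value (scalars, lists) is replaced by the override."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_values(merged[key], value)
        else:
            merged[key] = value
    return merged


def value_file_candidates(path):
    """Return the value file followed by its local override file.

    Assumption: the override for 'values.yaml' is 'values.local.yaml'.
    """
    root, ext = os.path.splitext(path)
    return [path, root + ".local" + ext]


def load_values(paths, parse):
    """Load value files in order, layering later files over earlier ones.

    'parse' turns file text into a dict, e.g. yaml.safe_load (PyYAML).
    Files that do not exist are skipped, so local overrides stay optional.
    """
    values = {}
    for path in paths:
        for candidate in value_file_candidates(path):
            if os.path.exists(candidate):
                with open(candidate, encoding="utf-8") as handle:
                    values = merge_values(values, parse(handle.read()) or {})
    return values


class LocalOverrideTest(unittest.TestCase):
    """Documents the intended precedence; the keys are hypothetical."""

    def test_local_override_wins(self):
        base = {"service": {"port": 8080, "tls": True}}
        local = {"service": {"port": 30080}}
        self.assertEqual(
            merge_values(base, local),
            {"service": {"port": 30080, "tls": True}},
        )
```

A pyunit test could call load_values once in setUpClass with the same value files that helmfile uses for the selected environment, and then derive its expectations (ports, enabled routes, and so on) from the result instead of hard-coding them.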

Event Timeline

Change #1219222 had a related patch set uploaded (by Daniel Kinzler; author: Daniel Kinzler):

[operations/deployment-charts@master] rest-gateway: improve structure of end-to-end tests

https://gerrit.wikimedia.org/r/1219222

FWIW there is the concept of helm test (https://helm.sh/docs/topics/chart_tests/) that is totally unused for most of our services although we do create a test based on service-checker by default for all new charts: https://gerrit.wikimedia.org/r/plugins/gitiles/operations/deployment-charts/+/refs/heads/master/_scaffold/service/_skel/templates/tests/test-service-checker.yaml

The obvious benefit here is that the tests themselves are bundled within a container (or the same one as the service) and that there is no external tooling required to make them work. We could even make helm/helmfile run the tests automatically on every deploy, removing the need for manual interaction.

I looked at it briefly but I had assumed it was chart-scoped and not service-scoped so it couldn't easily check a running deployment, but it would seem I was mistaken.

Change #1219222 merged by jenkins-bot:

[operations/deployment-charts@master] rest-gateway: improve structure of end-to-end tests

https://gerrit.wikimedia.org/r/1219222

MLechvien-WMF moved this task from Scheduled (this Q) to Backlog on the ServiceOps new board.
MLechvien-WMF subscribed.

@Clement_Goubert this is in Scheduled work but I don't think we'll have capacity to land it this quarter, so moving it to Backlog