TF code to test all the things
Closed, Resolved, Public

Description

Make some terraform code that will deploy one of each of the things we deploy:
VM, trove, magnum, magnum template, etc.
https://registry.terraform.io/providers/terraform-provider-openstack/openstack/latest/docs
^ all the things

This is to be used to test that we actually can deploy one of everything that we support.

Event Timeline

This is set up in https://github.com/toolforge/testlab-terraform and can be deployed from a VM in testlabs (with terraform installed, see https://developer.hashicorp.com/terraform/tutorials/aws-get-started/install-cli). If anyone wants the git-crypt key, please let me know.

It is currently partially set up to run in both eqiad1 and codfw1dev. It's not entirely clear to me that we are going to run it in both envs, so only some work has gone into parameterizing the code to allow for it; if we want it in both, I'll happily update it. A little more code would be needed for some VM lookups. It currently cannot deploy a magnum cluster (T333874) or a trove db (T337882).
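
As a rough illustration of what that parameterizing could look like (this is not the code in testlab-terraform; the variable names, endpoints, and region names below are placeholders I made up for the sketch):

```hcl
terraform {
  required_providers {
    openstack = {
      source = "terraform-provider-openstack/openstack"
    }
  }
}

# Hypothetical variable selecting which deployment to target.
variable "datacenter" {
  type    = string
  default = "eqiad1"

  validation {
    condition     = contains(["eqiad1", "codfw1dev"], var.datacenter)
    error_message = "The datacenter must be eqiad1 or codfw1dev."
  }
}

# Hypothetical application credential inputs.
variable "app_cred_id" {
  type      = string
  sensitive = true
}

variable "app_cred_secret" {
  type      = string
  sensitive = true
}

locals {
  # Placeholder values only; the real endpoints and region names will differ.
  deployments = {
    eqiad1 = {
      auth_url = "https://keystone.eqiad1.example.org:5000/v3"
      region   = "eqiad1-placeholder"
    }
    codfw1dev = {
      auth_url = "https://keystone.codfw1dev.example.org:5000/v3"
      region   = "codfw1dev-placeholder"
    }
  }
}

provider "openstack" {
  auth_url                      = local.deployments[var.datacenter].auth_url
  region                        = local.deployments[var.datacenter].region
  application_credential_id     = var.app_cred_id
  application_credential_secret = var.app_cred_secret
}
```

With something like this, switching environments would be a matter of passing `-var datacenter=codfw1dev`, assuming every per-environment difference is captured in the map.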

We probably want it to deploy more things, or perhaps do post-deploy things (such as attaching a volume to a VM), beyond what is listed in the current code. I will need some guidance on what we want to deploy and test before I can add more, though.
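
To make the volume example concrete, a minimal sketch using the OpenStack provider might look like the following. The resource labels, names, sizes, and variables are assumptions for illustration, not what the repo contains, and the image, flavor, and network would have to exist in the target project:

```hcl
# Hypothetical inputs describing what to boot the test VM from.
variable "image_name" {
  type = string
}

variable "flavor_name" {
  type = string
}

variable "network_name" {
  type = string
}

# A throwaway VM to attach things to.
resource "openstack_compute_instance_v2" "test_vm" {
  name        = "tf-test-vm"
  image_name  = var.image_name
  flavor_name = var.flavor_name

  network {
    name = var.network_name
  }
}

# A small cinder volume; size is in GB.
resource "openstack_blockstorage_volume_v3" "test_volume" {
  name = "tf-test-volume"
  size = 1
}

# The post-deploy step: attach the volume to the VM.
resource "openstack_compute_volume_attach_v2" "test_attach" {
  instance_id = openstack_compute_instance_v2.test_vm.id
  volume_id   = openstack_blockstorage_volume_v3.test_volume.id
}
```

If any of the three fails to create, `terraform apply` exits non-zero and names the failing resource, which is the signal we are after.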

More importantly, we should put this to use first. Until it is actually running somewhere, we have little.

I'm curious why we are using GitHub (within the toolforge namespace) for this code. I would have expected this to go into https://gitlab.wikimedia.org/repos/cloud/cloud-vps which already contains https://gitlab.wikimedia.org/repos/cloud/cloud-vps/terraform-cloudvps

Related to that, https://gitlab.wikimedia.org/repos/cloud/cloud-vps/terraform-cloudvps contains the resources to create Cloud VPS specific things like web proxies and puppet settings. You may want to play with that as well, though I'm not sure whether it is available in codfw1dev (it should be!)

I'm curious why we are using GitHub (within the toolforge namespace) for this code. I would have expected this to go into https://gitlab.wikimedia.org/repos/cloud/cloud-vps which already contains https://gitlab.wikimedia.org/repos/cloud/cloud-vps/terraform-cloudvps

This project is separate from the terraform-cloudvps project. terraform-cloudvps is the code for the terraform provider for cloud-vps specific things, as you mentioned. This project is for testing that we can deploy the things we expect to be able to deploy to cloud-vps. It uses terraform, but only because terraform offers us an easy method to deploy all the infrastructure and identify any parts that do not deploy. This could become part of a larger project that goes on to do other testing, though it is not part of the terraform provider.

Cool.

The original question remains, though. Why the toolforge GitHub namespace and not something like https://gitlab.wikimedia.org/repos/cloud/cloud-vps/testlabs-terraform

Our deployment of GitLab cannot reasonably build a docker container. This is a project that could do well integrated into k8s as a container.

See T336130: Automatically build Toolforge infrastructure container images in GitLab

I was pointed in that direction before. At the time I could not meaningfully get it to work because I could not pull base images iirc. Sounds like that restriction is lifted? Regardless, deviating from what most of the world does in this space in order to use a special internal thing is rarely a reasonable choice.

I was pointed in that direction before. At the time I could not meaningfully get it to work because I could not pull base images iirc. Sounds like that restriction is lifted?

The restriction is still there: we cannot pull images from just any registry, only from our own registries (tools and wikimedia). That still allows us to push the images we need there, though. This reminds me that I have to check if the harbor image repos are in the allowed list...

Though why do you need an image for the terraform manifests? (curious, I have not played with terraform much).

Regardless, deviating from what most of the world does in this space in order to use a special internal thing is rarely a reasonable choice.

That depends on your definition of "most of the world", "internal" and "special" xd
The benefits from following the same standards that the rest of the organization follows might counter the lack of "worldwide" support.

Though why do you need an image for the terraform manifests? (curious, I have not played with terraform much).

Terraform would need to be installed into the container. The base would normally be an upstream Ubuntu image or the like.

Regardless, deviating from what most of the world does in this space in order to use a special internal thing is rarely a reasonable choice.

That depends on your definition of "most of the world", "internal" and "special" xd
The benefits from following the same standards that the rest of the organization follows might counter the lack of "worldwide" support.

Our standards tend to involve manually deploying things and rejecting the view that software has an end of life; rather, we keep things running ad nauseam, and ignore that what we do as engineers is restricted by bringing ordinary processes (like container generation) into internal methods. The result is that we no longer have access to the great depth of documentation and experience; instead we have "Oh yeah, Joe in the foo department might know about that." When working in tech, one should do what everyone else is doing, or else one is probably doing it wrong. Indeed, I would say that every mess we have in WMCS is firmly rooted in deviation from the generic approach that most of the world takes.

At this point I'm going to shut this conversation down. I asked for input on what bits of infrastructure should be included in the terraform code to test our infrastructure, not for a debate on my philosophical views of how to interact with software.

I will reiterate: anyone interested, please post any particular bits of infrastructure that you would like to see tested by terraform.

Our standards tend to involve manually deploying things and rejecting the view that software has an end of life; rather, we keep things running ad nauseam, and ignore that what we do as engineers is restricted by bringing ordinary processes (like container generation) into internal methods. The result is that we no longer have access to the great depth of documentation and experience; instead we have "Oh yeah, Joe in the foo department might know about that." When working in tech, one should do what everyone else is doing, or else one is probably doing it wrong. Indeed, I would say that every mess we have in WMCS is firmly rooted in deviation from the generic approach that most of the world takes.

Let's change that :), let's automate things on gitlab! Let's iterate on how to do that! Let's not give up.

I will reiterate: anyone interested, please post any particular bits of infrastructure that you would like to see tested by terraform.

I'd say anything we offer, no? (not sure what's possible and what's not though)

  • trove db
  • cinder volume
  • proxy entry
  • floating ip
  • nfs mounts?
  • puppet enc project settings (class and hiera)
  • puppet enc prefix settings (class and hiera)
  • puppet enc VM settings (class and hiera)
  • custom puppetmaster?

those come to mind as things we offer to users in cloudvps
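
For the first item on that list, a trove db test could look roughly like the sketch below. It uses the `openstack_db_instance_v1` resource from the OpenStack provider; the variable names, instance name, and size are made up for illustration, and the datastore type and version would have to match what the deployment actually offers.

```hcl
# Hypothetical inputs; none of these names come from the existing code.
variable "region" {
  type = string
}

variable "db_flavor_id" {
  type = string
}

variable "db_network_id" {
  type = string
}

variable "mysql_version" {
  type = string
}

resource "openstack_db_instance_v1" "test_db" {
  name      = "tf-test-db"
  region    = var.region
  flavor_id = var.db_flavor_id
  size      = 2 # volume size in GB

  datastore {
    type    = "mysql"
    version = var.mysql_version
  }

  network {
    uuid = var.db_network_id
  }
}
```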

The terraform can now test the following in eqiad1:

  • VM deploy
  • Volume deploy
  • Volume attachment to VM
  • Trove (MySQL) deploy
  • Trove (MariaDB) deploy
  • Trove (PostgreSQL) deploy
  • Floating IP allocation
  • Floating IP attachment to VM
  • Magnum cluster deploy (known not to work, see T333874)
  • Magnum cluster template deploy
  • Security group deploy
  • Security group attachment to VM
  • Prefix puppet hiera deploy
  • Web proxy deploy
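
For illustration, the security group and floating IP items above map onto provider resources roughly as follows. This is a minimal sketch and not the code in testlab-terraform; the resource labels and variables are assumptions, and the port ID is expected to come from an existing VM's network port.

```hcl
# Hypothetical inputs for the sketch.
variable "floating_ip_pool" {
  type = string
}

variable "vm_port_id" {
  type = string
}

variable "allowed_cidr" {
  type = string
}

# Security group with a single ingress rule for SSH.
resource "openstack_networking_secgroup_v2" "test_secgroup" {
  name        = "tf-test-secgroup"
  description = "Security group created by the terraform deploy test"
}

resource "openstack_networking_secgroup_rule_v2" "test_ssh" {
  direction         = "ingress"
  ethertype         = "IPv4"
  protocol          = "tcp"
  port_range_min    = 22
  port_range_max    = 22
  remote_ip_prefix  = var.allowed_cidr
  security_group_id = openstack_networking_secgroup_v2.test_secgroup.id
}

# Allocate a floating IP from the pool, then attach it to the VM's port.
resource "openstack_networking_floatingip_v2" "test_fip" {
  pool = var.floating_ip_pool
}

resource "openstack_networking_floatingip_associate_v2" "test_fip_assoc" {
  floating_ip = openstack_networking_floatingip_v2.test_fip.address
  port_id     = var.vm_port_id
}
```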

It has some code, largely inherited from paws, to allow it to work in codfw1dev. That code is incomplete, and codfw1dev isn't very accessible at the moment, so it can't be tested. It can be updated when desired.

T338636 has been opened to track getting this running/alerting somewhere.

I'd say anything we offer, no? (not sure what's possible and what's not though)

As a rule of thumb, if you can do it in horizon, you can do it in terraform. Things we've added to horizon are on us; I believe web proxy and prefix puppet are supported by the provider that @taavi created. If it is something that you would log into the server to set up or test, it is basically beyond terraform. I say basically because you can have terraform log into a server, but it is very much out of its element there. Ansible is a far more effective tool once you can log into a server to do things. As such, if we wanted command line setup/testing, we would want to do that in something other than terraform.
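
For reference, "having terraform log into a server" usually means a `remote-exec` provisioner, along the lines of the sketch below. The variables and the command are hypothetical, and `terraform_data` assumes Terraform 1.4 or newer (older versions would need `null_resource` from the null provider instead).

```hcl
# Hypothetical connection details for an already deployed VM.
variable "vm_address" {
  type = string
}

variable "ssh_user" {
  type = string
}

variable "ssh_private_key_path" {
  type = string
}

resource "terraform_data" "smoke_test" {
  connection {
    type        = "ssh"
    host        = var.vm_address
    user        = var.ssh_user
    private_key = file(var.ssh_private_key_path)
  }

  # Runs once when this resource is created; a non-zero exit fails the apply.
  provisioner "remote-exec" {
    inline = [
      "uname -a",
    ]
  }
}
```

This works, but provisioners only run at creation time and their results are not tracked in state, which is part of why a tool like Ansible is the better fit for this kind of check.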