
Create TF Code for wikibase.cloud
Closed, Resolved (Public)

Description

The Terraform part (including creating the Kubernetes cluster).

Also consider whether we want one or two Terraform environments.

Also consider to what extent we want this to be DRY (e.g. refactoring code that is common to both prod and staging into modules).

Event Timeline

@Tarrow led a session looking into this task; here are some (incomplete) notes.

We discussed three possible ways to tackle this task:

  1. Copy the tf/env/prod folder to tf/env/staging and edit tf/env/prod as needed
    • this would mean taking on some technical debt (staging)
    • but would:
      • allow us to complete this task more quickly and unblock other tasks
      • reduce the risk of falling down rabbit holes
      • increase our chance of achieving the sprint goal
    • IMO the technical debt is acceptable given that it would reduce risk and speed up completion; further refactoring into modules could be done at the end of the sprint if time allows, or at a later date
  2. Refactor files in tf/env/prod to one big singular module to be used in staging and production
    • Wouldn't have to untangle the
    • Probably more difficult than #1
    • Risk of falling into rabbit holes
  3. Refactor everything into discrete modules to be used in staging and production
    • Similar to above but probably requires even more effort and risk of rabbit holes
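
For comparison, option 2 might look roughly like the sketch below, where each environment directory shrinks to a single module call. The module path tf/modules/wbaas and the input names environment and cluster_name are hypothetical and not taken from the repository.

```hcl
# tf/env/staging/main.tf (hypothetical layout for option 2)
module "wbaas" {
  # Assumed location of the shared module; this directory is not confirmed to exist.
  source = "../../modules/wbaas"

  # Hypothetical inputs that would differ between staging and production.
  environment  = "staging"
  cluster_name = "wbaas-staging"
}
```

tf/env/prod would contain the same block with its own input values, which is where the untangling effort mentioned above would go.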

We had a look at what doing #1 might be like:

  • how to unpick things named prod when really they are staging (e.g. google_storage_buckets)?
  • staging and production will need to be unpicked in dns.tf
  • uptime.tf won't work as is for production, we probably just want to remove that from the production env (tf/env/prod) for now
  • we will need to rename variables named specifically for staging; e.g. recaptcha_v2_staging_site_key -> recaptcha_v2_site_key
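
The variable rename in the last bullet could look like the sketch below. Only the name recaptcha_v2_site_key comes from the notes; the file names, the description text and the per-environment tfvars file are assumptions.

```hcl
# variables.tf in both tf/env/staging and tf/env/prod (assumed file name).
# The declaration is identical in both environments; only the value differs.
variable "recaptcha_v2_site_key" {
  type        = string
  description = "reCAPTCHA v2 site key for this environment"
}

# tf/env/staging/terraform.tfvars (hypothetical file; the value is a placeholder)
# recaptcha_v2_site_key = "<staging site key>"
```

With environment-neutral names, the same references (var.recaptcha_v2_site_key) work unchanged when a file is copied between the two environments.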

These checkboxes could be added to the task description if confirmed to be part of the work needed to complete this task

Further comments

  • We want the staging and production Terraform configuration to be shared as much as possible (so that staging and prod don't diverge too much) while also allowing us to EASILY change the configuration where needed (so that we can test changes on staging before they are deployed to prod)
  • If we use modules (#2 or #3) we would need some form of versioning on the modules so that a change to a module for staging doesn't break production. We believe this is possible with Terraform but are not sure of the details. This is another reason to avoid using modules at this time.
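
One way module versioning can work in Terraform is to load the module from a Git source pinned to a tag with the ref argument. The sketch below is only an illustration of that mechanism: the repository URL, module path and tag names are hypothetical.

```hcl
# tf/env/prod/main.tf (hypothetical): production stays on a released tag.
module "wbaas" {
  source = "git::https://example.com/org/repo.git//tf/modules/wbaas?ref=v1.0.0"
}

# tf/env/staging/main.tf (hypothetical, a separate file from the one above):
# staging tries out a newer tag before production is moved to it.
# module "wbaas" {
#   source = "git::https://example.com/org/repo.git//tf/modules/wbaas?ref=v1.1.0"
# }
```

After changing ref, terraform init has to be re-run in that environment so the new module version is fetched.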
commit 8f753c23d5f022e2f8b3892bbab3df0aca9ae3d5 (HEAD -> main, origin/main, origin/HEAD)
Author: Thomas Arrow <thomas.arrow_ext@wikimedia.de>
Date:   Wed Jan 26 18:13:00 2022 +0000

    TF: Delete old state bucket

commit f0e08f5d609475356a867be844bcb8d0e1077f6e
Author: Thomas Arrow <thomas.arrow_ext@wikimedia.de>
Date:   Wed Jan 26 18:11:01 2022 +0000

    TF: allow deletion of old state bucket

commit 0f8b93f34f30f48dbc23796cfac24f038a5287e2 (tfProdToStaging)
Author: Thomas Arrow <thomas.arrow_ext@wikimedia.de>
Date:   Wed Jan 26 17:29:49 2022 +0000

    Remove old staging bucket

commit a23896a92cffdddb32312ee6e911ced195cebdce
Author: Thomas Arrow <thomas.arrow_ext@wikimedia.de>
Date:   Wed Jan 26 17:21:32 2022 +0000

    TF: switch staging state to staging bucket

commit 0deb3e1ac1eb9d4170b01a91551690ae74504635
Author: Thomas Arrow <thomas.arrow_ext@wikimedia.de>
Date:   Wed Jan 26 17:14:41 2022 +0000

    TF: add new staging state bucket

commit eac57daa211a631ebebff0e026f3f8105aa929ab
Author: Thomas Arrow <tarrow@users.noreply.github.com>
Date:   Wed Jan 26 17:12:14 2022 +0000

    TF: Rename prod to staging (#142)

All the prod TF resources have now been created following option 1 from Ollie's comment to avoid rabbit holes. The exception to this is the cert-manager secrets because the cert-manager k8s namespace doesn't exist until we run the helmfile stuff.

As part of this we also left all of the DNS code managed by staging.

The remaining work for this task involves migrating some DNS settings from the "staging" definitions to the "production" ones.

This should be done by defining them in both places first; then importing the resources into the "production" definition before removing them from the staging state. See: https://www.terraform.io/cli/commands/state/rm and https://www.terraform.io/cli/commands/import
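
The migration steps above can be sketched as follows, assuming the DNS records are Google Cloud DNS record sets managed with google_dns_record_set. The resource name "example" and every value are placeholders, and the import ID format should be checked against the provider documentation.

```hcl
# Step 1: add the same definition to tf/env/prod that already exists in
# tf/env/staging. All values below are placeholders.
resource "google_dns_record_set" "example" {
  managed_zone = "example-zone"
  name         = "example.org."
  type         = "A"
  ttl          = 300
  rrdatas      = ["192.0.2.1"]
}

# Step 2: in tf/env/prod, adopt the existing record into the production state:
#   terraform import google_dns_record_set.example <import-id>
#
# Step 3: in tf/env/staging, drop the record from state without destroying it:
#   terraform state rm google_dns_record_set.example
#
# Step 4: delete the resource block from tf/env/staging so that the next
# apply there does not try to create the record again.
```

Running terraform plan in both environments afterwards should show no changes for the migrated record if the two definitions match.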

Finally we need to add the new DNS resources which are required. These will mimic the staging ones (e.g. NS records, A records etc.)

I believe this is all done now.
Other tasks exist for the refactorings.

Tarrow claimed this task.