
Experiment with CloudVPS opentofu provider for Infrastructure as code
Closed, Resolved · Public · 5 Estimated Story Points · Spike

Description

Documentation is https://wikitech.wikimedia.org/wiki/Help:Using_OpenTofu_on_Cloud_VPS

We want to use opentofu to manage configuration. Can/Should we also use it to manage infrastructure?

AC:

There's a document covering the following topics:

  • What's needed to connect in terms of open ports/network access
  • Try to spin up a VM, access it, delete it, and recreate it
  • Try to create a volume, mount it, use it, delete it, and recreate it
  • What other resources are available here? Is it limited in any way compared to Horizon (web proxies, object storage, cloud-init configs, etc.)?

Event Timeline

thcipriani set the point value for this task to 5.
thcipriani moved this task from Backlog to Ready on the Catalyst (Wawa Nasa) board.

Possibly useful merge request from the Pixel re-write that enabled the entirety of the VPS to be brought up from scratch including extra storage and web proxy:

use tofu / cloud-config combo for bringing up vps

https://gitlab.wikimedia.org/mhurd/pixel-clean/-/merge_requests/3

  • Dockerized tofu / openstack
  • Can import existing tofu state via generate_imports_file script
  • Configures, attaches, formats and uses 80GB volume for both the Pixel runner and Docker's storage
  • Creates pixel-clean.wmcloud.org web proxy pointing to the instance
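
The output of the generate_imports_file script is not shown in this task, so the following is only a sketch of one way existing resources can be brought under Tofu management: an OpenTofu import block paired with a matching resource definition. The resource label, instance name, UUID, image, and flavor are all hypothetical placeholders.

```hcl
# Hypothetical example: adopt an instance that already exists in the project.
# The id is the instance UUID shown in Horizon; this value is a placeholder.
import {
  to = openstack_compute_instance_v2.pixel
  id = "00000000-0000-0000-0000-000000000000"
}

# The resource block must describe the existing instance closely enough
# that the next plan does not propose replacing it.
resource "openstack_compute_instance_v2" "pixel" {
  name        = "pixel-clean"              # placeholder name
  image_name  = "IMAGE_NAME_FROM_HORIZON"  # placeholder
  flavor_name = "FLAVOR_NAME_FROM_HORIZON" # placeholder
}
```

After the import has been applied once, the import block can be removed and the resource is managed like any other.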
Restricted Application changed the subtype of this task from "Task" to "Spike". Jun 30 2025, 4:05 PM

From my investigation into this, I found the following about potentially using OpenTofu for infrastructure and configuration management on Catalyst:

What's needed to connect in terms of open ports/network access:

  • OpenStack credentials (the openrc.sh file from Horizon provides them as environment variables)
  • Application credential ID
  • Application credential secret
  • OpenStack authentication URL (in openrc.sh from Horizon)
  • Image name and flavor for instance creation (found via the Horizon UI)
  • The target project ID (found in Identity > Projects)
  • Network access to the OpenStack API endpoint from wherever you run tofu
  • Provider details in main.tf (or provider.tf), as shown below, to configure both the openstack and cloudvps providers.
provider "openstack" {
  auth_url                      = var.os_auth_url
  tenant_id                     = var.os_project_id
  application_credential_id     = var.os_application_credential_id
  application_credential_secret = var.os_application_credential_secret
}

provider "cloudvps" {
  os_auth_url                      = var.os_auth_url
  os_application_credential_id     = var.os_application_credential_id
  os_application_credential_secret = var.os_application_credential_secret
  os_project_id                    = var.os_project_id
}
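
The provider blocks above reference four input variables and assume both providers have been declared. A minimal sketch of that supporting configuration could look like this. The variable names come from the snippet above; the provider source addresses, in particular the cloudvps one, are assumptions that should be checked against the wikitech documentation.

```hcl
terraform {
  required_providers {
    openstack = {
      source = "terraform-provider-openstack/openstack"
    }
    cloudvps = {
      # Assumed registry address; confirm against Help:Using_OpenTofu_on_Cloud_VPS.
      source = "terraform.wmcloud.org/registry/cloudvps"
    }
  }
}

variable "os_auth_url" {
  type        = string
  description = "OpenStack authentication URL (OS_AUTH_URL in openrc.sh)"
}

variable "os_project_id" {
  type        = string
  description = "Target project ID (Identity > Projects in Horizon)"
}

variable "os_application_credential_id" {
  type        = string
  description = "Application credential ID"
}

variable "os_application_credential_secret" {
  type        = string
  description = "Application credential secret"
  sensitive   = true # keeps the value out of plan/apply output
}
```

Values can be supplied without writing them to disk by exporting environment variables named TF_VAR_<variable name>, for example TF_VAR_os_auth_url.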

Horizontal scaling used to be a complicated manual process, especially attaching and mounting volumes and linking instances to the volumes created for them; the scaling documentation details the steps. By comparison, Tofu is a clear improvement, since all of that can be expressed in a few lines of code.
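
As a rough illustration of the "few lines of code" claim, the sketch below creates a configurable number of instances, one volume per instance, and the attachment between them. It assumes the openstack provider is already configured. The variable names, resource labels, instance names, and the 80 GB size are hypothetical, and the image, flavor, and network values would still need to be looked up in Horizon.

```hcl
# Hypothetical inputs; real values come from Horizon.
variable "instance_count" {
  type    = number
  default = 2
}

variable "image_name" {
  type = string
}

variable "flavor_name" {
  type = string
}

variable "network_name" {
  type = string
}

# One VM per count index.
resource "openstack_compute_instance_v2" "runner" {
  count       = var.instance_count
  name        = "runner-${count.index}"
  image_name  = var.image_name
  flavor_name = var.flavor_name

  network {
    name = var.network_name
  }
}

# One data volume per VM (size is in GB).
resource "openstack_blockstorage_volume_v3" "data" {
  count = var.instance_count
  name  = "runner-data-${count.index}"
  size  = 80
}

# Attach each volume to the VM with the same index.
resource "openstack_compute_volume_attach_v2" "data" {
  count       = var.instance_count
  instance_id = openstack_compute_instance_v2.runner[count.index].id
  volume_id   = openstack_blockstorage_volume_v3.data[count.index].id
}
```

Scaling out then means raising instance_count and running tofu apply. Attaching a volume does not format or mount it; that step still has to happen inside the VM, for example from a cloud-init or other config script.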

OpenTofu vs Horizon

| Resource | OpenTofu | Horizon (UI) | Notes |
| --- | --- | --- | --- |
| VMs, Volumes | Done automatically; further mounting can be done using config scripts | Done manually; needs explicit linking/mounting of volumes and VMs. Difficult to swap/scale. | Quicker instance and volume spin-up and linkage when Tofu is used. |
| Object storage | Possible using openstack_objectstorage_* | Manually, using Object store > Containers > Create container | |
| Web proxies | Done using resource "cloudvps_web_proxy" | Can be manually defined via the UI | |
| Load balancing | Can be done using load balancer | Manually using UI under AWS. | |
| Cloud-init configs | Can be used and referred to upon instance creation. | Can be defined during instance creation as an upload or as information pasted in the 'configuration' tab. | Tofu can do this much more easily. |
| Security groups | Can be defined during instance creation using a simple variable referring to the group names: security_groups = ["default", "web"] | Can be done during instance creation or edited afterwards using the UI. | |
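
To make a few of the rows above concrete, here is a hedged sketch that combines security groups, a cloud-init file, an object storage container, and a web proxy. It assumes both providers are already configured and that a cloud-init.yaml file sits next to the configuration. All names and variables are hypothetical, and the cloudvps_web_proxy arguments (hostname, domain, backends) are written from memory of the wikitech documentation and should be verified there.

```hcl
# Hypothetical inputs, declared here so the fragment stands alone.
variable "image_name" {
  type = string
}

variable "flavor_name" {
  type = string
}

variable "network_name" {
  type = string
}

resource "openstack_compute_instance_v2" "web" {
  name            = "web-1"
  image_name      = var.image_name
  flavor_name     = var.flavor_name
  security_groups = ["default", "web"]

  # Cloud-init config is read from a local file at plan time.
  user_data = file("${path.module}/cloud-init.yaml")

  network {
    name = var.network_name
  }
}

# Object storage container (Object store > Containers in Horizon).
resource "openstack_objectstorage_container_v1" "artifacts" {
  name = "artifacts"
}

# Web proxy pointing at the instance; argument names are an assumption.
resource "cloudvps_web_proxy" "web" {
  hostname = "example"
  domain   = "wmcloud.org"
  backends = ["http://${openstack_compute_instance_v2.web.access_ip_v4}:80"]
}
```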

Upsides of using OpenTofu:

  • Easy resource recovery:
    • delete VM in Horizon → tofu apply recreates it
  • Near-immediate recreation of the whole stack, which allows quick disaster recovery
  • Lower risk of mistakes: tofu plan shows the intended actions *before* they are carried out, and tofu apply asks for explicit confirmation before proceeding. These guardrails make mistakes less likely.
  • Easier scaling without manual effort.

Downsides of using OpenTofu:

  • Slight learning curve compared to the Horizon UI.
  • Some features are only usable after you retrieve the needed information from Horizon, so it takes some upfront effort to get to the effortless part.
  • (Minor issue) The CloudVPS provider is incompatible with arm64.
  • When using a cloud-init script, any errors in it can make SSH access fail or take a while to become available. It is

Proposed enhancements to the standard Tofu use case:
As @Mhurd mentioned above, the new iteration of Pixel uses OpenTofu. One notable difference is that it also uses a Dockerfile to run Tofu in a Docker container, with an entrypoint defined so that all of these actions are possible even when you are not working on your own device.

He defined it here.

This could be useful for us because:

  • Portable environment → run tofu anywhere, even on CI runners or dev laptops
  • Same version of tofu → no “works on my machine” discrepancies.
  • Can include config scripts in a cleaner manner.

A more detailed version of this information (including several example images) is here.

EBomani changed the task status from Open to In Progress. Jul 11 2025, 10:32 PM

Nice! Looks like we were able to do what the documentation says and tried it a few times and ran into no major problems. Thanks for documenting our use-case @EBomani !