
Provision Kubernetes cluster and bastion using OpenTofu and Magnum
Open, In Progress, High, Public

Description

Prove that gitops automation works by provisioning a tiny Kubernetes cluster (1 control node, 1 worker node) and a small bastion server in the zuul project using OpenTofu.

Details

Related Changes in Gerrit:
Related Changes in GitLab:
Title | Reference | Author | Source Branch | Dest Branch
haproxy: Add module to provision HAProxy service | repos/releng/zuul/tofu-provisioning!16 | bd808 | work/bd808/haproxy | main
puppetserver: Manage Project Puppet settings | repos/releng/zuul/tofu-provisioning!13 | bd808 | work/bd808/project-puppet | main
puppetserver: Provision a project local puppetserver | repos/releng/zuul/tofu-provisioning!11 | bd808 | work/bd808/puppetserver | main
bastion: base64 encode ssh private key | repos/releng/zuul/tofu-provisioning!9 | bd808 | work/bd808/ssh-provision | main
tofu: add a bastion and a web proxy | repos/releng/zuul/tofu-provisioning!7 | bd808 | work/bd808/bastion | main

Event Timeline

bd808 changed the task status from Open to In Progress.
bd808 triaged this task as High priority.

I got to a new failure point.

$ sudo wmcs-openstack stack resource list b899cc99-bec8-47d7-8eb5-b5f31027bfb4
+---------------+----------------------+---------------+-----------------+------------------+
| resource_name | physical_resource_id | resource_type | resource_status | updated_time     |
+---------------+----------------------+---------------+-----------------+------------------+
| secgroup_rule | 15266470-1905-4113-  | OS::Neutron:: | CREATE_COMPLETE | 2025-06-         |
| _tcp_kube_min | 98b7-3ad6cbc9703d    | SecurityGroup |                 | 16T17:39:57Z     |
| ion_pods_cidr |                      | Rule          |                 |                  |
| secgroup_rule | 3ab08684-8ae6-48c3-  | OS::Neutron:: | CREATE_COMPLETE | 2025-06-         |
| _udp_kube_min | 879b-eaa3d6ba612a    | SecurityGroup |                 | 16T17:39:57Z     |
| ion           |                      | Rule          |                 |                  |
| kube_minions  | 0919918e-08ae-4744-  | OS::Heat::Res | CREATE_FAILED   | 2025-06-         |
|               | ab81-ae8e0394cf6c    | ourceGroup    |                 | 16T17:39:57Z     |
| etcd_address_ | b5f90da8-dcd6-4779-  | Magnum::ApiGa | CREATE_COMPLETE | 2025-06-         |
| lb_switch     | 9756-4e80999e823e    | tewaySwitcher |                 | 16T17:39:57Z     |
| worker_nodes_ | 71f657f7-6df5-49f4-  | OS::Nova::Ser | CREATE_COMPLETE | 2025-06-         |
| server_group  | 9cd0-3a53af27943d    | verGroup      |                 | 16T17:39:57Z     |
| secgroup_rule | faa9d7e8-c494-4fc8-  | OS::Neutron:: | CREATE_COMPLETE | 2025-06-         |
| _udp_kube_min | 826a-8e8027d56160    | SecurityGroup |                 | 16T17:39:57Z     |
| ion_pods_cidr |                      | Rule          |                 |                  |
| secgroup_rule | 0c5a48d5-6128-44fa-  | OS::Neutron:: | CREATE_COMPLETE | 2025-06-         |
| _tcp_kube_min | 86d9-05c7f3dbc0a4    | SecurityGroup |                 | 16T17:39:57Z     |
| ion           |                      | Rule          |                 |                  |
| secgroup_kube | b8956d31-5f4a-46f6-  | OS::Neutron:: | CREATE_COMPLETE | 2025-06-         |
| _minion       | 901d-589123c95937    | SecurityGroup |                 | 16T17:39:57Z     |
| api_address_f | e34ab906-af20-4097-  | Magnum::Float | CREATE_COMPLETE | 2025-06-         |
| loating_switc | 8b0d-aa2e23528f75    | ingIPAddressS |                 | 16T17:39:57Z     |
| h             |                      | witcher       |                 |                  |
| api_address_l | 3819d25d-d49e-4dca-  | Magnum::ApiGa | CREATE_COMPLETE | 2025-06-         |
| b_switch      | 9393-bc0e14948d9e    | tewaySwitcher |                 | 16T17:39:57Z     |
| kube_cluster_ | c4bd20d0-c59c-4e26-  | OS::Heat::Sof | CREATE_COMPLETE | 2025-06-         |
| deploy        | bb87-d70a7d9ffa8c    | twareDeployme |                 | 16T17:39:57Z     |
|               |                      | nt            |                 |                  |
| kube_cluster_ | 0b727aeb-6926-4c8c-  | OS::Heat::Sof | CREATE_COMPLETE | 2025-06-         |
| config        | 96da-5ecb5f46851d    | twareConfig   |                 | 16T17:39:57Z     |
| kube_masters  | 60860d10-59ec-4ba6-  | OS::Heat::Res | CREATE_COMPLETE | 2025-06-         |
|               | 8244-8fcafdab9a10    | ourceGroup    |                 | 16T17:39:57Z     |
| etcd_lb       | c4bfc4a4-12a6-4557-  | file:///usr/l | CREATE_COMPLETE | 2025-06-         |
|               | b631-35b7bde3bb7b    | ib/python3/di |                 | 16T17:39:57Z     |
|               |                      | st-packages/m |                 |                  |
|               |                      | agnum/drivers |                 |                  |
|               |                      | /common/templ |                 |                  |
|               |                      | ates/lb_etcd. |                 |                  |
|               |                      | yaml          |                 |                  |
| master_nodes_ | 5b0df656-6dd6-4d62-  | OS::Nova::Ser | CREATE_COMPLETE | 2025-06-         |
| server_group  | b685-df576f91ee56    | verGroup      |                 | 16T17:39:57Z     |
| secgroup_kube | b5a70905-c778-40de-  | OS::Neutron:: | CREATE_COMPLETE | 2025-06-         |
| _master       | b29d-a5c9561ea14c    | SecurityGroup |                 | 16T17:39:57Z     |
| api_lb        | 91c0516f-642b-439e-  | file:///usr/l | CREATE_COMPLETE | 2025-06-         |
|               | b30e-1f691f032a9c    | ib/python3/di |                 | 16T17:39:57Z     |
|               |                      | st-packages/m |                 |                  |
|               |                      | agnum/drivers |                 |                  |
|               |                      | /common/templ |                 |                  |
|               |                      | ates/lb_api.y |                 |                  |
|               |                      | aml           |                 |                  |
| network       | 7b30dafe-41ef-4b41-  | file:///usr/l | CREATE_COMPLETE | 2025-06-         |
|               | 9470-0c59149082ac    | ib/python3/di |                 | 16T17:39:57Z     |
|               |                      | st-packages/m |                 |                  |
|               |                      | agnum/drivers |                 |                  |
|               |                      | /common/templ |                 |                  |
|               |                      | ates/network. |                 |                  |
|               |                      | yaml          |                 |                  |
+---------------+----------------------+---------------+-----------------+------------------+
$ sudo wmcs-openstack stack resource show b899cc99-bec8-47d7-8eb5-b5f31027bfb4 kube_minions
+------------------------+-----------------------------------------------------+
| Field                  | Value                                               |
+------------------------+-----------------------------------------------------+
| updated_time           | 2025-06-16T17:39:57Z                                |
| creation_time          | 2025-06-16T17:39:57Z                                |
| logical_resource_id    | kube_minions                                        |
| resource_name          | kube_minions                                        |
| physical_resource_id   | 0919918e-08ae-4744-ab81-ae8e0394cf6c                |
| resource_status        | CREATE_FAILED                                       |
| resource_status_reason | OverLimit: resources.kube_minions.resources[0].reso |
|                        | urces.docker_volume:                                |
|                        | VolumeSizeExceedsAvailableQuota: Requested volume   |
|                        | or snapshot exceeds allowed gigabytes quota.        |
|                        | Requested 80G, quota is 80G and 80G has been        |
|                        | consumed. (HTTP 413) (Request-ID:                   |
|                        | req-633dd966-395c-4489-bdc3-29600b34e0ff)           |
| resource_type          | OS::Heat::ResourceGroup                             |
| links                  | [{'href': 'https://openstack.eqiad1.wikimediacloud. |
|                        | org:28004/v1/c26d9d326bdf464fa1025939ded7e5a2/stack |
|                        | s/zuul-k8s-v127-t4s2nsgmy6at/b899cc99-bec8-47d7-    |
|                        | 8eb5-b5f31027bfb4/resources/kube_minions', 'rel':   |
|                        | 'self'}, {'href': 'https://openstack.eqiad1.wikimed |
|                        | iacloud.org:28004/v1/c26d9d326bdf464fa1025939ded7e5 |
|                        | a2/stacks/zuul-k8s-v127-t4s2nsgmy6at/b899cc99-bec8- |
|                        | 47d7-8eb5-b5f31027bfb4', 'rel': 'stack'}, {'href':  |
|                        | 'https://openstack.eqiad1.wikimediacloud.org:28004/ |
|                        | v1/admin/stacks/zuul-k8s-v127-t4s2nsgmy6at-kube_min |
|                        | ions-h4k2oqa5ppnd/0919918e-08ae-4744-ab81-          |
|                        | ae8e0394cf6c', 'rel': 'nested'}]                    |
| required_by            | []                                                  |
| description            |                                                     |
| attributes             | {'refs': None, 'refs_map': None, 'attributes':      |
|                        | None, 'removed_rsrc_list': []}                      |
+------------------------+-----------------------------------------------------+

It looks like the default template requests an 80G volume (docker_volume) for each Kubernetes node, so the project needs 80G of volume quota per node.
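To make the arithmetic behind this failure concrete, here is a minimal shell sketch. It assumes that every node requests exactly one volume of the same size, which is inferred from the error message rather than confirmed against the template. The function names are hypothetical helpers, not part of any OpenStack tooling.

```shell
#!/bin/sh
# Hypothetical helper: total volume quota (in GB) needed if every node
# requests one volume of the same size.
required_quota_gb() {
    nodes=$1
    volume_gb=$2
    echo $((nodes * volume_gb))
}

# Hypothetical helper: succeeds only if the new request still fits,
# i.e. consumed + requested does not exceed the quota.
fits_quota() {
    quota_gb=$1
    consumed_gb=$2
    requested_gb=$3
    [ $((consumed_gb + requested_gb)) -le "$quota_gb" ]
}

# Values from the error above: quota 80G, 80G consumed, 80G requested.
if fits_quota 80 80 80; then
    echo "request fits"
else
    echo "request exceeds quota"
fi

# One control node plus one worker node at 80G each.
echo "needed: $(required_quota_gb 2 80)G"
```

Under that assumption, a 1 control + 1 worker cluster needs 160G of volume quota, twice what the project had, unless the per-node volume size in the template is reduced.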

I got past the quota problem temporarily by adjusting the Magnum template. The 2-node test cluster almost provisioned this time. Things blew up when OpenTofu tried to create a kubeconfig for the new cluster:

│ Error: Error building kubeconfig for openstack_containerinfra_cluster_v1 4eab7c1e-bad5-44c8-8178-1ca256a81588: Error getting certificate authority: Expected HTTP response code [200] when accessing [GET https://openstack.eqiad1.wikimediacloud.org:29511/v1/certificates/4eab7c1e-bad5-44c8-8178-1ca256a81588], but got 406 instead: {"errors": [{"request_id": "", "code": "", "status": 406, "title": "Not Acceptable", "detail": "Invalid service type for OpenStack-API-Version header", "links": []}]}
│
│   with openstack_containerinfra_cluster_v1.k8s_v127,
│   on magnum.tf line 8, in resource "openstack_containerinfra_cluster_v1" "k8s_v127":
│    8: resource "openstack_containerinfra_cluster_v1" "k8s_v127" {

The project now has:

  • 1 Kubernetes control plane instance
  • 1 Kubernetes worker instance
  • 1 web proxy exposing the Kubernetes control plane at https://zuul-k8s.wmcloud.org
  • 1 bastion host

I think the main thing left to figure out here is how to provision kubeconfig credentials on the bastion host to make things easier to debug in the cluster.

Note to self:

[23:59]  <    bd808> oh poop. now I have gitlab CI constraints to work around :/ masked and protected variables cannot contain whitespace and I have an ssh private key to get into the pipeline runtime :/
[23:59]  <thcipriani> base64
[00:00]  <    bd808> yeah, that will work. I just need to adjust some things.
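
The base64 workaround from the chat above can be sketched roughly as follows. This is a minimal illustration, not the project's actual pipeline: the variable name SSH_KEY_B64 and the function names are hypothetical, and the sample text only stands in for a real private key.

```shell
#!/bin/sh
# Encode stdin as a single base64 line with no whitespace, so the value
# can be stored in a masked CI variable.
encode_key() {
    base64 | tr -d '\n'
}

# Decode the single-line value back into the original multi-line text.
decode_key() {
    printf '%s' "$1" | base64 -d
}

# Stand-in for a multi-line private key (not a real key).
sample_key='-----BEGIN EXAMPLE KEY-----
line one
line two
-----END EXAMPLE KEY-----'

# SSH_KEY_B64 is a hypothetical name for the masked CI variable.
SSH_KEY_B64=$(printf '%s\n' "$sample_key" | encode_key)

# Inside the job, restore the key before handing it to ssh.
restored=$(decode_key "$SSH_KEY_B64")
[ "$restored" = "$sample_key" ] && echo "round trip ok"
```

In a real job the decoded value would more likely be written to a file, for example with `printf '%s' "$SSH_KEY_B64" | base64 -d > key && chmod 600 key`, where the file name is again only a placeholder.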

@taavi suggested that if the cluster used IPv6 addresses it would be possible to talk to it from the production network without using the Cloud VPS https proxy service. I have been attempting to provision a Magnum cluster with IPv6, but I fear that it is not currently possible. The problem appears to be that the Magnum template needs both the fixed network and fixed subnet to attach instances to. When I set these to VXLAN/IPv6-dualstack and vxlan-dualstack-ipv6 the instances do not seem to be able to connect to the OpenStack user_data service at http://169.254.169.254/openstack/latest/user_data. I haven't been able to find a way to pass both the IPv6 and IPv4 subnets to the template.

Maybe the next best thing would be to implement a proxy service (HAProxy?) to sit between an IPv4 k8s cluster and the prod network? I do like the idea of avoiding the shared https proxy by using IPv6 addressing instead.

If Magnum doesn't support dual-stack clusters, then I would consider that a bug that should be fixed separately.

> If Magnum doesn't support dual-stack clusters, then I would consider that a bug that should be fixed separately.

https://bugs.launchpad.net/magnum/+bug/2115298

> Maybe the next best thing would be to implement a proxy service (HAProxy?) to sit between an IPv4 k8s cluster and the prod network? I do like the idea of avoiding the shared https proxy by using IPv6 addressing instead.

Ok. This looks like the thing that I need to do next. Of course things are not as simple as "just do that with tofu". There is a good pattern to follow in https://gitlab.wikimedia.org/repos/cloud/metricsinfra/tofu-provisioning, but to apply it in the zuul project I am also going to need to introduce a project local puppetserver so I can do ops/puppet.git related work without getting blocked on upstream merges. So the next block of work here is something like:

  • add tofu to provision a puppetserver
  • add tofu to make instances use the local puppetserver
  • write a profile::zuul::haproxy module (bikeshedding will probably happen on the namespace there)
  • add tofu to make an IPv6 addressable haproxy instance using the new Puppet manifest
  • profit!

I filed T397994: [tofu-cloudvps] Document using `cloudvps_puppet_project` to manage project-wide and instance specific puppet classes and hiera settings. Until there is a fix for that, the Project Puppet settings will need to be managed via Hiera.

This turned out to be a docs-only issue. Tofu is now in control of the settings.

Change #1166006 had a related patch set uploaded (by BryanDavis; author: Bryan Davis):

[operations/puppet@production] zuul: Add profile::zuul::haproxy for Cloud VPS project

https://gerrit.wikimedia.org/r/1166006

Change #1166263 had a related patch set uploaded (by BryanDavis; author: Bryan Davis):

[operations/puppet@production] gitlab: Allow WMCS runners to talk to puppet-enc.cloudinfra

https://gerrit.wikimedia.org/r/1166263

Progress!

$ ping6 -c3 k8s-api.svc.zuul.eqiad1.wikimedia.cloud
PING6(56=40+8+8 bytes) 2001:470:b:530:24da:9d14:4e1d:40e0 --> 2a02:ec80:a000:1::2e8
16 bytes from 2a02:ec80:a000:1::2e8, icmp_seq=0 hlim=54 time=79.734 ms
16 bytes from 2a02:ec80:a000:1::2e8, icmp_seq=1 hlim=54 time=76.439 ms
16 bytes from 2a02:ec80:a000:1::2e8, icmp_seq=2 hlim=54 time=77.780 ms

--- k8s-api.svc.zuul.eqiad1.wikimedia.cloud ping6 statistics ---
3 packets transmitted, 3 packets received, 0.0% packet loss
round-trip min/avg/max/std-dev = 76.439/77.984/79.734/1.353 ms
$ curl -6k https://k8s-api.svc.zuul.eqiad1.wikimedia.cloud:6443
{
  "kind": "Status",
  "apiVersion": "v1",
  "metadata": {},
  "status": "Failure",
  "message": "forbidden: User \"system:anonymous\" cannot get path \"/\"",
  "reason": "Forbidden",
  "details": {},
  "code": 403
}

These tests were run from my local laptop. The k8s-api.svc.zuul.eqiad1.wikimedia.cloud service name points at an HAProxy instance operating in Layer 4 mode. That reverse proxy connects via leastconn balancing to the active Magnum-managed Kubernetes cluster master nodes. I haven't found a way to add additional SANs to the X.509 certificate that is generated for the service, so client validation of the certificate isn't really possible at this point.
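
For readers unfamiliar with this setup, the following shell sketch prints a minimal HAProxy configuration fragment for a Layer 4 (TCP) proxy with leastconn balancing. It is an illustration only and not the contents of profile::zuul::haproxy: the function name, the frontend and backend names, and the 192.0.2.10 address are all hypothetical, and global and defaults sections (including timeouts) are omitted.

```shell
#!/bin/sh
# Print a minimal HAProxy fragment that forwards raw TCP (Layer 4)
# to the given backends using leastconn balancing.
# Usage: render_haproxy_cfg PORT BACKEND_ADDR...
render_haproxy_cfg() {
    port=$1
    shift
    cat <<EOF
frontend k8s_api
    mode tcp
    bind :::${port} v4v6
    default_backend k8s_masters

backend k8s_masters
    mode tcp
    balance leastconn
EOF
    i=0
    for addr in "$@"; do
        echo "    server master${i} ${addr}:${port} check"
        i=$((i + 1))
    done
}

# The address is a documentation placeholder, not a real node in the project.
render_haproxy_cfg 6443 192.0.2.10
```

Because the proxy runs in TCP mode it passes the TLS session through untouched, so clients still see the certificate generated for the cluster, which is why the missing SANs matter.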

Change #1166263 merged by Dzahn:

[operations/puppet@production] gitlab: Allow WMCS runners to talk to puppet-enc.cloudinfra

https://gerrit.wikimedia.org/r/1166263

Restarted ferm on all WMCS runners and verified that they now have the iptables rule for enc-1.cloudinfra.eqiad1.wikimedia.cloud tcp dpt:https.

Change #1166006 merged by Dzahn:

[operations/puppet@production] zuul: Add profile::zuul::haproxy for Cloud VPS project

https://gerrit.wikimedia.org/r/1166006

TODO: request the config changes needed from WMCS to allow the zuul project to use more Ceph IOPS, similar to T406271: Grant gitlab-runners-staging access to fast-iops volume type and a 4xiops instance flavor.