Please create two Ganeti VMs for Wikidough in eqsin
Closed, Resolved (Public)

Description

Please create two Ganeti VMs for Wikidough, with the following identical parameters in eqsin.

Specifications:

Hostname: doh5001, doh5002 (eqsin)
vCPUs: 2
Memory: 8G
Disk: 10G
Network: Public

Event Timeline

Change 698014 had a related patch set uploaded (by Ssingh; author: Ssingh):

[operations/puppet@production] acme_chief: authorize doh500* hosts for Wikidough

https://gerrit.wikimedia.org/r/698014

Change 698013 had a related patch set uploaded (by Ssingh; author: Ssingh):

[operations/puppet@production] site: add wikidough eqsin with insetup role

https://gerrit.wikimedia.org/r/698013

Change 698013 merged by Dzahn:

[operations/puppet@production] site: add wikidough eqsin with insetup role

https://gerrit.wikimedia.org/r/698013

doh5001 has been created, but doh5002 hit a resource limit here as well. We only requested a 10G disk, so the limiting resource is probably something else:

dzahn@cumin1001:~$ sudo cookbook sre.ganeti.makevm --vcpus 2 --memory 8 --disk 10 --network public eqsin doh5002
Ready to create Ganeti VM doh5002.wikimedia.org in the ganeti01.svc.eqsin.wmnet cluster on row 1 with 2 vCPUs, 8GB of RAM, 10GB of disk in the public network.
>>> Is this correct?
Type "go" to proceed or "abort" to interrupt the execution
> go
START - Cookbook sre.ganeti.makevm for new host doh5002.wikimedia.org
Exception raised while executing cookbook sre.ganeti.makevm:
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/spicerack/_menu.py", line 234, in run
    raw_ret = runner.run()
  File "/srv/deployment/spicerack/cookbooks/sre/ganeti/makevm.py", line 126, in run
    ip_v4_data = prefix_v4.available_ips.create({})
  File "/usr/lib/python3/dist-packages/pynetbox/core/endpoint.py", line 445, in create
    req = Request(**self.request_kwargs).post(data)
  File "/usr/lib/python3/dist-packages/pynetbox/core/query.py", line 389, in post
    return self._make_call(verb="post", data=data)
  File "/usr/lib/python3/dist-packages/pynetbox/core/query.py", line 268, in _make_call
    raise AllocationError(req)
pynetbox.core.query.AllocationError: The requested allocation could not be fulfilled.
END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host doh5002.wikimedia.org

@ssingh @BBlack It looks like the resource we are short of here is public IPs:

  File "/usr/lib/python3/dist-packages/spicerack/_menu.py", line 234, in run
...
  File "/srv/deployment/spicerack/cookbooks/sre/ganeti/makevm.py", line 126,
    ip_v4_data = prefix_v4.available_ips.create({})
  File "/usr/lib/python3/dist-packages/pynetbox/core/endpoint.py", line 445,
...
  File "/usr/lib/python3/dist-packages/pynetbox/core/query.py", line 389, in post
    return self._make_call(verb="post", data=data)
...
    raise AllocationError(req)
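
To illustrate why a small public subnet runs out of addresses so quickly, here is a minimal sketch using only Python's standard library. It assumes the public prefix is the /28 containing 103.102.166.5 (the address discussed further down in this task). The allocation state is invented for illustration and is not taken from Netbox.

```python
import ipaddress

# The /28 is inferred from the address 103.102.166.5/28 quoted in this task.
prefix = ipaddress.ip_interface("103.102.166.5/28").network

# hosts() excludes the network and broadcast addresses.
usable = list(prefix.hosts())

# Hypothetical state: every usable address already has a record
# (active or reserved), which is the situation the cookbook ran into.
allocated = set(usable)

free = [ip for ip in usable if ip not in allocated]

print(f"prefix: {prefix}")
print(f"usable host addresses: {len(usable)}")
print(f"free addresses: {len(free)}")
```

With only 14 usable host addresses in a /28, a handful of hosts plus a few reservations is enough to leave nothing for automatic allocation.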

Change 698047 had a related patch set uploaded (by Dzahn; author: Dzahn):

[operations/puppet@production] DHCP: add doh5001 MAC, add doh[2345] to partman regex

https://gerrit.wikimedia.org/r/698047

colewhite triaged this task as Medium priority. Jun 3 2021, 10:42 PM

Change 698047 merged by Dzahn:

[operations/puppet@production] DHCP: add doh5001 MAC, add doh[2345] to partman regex

https://gerrit.wikimedia.org/r/698047
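
As a rough illustration of what adding doh[2345] to a hostname regex achieves, the sketch below checks a few hostnames against a pattern built from that character class. Only the `doh[2345]` prefix comes from the commit message. The name `PARTMAN_HOST_RE`, the anchors, and the three trailing digits are assumptions made for this example and may not match the real partman configuration.

```python
import re

# Hypothetical pattern: only "doh[2345]" is taken from the commit message;
# the anchors and the three trailing digits are assumptions.
PARTMAN_HOST_RE = re.compile(r"^doh[2345]\d{3}$")

for host in ("doh5001", "doh5002", "doh1001"):
    matched = bool(PARTMAN_HOST_RE.match(host))
    print(f"{host}: {'matches' if matched else 'no match'}")
```

Under these assumptions both eqsin hosts match, while a host whose first digit falls outside 2 to 5 does not.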

@ssingh doh5001.wikimedia.org is ready for you now. doh5002 on hold for lack of IP in that subnet.

Thanks @Dzahn! Adding @ayounsi to the task as well so that he knows, in case the ping from bblack was missed.

Change 698014 merged by Ssingh:

[operations/puppet@production] acme_chief: authorize doh5001 host for Wikidough

https://gerrit.wikimedia.org/r/698014

@ssingh Should we keep this ticket open and make a new one "get a free IP in eqsin"?

@MoritzMuehlenhoff Is it possible to decom bast5001, or at least move it to the private VLAN?

@ssingh Otherwise, you can use 103.102.166.5 so that we don't block you.

@ayounsi I don't know how we could use that IP for a VM. The cookbook that creates a VM simply fails when it cannot get a free IP, and there is no longer any option (or expectation) for the user to specify which IP to use.

I think we can simply decom bast5001 for now (it can still be reinstalled under a new name later), but let's hear if @BBlack has some objection.

Indeed, you can delete the reserved one from https://netbox.wikimedia.org/ipam/prefixes/28/ip-addresses/

But if bast5001's IP can be reused, that seems cleaner to me.

Ah, gotcha! Yes, thank you for moving this forward.

Mentioned in SAL (#wikimedia-operations) [2021-08-05T13:39:43Z] <mutante> deleted reserved (not active) IP 103.102.166.5/28 from netbox (T284246)

Change 710269 had a related patch set uploaded (by Dzahn; author: Dzahn):

[operations/puppet@production] acme_chief: allow doh5002 to request wikidough certs

https://gerrit.wikimedia.org/r/710269

Change 710271 had a related patch set uploaded (by Dzahn; author: Dzahn):

[operations/puppet@production] DHCP: add MAC address of doh5002

https://gerrit.wikimedia.org/r/710271

Change 710271 merged by Dzahn:

[operations/puppet@production] DHCP: add MAC address of doh5002

https://gerrit.wikimedia.org/r/710271

Change 710298 had a related patch set uploaded (by Dzahn; author: Dzahn):

[operations/puppet@production] site: add doh5002 with insetup role

https://gerrit.wikimedia.org/r/710298

Change 710269 merged by Dzahn:

[operations/puppet@production] acme_chief: allow doh5002 to request wikidough certs

https://gerrit.wikimedia.org/r/710269

Change 710298 merged by Dzahn:

[operations/puppet@production] site: add doh5002 with insetup role

https://gerrit.wikimedia.org/r/710298

I deleted the reserved IP mentioned above and then could run the cookbook again.
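
The effect of that deletion can be modelled with a small, self-contained sketch. It is not the cookbook's or Netbox's actual logic: `allocate_next`, the `records` dictionary, and the local `AllocationError` class are hypothetical stand-ins, and the sketch assumes that a reserved record blocks automatic allocation in the same way an active one does.

```python
import ipaddress


class AllocationError(Exception):
    """Local stand-in for the error in the traceback; not pynetbox's class."""


def allocate_next(prefix, records):
    """Return the first host address without a record and mark it active."""
    for ip in prefix.hosts():
        if ip not in records:
            records[ip] = "active"
            return ip
    raise AllocationError("The requested allocation could not be fulfilled.")


prefix = ipaddress.ip_network("103.102.166.0/28")

# Hypothetical state: all 14 host addresses have a record, one only reserved.
records = {ip: "active" for ip in prefix.hosts()}
reserved = ipaddress.ip_address("103.102.166.5")
records[reserved] = "reserved"

try:
    allocate_next(prefix, records)
    first_attempt = "allocated"
except AllocationError:
    first_attempt = "failed"

# Deleting the reserved record frees the address for the next attempt.
del records[reserved]
second_attempt = allocate_next(prefix, records)

print(first_attempt)
print(second_attempt)
```

In this model the first attempt fails because every address has a record, and the second attempt picks up the address that was previously only reserved.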

The VM has now been created and added to DHCP, and the OS is installed. The puppet "insetup" role is being applied right now, and the new host was added to the regex that allows it to request certs from acme_chief.

I'll leave applying the actual wikidough role to you, @ssingh.

Thanks very much for the help, @Dzahn!