
Routed Ganeti: same node DHCP limitation
Closed, Resolved (Public)

Description

Context's context: https://phabricator.wikimedia.org/phame/post/view/312/ganeti_on_modern_network_design/
and T300152#9502030

Context:
Currently, when a VM is started in a routed Ganeti cluster, an isc-dhcp-relay daemon is brought up to relay DHCP queries between the VM and the configured install server.
This is done on that line: https://gerrit.wikimedia.org/r/plugins/gitiles/operations/puppet/+/refs/heads/production/modules/profile/templates/ganeti/net-common.erb#205

Unfortunately, isc-dhcp-relay requires both -U (the DHCP-server-facing interface) and -i (the VM-facing interface) to be specified for our use case. Omitting them works in theory, but giaddr is then set to the VM-facing IP instead of the IP of the node's external interface.
In other circumstances that would make sense, but for isc-dhcpd to hand out /32 IPs we need giaddr set to the node's IP address.

When giaddr is set to the VM-facing NIC:
2025-06-13T10:01:54.400276+00:00 install7002 dhcpd[257748]: DHCPDISCOVER from aa:00:00:50:bc:fb via 10.140.2.1: network 10.140.2.0/24: no free leases
When giaddr is set to the hypervisor's main IP:
2025-06-12T11:44:34.692886+00:00 install7002 dhcpd[173704]: DHCPDISCOVER from aa:00:00:90:40:c0 via 10.140.0.11
2025-06-12T11:44:34.693070+00:00 install7002 dhcpd[173704]: DHCPOFFER on 10.140.2.7 to aa:00:00:90:40:c0 via 10.140.0.11

This has been working fine... until now.

While setting up magru, we ended up in a situation where the install server (running the DHCP server) was on the same hypervisor as the VM being created.

In that case, the reply from the install server logically enters through the install server's tap interface, and not through the expected eno12399np0 interface defined with the -U flag of dhcrelay.

This causes dhcrelay to ignore the reply.

2025-06-13T07:43:06.703195+00:00 ganeti7001 dhcrelay: Forwarded BOOTREQUEST for aa:00:00:50:bc:fb to 195.200.68.100
2025-06-13T07:43:06.703265+00:00 ganeti7001 dhcrelay: Dropping reply received on tap1
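
The failure signature in the two log lines above (a forwarded request followed by a dropped reply) can be spotted mechanically. The following is only a minimal sketch: the regular expressions are inferred from these two sample lines, and the function name is made up for illustration.

```python
import re

# Patterns inferred from the two dhcrelay log lines quoted above (assumption:
# other dhcrelay versions may word these messages differently).
FORWARDED = re.compile(
    r"dhcrelay: Forwarded BOOTREQUEST for (?P<mac>[0-9a-f:]{17}) to (?P<server>\S+)"
)
DROPPED = re.compile(r"dhcrelay: Dropping reply received on (?P<iface>\S+)")


def find_dropped_replies(lines):
    """Return (mac, server, iface) for each forwarded request whose reply was dropped."""
    results = []
    last_request = None
    for line in lines:
        match = FORWARDED.search(line)
        if match:
            last_request = (match.group("mac"), match.group("server"))
            continue
        match = DROPPED.search(line)
        if match and last_request is not None:
            results.append((*last_request, match.group("iface")))
            last_request = None
    return results


SAMPLE = [
    "2025-06-13T07:43:06.703195+00:00 ganeti7001 dhcrelay: Forwarded BOOTREQUEST for aa:00:00:50:bc:fb to 195.200.68.100",
    "2025-06-13T07:43:06.703265+00:00 ganeti7001 dhcrelay: Dropping reply received on tap1",
]
print(find_dropped_replies(SAMPLE))
```

A reply dropped on a tap interface rather than on the uplink is the hint that the DHCP server lives on the same hypervisor as the relay.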

In the short term, to finish setting up magru, possible workarounds are to migrate the install server or the VM being set up to a different hypervisor of the same cluster, or to manually run the proper dhcrelay command.

There are multiple ways of fixing it.
For example, we could run a bash or python script at VM creation time to check whether the configured install server is on the same hypervisor and, if so, get its tap interface for the dhcrelay configuration. But this seems a bit brittle.
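
To make the script idea concrete, here is a rough sketch of the decision it would have to take. Everything in it is hypothetical: the function name, the `local_taps` mapping (VM name to tap interface on this hypervisor) and the "tap4" entry are illustrative, and how that mapping would be collected from Ganeti is deliberately left out.

```python
def upstream_interface(install_server, local_taps, default_iface):
    """Pick the interface dhcrelay should treat as server-facing.

    local_taps maps the names of VMs running on this hypervisor to their
    tap interface (hypothetical input, gathered by some other means).
    If the install server is one of them, its tap is the server-facing
    interface; otherwise the node's normal uplink is used.
    """
    return local_taps.get(install_server, default_iface)


# Install server runs on this hypervisor: replies arrive on its tap.
taps = {"install7002": "tap1", "testvm7001": "tap4"}
print(upstream_interface("install7002", taps, "eno12399np0"))

# Install server runs elsewhere: replies arrive on the uplink.
print(upstream_interface("install7002", {}, "eno12399np0"))
```

The lookup itself is trivial; the fragile part is keeping `local_taps` accurate, since the answer is only computed once at VM creation time.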

I think the most reliable way is to have a better DHCP relay daemon.

I looked around at the options once more, and a few stand out:

Event Timeline

Change #1180724 had a related patch set uploaded (by Ayounsi; author: Ayounsi):

[operations/puppet@production] Create temp test VM in magru

https://gerrit.wikimedia.org/r/1180724

Change #1180724 merged by Ayounsi:

[operations/puppet@production] Create temp test VM in magru

https://gerrit.wikimedia.org/r/1180724

Thanks to Simon, the developer of dnsmasq, the latest version of dnsmasq handles our use case. Thanks a lot to him for implementing it pro bono!
The test command used was sudo ./dnsmasq --no-daemon --port 0 --dhcp-split-relay=10.140.2.1,195.200.68.100,10.140.0.11 --log-dhcp and it should be possible to add extra --dhcp-split-relay arguments for the public and sandbox vlans.
It's currently on the v2.92test21 tag (available at https://thekelleys.org.uk/dnsmasq/test-releases/)
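
As an illustration of adding one --dhcp-split-relay argument per vlan, the sketch below builds the command line from a list of tuples. The field order (VM-facing IP, DHCP server IP, node IP) is an assumption read off the IPs in the test command above, not taken from the dnsmasq documentation, and the function name is made up; the syntax in the final 2.92 release should be checked against its man page.

```python
def dnsmasq_relay_argv(relays, binary="./dnsmasq"):
    """Build a dnsmasq command line with one --dhcp-split-relay per vlan.

    relays is a list of (vm_facing_ip, dhcp_server_ip, node_ip) tuples.
    The meaning of each field is an assumption based on the test command.
    """
    argv = [binary, "--no-daemon", "--port", "0"]
    for vm_facing_ip, dhcp_server_ip, node_ip in relays:
        argv.append(f"--dhcp-split-relay={vm_facing_ip},{dhcp_server_ip},{node_ip}")
    argv.append("--log-dhcp")
    return argv


# Reproduces the test command (without sudo) for the single private vlan.
private = ("10.140.2.1", "195.200.68.100", "10.140.0.11")
print(" ".join(dnsmasq_relay_argv([private])))
```

Extra vlans would simply be further tuples in the list, each producing its own --dhcp-split-relay argument.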

Next steps:

  1. Wait for v2.92 to be released soon-ish, then either package it ourselves for Bookworm or backport it from https://packages.debian.org/sid/dnsmasq
  2. Puppetize it
  3. Test it further

Change #1181505 had a related patch set uploaded (by Ayounsi; author: Ayounsi):

[operations/puppet@production] Routed Ganeti: switch to dnsmasq for DHCP relay

https://gerrit.wikimedia.org/r/1181505

Mentioned in SAL (#wikimedia-operations) [2026-01-12T11:09:12Z] <moritzm> uploaded dnsmasq 2.92-rc3 to bookworm-wikimedia/main T396864

Change #1181505 merged by Ayounsi:

[operations/puppet@production] Routed Ganeti: switch to dnsmasq for DHCP relay

https://gerrit.wikimedia.org/r/1181505

Change #1225561 had a related patch set uploaded (by Ayounsi; author: Ayounsi):

[operations/puppet@production] Routed ganeti: ensure dnsmasq is installed before being used

https://gerrit.wikimedia.org/r/1225561

Change #1225561 merged by Ayounsi:

[operations/puppet@production] Routed ganeti: ensure dnsmasq is installed before being used

https://gerrit.wikimedia.org/r/1225561

Since nothing on Bookworm uses dnsmasq, we've uploaded dnsmasq 2.92-rc3 to "main".

We'd like to do the same for trixie as well, but there are two classes of hosts which currently use 2.91 on trixie:

  • cloudvirt* hosts use dnsmasq-base, which seems to get pulled in by libvirt-daemon-driver-network
  • cloudnet* hosts use dnsmasq-utils, which gets pulled in by neutron-dhcp-agent

Change #1226239 had a related patch set uploaded (by Muehlenhoff; author: Muehlenhoff):

[operations/puppet@production] ganeti/magru: Switch to dnsmasq as the DHCP relay

https://gerrit.wikimedia.org/r/1226239

Change #1226240 had a related patch set uploaded (by Muehlenhoff; author: Muehlenhoff):

[operations/puppet@production] ganeti/esams: Switch to dnsmasq as the DHCP relay

https://gerrit.wikimedia.org/r/1226240

Change #1226239 merged by Ayounsi:

[operations/puppet@production] ganeti/magru: Switch to dnsmasq as the DHCP relay

https://gerrit.wikimedia.org/r/1226239

Change #1226240 merged by Muehlenhoff:

[operations/puppet@production] ganeti/esams: Switch to dnsmasq as the DHCP relay

https://gerrit.wikimedia.org/r/1226240

Change #1226285 had a related patch set uploaded (by Muehlenhoff; author: Muehlenhoff):

[operations/puppet@production] Unconditionally use dnsmasq on routed Ganeti

https://gerrit.wikimedia.org/r/1226285

Change #1226285 merged by Ayounsi:

[operations/puppet@production] Unconditionally use dnsmasq on routed Ganeti

https://gerrit.wikimedia.org/r/1226285

ayounsi claimed this task.

This problem is now solved thanks to the most recent dnsmasq.

Mentioned in SAL (#wikimedia-operations) [2026-01-23T12:32:44Z] <moritzm> uploaded dnsmasq 2.92-1~wmf12u to bookworm-wikimedia/main T396864

I'm reopening since we still need to future-proof this by also getting dnsmasq 2.92 into trixie-wikimedia

I've also just uploaded a bookworm backport of the final 2.92 release (instead of the -rc3 we previously ran)

Mentioned in SAL (#wikimedia-operations) [2026-02-05T09:04:23Z] <moritzm> update hosts running routed Ganeti to dnsmasq 2.92-1~wmf12u1 T396864

Mentioned in SAL (#wikimedia-operations) [2026-02-10T16:40:10Z] <moritzm> uploaded dnsmasq 2.92-1~wmf12 to trixie-wikimedia/main T396864

Mentioned in SAL (#wikimedia-operations) [2026-02-12T14:37:55Z] <moritzm> upgrading cloudvirt* to dnsmasq 2.92 T396864

Mentioned in SAL (#wikimedia-operations) [2026-02-12T14:49:03Z] <moritzm> upgrading cloudnet* to dnsmasq 2.92 T396864

We're now on dnsmasq 2.92 across the fleet and I've uploaded the builds for bookworm and trixie to "main", i.e. they are now the default.

Change #1255718 had a related patch set uploaded (by Muehlenhoff; author: Muehlenhoff):

[operations/puppet@production] Remove testvm7001

https://gerrit.wikimedia.org/r/1255718

cookbooks.sre.hosts.decommission executed by jmm@cumin2002 for hosts: testvm7001.magru.wmnet

  • testvm7001.magru.wmnet (PASS)
    • Downtimed host on Icinga/Alertmanager
    • Found Ganeti VM
    • VM shutdown
    • Started forced sync of VMs in Ganeti cluster magru03 to Netbox
    • Removed from DebMonitor
    • Removed from Puppet server and PuppetDB
    • VM removed
    • Started forced sync of VMs in Ganeti cluster magru03 to Netbox

Change #1255718 merged by Muehlenhoff:

[operations/puppet@production] Remove testvm7001

https://gerrit.wikimedia.org/r/1255718