Problem
It seems some of the config we have deployed on the new switches in codfw is preventing hosts from being reimaged on the public vlans in those rows.
Specifically, the problem stems from the decision not to configure a unique IP on each switch in these vlans, in order to conserve IP address space. As a reminder, the normal "anycast gw" config we have for vlans that traverse multiple boxes looks like this (note the GW IP is configured as a 'virtual-gateway-address'):
set interfaces irb unit 2018 virtual-gateway-accept-data
set interfaces irb unit 2018 description "Subnet private1-b-codfw"
set interfaces irb unit 2018 family inet address 10.192.16.84/22 preferred
set interfaces irb unit 2018 family inet address 10.192.16.84/22 virtual-gateway-address 10.192.16.1
To avoid having to allocate an IP from a public vlan across every top-of-rack switch in the fabric, we use a different config for the public vlan gateways, simply configuring the gateway IP as the main IP on every switch:
set interfaces irb unit 2002 description "Subnet public1-b-codfw"
set interfaces irb unit 2002 family inet address 208.80.153.33/27
The problem arises when the switch relays DHCP on the public vlan. The switch sources the relayed packets from the single, shared IP, and they duly reach the install server. In most cases, however, the reply is not routed back to the correct switch: when it hits the spine layer, the spine has a route to that IP from every leaf in the row, and picks one randomly.
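To make the failure mode concrete, here is a minimal Python sketch of the reply path. It models the spine's "random" choice as hash-based ECMP over the flow tuple, which is an assumption about how the spine picks a next hop. The leaf names and the install server address (taken from the 192.0.2.0/24 documentation range) are hypothetical; only the shared gateway IP and the count of 8 switches per row come from the description above.

```python
import hashlib

# Hypothetical leaf names; the task mentions 8 switches per row.
LEAVES = [f"leaf{i}" for i in range(1, 9)]
SHARED_GW = "208.80.153.33"        # gateway IP configured on every leaf
INSTALL_SERVER = "192.0.2.10"      # hypothetical install server address


def spine_pick_leaf(src_ip, dst_ip, src_port, dst_port, next_hops):
    """Assumed hash-based ECMP: choose a next hop from the flow tuple."""
    key = f"{src_ip}|{dst_ip}|{src_port}|{dst_port}".encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:4], "big") % len(next_hops)
    return next_hops[index]


def reply_reaches_relay(relaying_leaf):
    """Return True if the server's reply lands on the leaf that relayed.

    Every leaf sources its relayed packets from SHARED_GW, so the reply
    has the same flow tuple no matter which leaf sent the request.
    """
    chosen = spine_pick_leaf(INSTALL_SERVER, SHARED_GW, 67, 67, LEAVES)
    return chosen == relaying_leaf


if __name__ == "__main__":
    for leaf in LEAVES:
        status = "ok" if reply_reaches_relay(leaf) else "reply lost"
        print(f"{leaf}: {status}")
```

Under this model the reply always lands on the same leaf, so a reimage only succeeds for hosts attached to that one switch. With truly random per-packet selection the odds would instead be roughly 1 in 8 per reply. Either way, the shared source IP is what makes the return path ambiguous.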
Potential Solutions
There are two potential solutions I could think of:
- Configure the public IRB GWs like we do the private, and allocate an additional, separate IP for each switch which
- If we do this for all 8 switches in each row that's a lot of waste
- We could also configure the GW only at the Spine layer, but that makes routing quite inefficient and means the L3 GW is not on the directly connected device, which brings potential layer-2 complexities
- Configure the switch to use a different IP when relaying DHCP requests to the install server
On the latter there do appear to be config options that allow this:
set routing-instances PRODUCTION forwarding-options dhcp-relay group dhcp_relay source-ip-change
The above config causes the packet to be sent from the switch's lo0.5000 IP, which is unique to each device. Right now those requests are blocked by the install server in iptables, but I also suspect that ISC DHCPd might not respond if the DHCP packet comes from a subnet it's not configured for.
Overall the second option seems better and more scalable if it can be made to work.
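For reference, the address cost of the first option can be put in numbers. The sketch below uses the public1-b-codfw interface address and the figure of 8 switches per row from above. The function name and the returned keys are illustrative only, and it assumes one extra unique IP per switch in addition to the shared gateway IP.

```python
import ipaddress


def gateway_address_cost(interface_cidr, switches_per_row):
    """Compare gateway IP usage for the two IRB config styles."""
    network = ipaddress.ip_interface(interface_cidr).network
    # Exclude the network and broadcast addresses.
    usable = network.num_addresses - 2
    # Current public-vlan style: one shared IP on every switch.
    shared_only = 1
    # Private-vlan style: shared virtual gateway plus one IP per switch.
    unique_per_switch = switches_per_row + 1
    return {
        "network": str(network),
        "usable": usable,
        "shared_only": shared_only,
        "unique_per_switch": unique_per_switch,
        "extra_fraction": (unique_per_switch - shared_only) / usable,
    }


if __name__ == "__main__":
    cost = gateway_address_cost("208.80.153.33/27", 8)
    print(cost)
```

On a /27 this means 9 of the 30 usable addresses would go to gateways rather than 1, which is the waste the first option refers to.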