Right now we define a public subnet that instances can pull floating IPs from, and which also holds the overload NAT IP (18.104.22.168) used by instances that have no directly mapped address. This would normally default to the interface address of the labnet host itself, and it does so in the Neutron case as well.
|Open||None||T53494 Use Beta cluster as a true canary for code deployments (epic)|
|Open||None||T87220 Minimize infrastructure differences between Beta Cluster and production|
|Open||None||T196662 Set up LVS in beta like prod|
|Resolved||bd808||T166396 Program 1 Outcome 4: VPS hosting|
|Resolved||None||T167293 Nova-network to Neutron migration|
|Resolved||aborrero||T168580 Neutron implementation of routing_source_ip definition|
|Resolved||aborrero||T195786 cloudvps: labtestn: neutron issue with vxlan and l2population|
I am reopening for a bit of investigation on existing functionality. The currently set routing_source_ip does not appear in the router namespace as a secondary address. There is concern that this may have undesired effects.
The currently desired functionality here (there is overlap with T167357 as the two are intertwined):
- NAT outbound to the overload IP (routing_source_ip) if an instance with a fixed_address has no DNAT assignment
- Note: the destination cannot be excluded by dmz_cidr which preserves source IP (even if DNAT is specified)
- NAT to the DNAT IP if a floating ip is indeed assigned bidirectionally
- Note: the destination cannot be excluded by dmz_cidr which preserves source IP even if DNAT is specified -- this is maybe not ideal.
- Do not NAT if sourcing traffic from the virtual router itself
- This means pings from the namespace on labtestneutron2001 (if active)
- Floating IP NAT is bidirectional
- Router itself preserves external source IP (the qg* interface IP)
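The desired behaviour listed above can be modelled as a small decision function. This is only a sketch for reasoning about the cases, not the l3-agent code: the function name, the instance, floating, router and dmz_cidr addresses are all hypothetical, and only 172.16.129.254 is taken from this task.

```python
import ipaddress


def outbound_source(src, dst, floating_ips, dmz_cidrs, routing_source_ip, router_ips=()):
    """Return the source address a packet would leave with (illustrative model)."""
    # Traffic sourced from the virtual router itself is not NATed.
    if src in router_ips:
        return src
    # A destination inside dmz_cidr keeps the original source,
    # even when the instance has a floating IP.
    dst_ip = ipaddress.ip_address(dst)
    if any(dst_ip in ipaddress.ip_network(cidr) for cidr in dmz_cidrs):
        return src
    # Otherwise use the floating IP if one is mapped, else the overload IP.
    return floating_ips.get(src, routing_source_ip)


# Hypothetical values, except OVERLOAD which is the routing_source_ip quoted in this task.
FLOATING = {"172.16.128.10": "198.51.100.5"}
DMZ = ["10.0.0.0/8"]
OVERLOAD = "172.16.129.254"
ROUTER = ("192.0.2.1",)

for src, dst in [
    ("172.16.128.20", "203.0.113.9"),  # no floating IP, external destination
    ("172.16.128.10", "203.0.113.9"),  # floating IP, external destination
    ("172.16.128.10", "10.64.0.1"),    # floating IP, dmz_cidr destination
    ("192.0.2.1", "203.0.113.9"),      # traffic from the router itself
]:
    print(src, "->", dst, "leaves as",
          outbound_source(src, dst, FLOATING, DMZ, OVERLOAD, ROUTER))
```

The third case is the one flagged as "maybe not ideal": the dmz_cidr check wins over the floating IP mapping.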
At the moment this all seems to work, but it's possible there are situations where the routing_source_ip not being assigned is going to cause issues. It works now because the NAT transform works with the transit routing, i.e. the route to the supernet assigned to Neutron instances includes the floating IP block and there is no need for ARP. However, we manually assign the similarly used IP in the nova-network case in eqiad to the loopback. We could do the same in the l3-agent code within the virtual router namespace for the IP assigned for routing_source_ip if needed.
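If we did want the address bound inside the router namespace, the manual equivalent would be an `ip address add` on the loopback run through `ip netns exec`. The helper below only builds that command line so it can be reviewed; it does not run anything. The helper name, the router UUID and the /32 prefix length are assumptions for illustration, and this is not how the l3-agent currently does anything.

```python
from typing import List


def loopback_assign_cmd(router_id: str, address: str) -> List[str]:
    """Build (not run) the command that would bind address to lo in a router namespace."""
    # Neutron names router namespaces "qrouter-<router uuid>".
    namespace = f"qrouter-{router_id}"
    # /32 is an assumption: a host address on loopback, no on-link subnet implied.
    return ["ip", "netns", "exec", namespace,
            "ip", "address", "add", f"{address}/32", "dev", "lo"]


# Hypothetical router UUID; 172.16.129.254 is the routing_source_ip from this task.
cmd = loopback_assign_cmd("00000000-0000-0000-0000-000000000000", "172.16.129.254")
print(" ".join(cmd))
```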
Chase suggests that:
there is another thing that I think should probably work differently. i.e. dmz_cidr exclusions matching before floating IP outbound.
This could be reordering how rules are generated in modules/openstack/files/mitaka/neutron/l3/router_info.py.
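To make the ordering question concrete, here is a toy first-match evaluator showing how the same three rules give a different result for a floating-IP instance talking to a dmz_cidr destination, depending on which rule comes first. It is a sketch only: the rule tuples, names and all addresses except 172.16.129.254 are hypothetical, and it does not mirror the actual structure of router_info.py.

```python
import ipaddress

DMZ_NET = ipaddress.ip_network("10.0.0.0/8")      # hypothetical dmz_cidr
FLOATING_MAP = {"172.16.128.10": "198.51.100.5"}   # hypothetical fixed -> floating mapping
OVERLOAD_IP = "172.16.129.254"                     # routing_source_ip from this task

# Each rule is (label, match predicate, source rewrite).
dmz_rule = ("dmz_cidr", lambda s, d: ipaddress.ip_address(d) in DMZ_NET, lambda s: s)
floating_rule = ("floating", lambda s, d: s in FLOATING_MAP, lambda s: FLOATING_MAP[s])
overload_rule = ("overload", lambda s, d: True, lambda s: OVERLOAD_IP)


def first_match(rules, src, dst):
    """Apply the first rule that matches, the way a NAT chain is walked top to bottom."""
    for label, matches, rewrite in rules:
        if matches(src, dst):
            return label, rewrite(src)
    return "none", src


dmz_first = [dmz_rule, floating_rule, overload_rule]       # behaviour described in this task
floating_first = [floating_rule, dmz_rule, overload_rule]  # one possible reordering

packet = ("172.16.128.10", "10.64.0.1")  # floating-IP instance -> dmz_cidr destination
print("dmz first:     ", first_match(dmz_first, *packet))
print("floating first:", first_match(floating_first, *packet))
```

With the dmz_cidr rule first the instance keeps its fixed address; with the floating rule first it leaves as its floating IP.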
Question: why isn't 172.16.129.254 (AKA routing_source_ip) assigned to any interface?
The core router has a static route which reads 172.16.128.0/21 nexthop 10.192.22.4. The machine with 10.192.22.4 is labtestneutron2001, which is the server that did the SNAT and that undoes it when a packet arrives at PREROUTING, before doing any further routing.
The packet travels over vlan 2120, which is cloud-transport1-b-codfw with addressing 10.192.22.0/24, and arrives at labtestneutron2001 through the NIC chain eth1 -> eth1.2120 -> br-external -> tap666fcda7-04 -> qg666fcad7-04.
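A quick sanity check of the addressing quoted above, using Python's standard `ipaddress` module. The variable names are mine; the prefixes and addresses are the ones given in this task.

```python
import ipaddress

instance_supernet = ipaddress.ip_network("172.16.128.0/21")  # static route on the core router
transport_net = ipaddress.ip_network("10.192.22.0/24")       # cloud-transport1-b-codfw, vlan 2120
routing_source_ip = ipaddress.ip_address("172.16.129.254")
next_hop = ipaddress.ip_address("10.192.22.4")               # labtestneutron2001

# The overload IP sits inside the routed supernet, so return traffic follows the
# static route to the next hop instead of relying on ARP for 172.16.129.254.
covered = routing_source_ip in instance_supernet
on_transport = next_hop in transport_net

print("routing_source_ip covered by static route:", covered)
print("next hop is on the transport vlan:", on_transport)
```

Both checks come out true, which matches the explanation: replies addressed to 172.16.129.254 reach labtestneutron2001 through the /21 route whether or not the address is bound to an interface.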
Ok @chasemp it took me a while to internalize all this stuff, but I believe I see it clearly now.
With the data and knowledge I have, I ACK the setup.
Open questions you mentioned:
- should we have the routing_source_ip assigned?
- should we get rid of dmz_cidr exclusions?
BTW I made several updates to our diagrams, fixing some network ranges which were wrong, adding vlan numbers and also adding a tab with a new diagram I find interesting for myself.