Consider moving Tools Forge flannel backend to host-gw
Open, Normal, Public

Description

When we initially set up flannel for Tools Forge, we picked vxlan because it was the only realistic option: the udp backend was just too slow, and nothing else was available.

There's now host-gw, which I think is a great fit and much better than vxlan for our use case. It requires that all the nodes be on the same layer 2 network, which I think we are (and will continue to be even with Neutron, I think?). It just adds route table entries on the hosts, so it is much easier to debug and understand than whatever magic VXLAN does. It's also a bit faster.
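To make the "just adds route table entries" point concrete, here is a rough Python sketch of the routes a host-gw style backend would end up adding on one node. It is an illustration only, not flannel's actual code: the function name host_gw_routes, the leases mapping, and the sample addresses are all hypothetical, and "same layer 2 network" is approximated by "next hop is inside the local host subnet".

```python
import ipaddress


def host_gw_routes(local_ip, local_network, leases):
    """Return the routes a host-gw style backend would add on one node.

    leases (hypothetical structure) maps each node's host IP to the pod
    subnet assigned to that node.
    """
    network = ipaddress.ip_network(local_network)
    routes = []
    for node_ip, pod_subnet in sorted(leases.items()):
        if node_ip == local_ip:
            # The local pod subnet is reached via the local bridge, not a gateway.
            continue
        if ipaddress.ip_address(node_ip) not in network:
            # Stand-in for the layer 2 requirement: the next hop must be
            # directly reachable, with no router in between.
            raise ValueError(f"{node_ip} is not directly reachable on {network}")
        routes.append(f"{ipaddress.ip_network(pod_subnet)} via {node_ip}")
    return routes


# Hypothetical three-node cluster, viewed from the first node.
leases = {
    "10.1.1.230": "10.244.0.0/24",
    "10.1.1.231": "10.244.1.0/24",
    "10.1.1.232": "10.244.2.0/24",
}
routes = host_gw_routes("10.1.1.230", "10.1.1.0/24", leases)
for route in routes:
    print(route)
```

The ValueError branch is the case that matters for the Neutron question: if another node's host IP is not directly reachable, a plain gateway route to it cannot work.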

Output of ip route show on a host in a 6-node k8s cluster:

[root@doc-10 INSTALL]# ip route show
default via 10.1.1.46 dev em1  proto static  metric 100 
default via 128.32.67.1 dev em2  proto static  metric 101 
10.1.1.0/24 dev em1  proto kernel  scope link  src 10.1.1.230  metric 100 
10.244.0.0/24 dev cni0  proto kernel  scope link  src 10.244.0.1 
10.244.0.0/16 dev flannel.1 
10.244.1.0/24 via 10.1.1.231 dev em1 
10.244.2.0/24 via 10.1.1.232 dev em1 
10.244.3.0/24 via 10.1.1.233 dev em1 
10.244.4.0/24 via 10.1.1.234 dev em1 
10.244.5.0/24 via 10.1.1.235 dev em1 
10.244.6.0/24 via 128.32.67.62 dev em2 
128.32.67.0/24 dev em2  proto kernel  scope link  src 128.32.67.159  metric 100 
172.30.6.0/24 dev docker0  proto kernel  scope link  src 172.30.6.1 
blackhole 192.168.229.0/26  proto bird

You can see the /24s in the middle - each one is a flannel subnet, routed via the host IP of the node that owns it. This has made my life *much* easier!
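Since the routes are plain kernel routes, output like the above is easy to inspect with a script when debugging. The following is a small sketch, assuming the 10.244.0.0/16 pod network seen in the output; the function name pod_routes is made up for illustration, and the sample text is a shortened copy of the listing.

```python
import ipaddress


def pod_routes(ip_route_output, pod_network="10.244.0.0/16"):
    """Map each remote pod subnet to the host IP used as its next hop."""
    overlay = ipaddress.ip_network(pod_network)
    result = {}
    for line in ip_route_output.splitlines():
        fields = line.split()
        # Gateway routes look like: <subnet> via <next hop> dev <interface>
        if len(fields) < 3 or fields[1] != "via":
            continue
        try:
            subnet = ipaddress.ip_network(fields[0])
        except ValueError:
            continue  # skips lines such as "default via ..."
        if subnet.subnet_of(overlay):
            result[str(subnet)] = fields[2]
    return result


sample = """\
default via 10.1.1.46 dev em1  proto static  metric 100
10.244.0.0/24 dev cni0  proto kernel  scope link  src 10.244.0.1
10.244.1.0/24 via 10.1.1.231 dev em1
10.244.6.0/24 via 128.32.67.62 dev em2
"""
routes = pod_routes(sample)
print(routes)
```

On a live host, the same function could be fed the real output of ip route show, for example captured with subprocess.run.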

If Neutron changes the network topology in a way that makes this impossible, then it doesn't make sense to move to this. Otherwise, it's a great benefit IMO.

Note that this isn't applicable to prod k8s at all.