
Consider moving Toolforge flannel backend to host-gw
Closed, Declined (Public)

Description

When we initially set up flannel for Toolforge, we picked the vxlan backend because it was the only realistic option: the udp backend was just too slow, and nothing else was available.

There's now host-gw, which I think is a great fit, and much better than vxlan for our use case. It requires that all the nodes be on the same layer 2 segment, which I think we are (and will continue to be even with Neutron?). It just adds route table entries on the hosts, so it is much easier to debug and understand than whatever magic VXLAN does. It's also a bit faster.
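To make "it just adds route table entries" concrete, here is a minimal Python sketch of the idea, not flannel's actual code. The function name `host_gw_routes` and its arguments are hypothetical, and "the node IP falls inside a directly attached subnet" is only used as a rough stand-in for layer 2 adjacency.

```python
import ipaddress


def host_gw_routes(local_node, leases, local_networks):
    """Sketch of the routes a host-gw style backend needs on one node.

    local_node:     IP of the node we are computing routes for (hypothetical input)
    leases:         dict mapping node IP -> pod subnet leased to that node
    local_networks: CIDRs directly attached to local_node
    """
    attached = [ipaddress.ip_network(n) for n in local_networks]
    routes = []
    # Sort by pod subnet so the output order is stable.
    for node_ip, subnet in sorted(
        leases.items(), key=lambda kv: ipaddress.ip_network(kv[1])
    ):
        if node_ip == local_node:
            continue  # the local pod subnet is reached via the local bridge
        addr = ipaddress.ip_address(node_ip)
        # Assumption: being in an attached subnet approximates "same layer 2".
        if not any(addr in net for net in attached):
            raise ValueError(
                f"{node_ip} is not directly reachable; host-gw needs layer 2 adjacency"
            )
        routes.append(f"{subnet} via {node_ip}")
    return routes
```

Every remote pod subnet becomes one plain next-hop route, and a node that cannot be reached directly is rejected, which is exactly why a Neutron topology change could rule this backend out.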

ip route show from a host in a 6 node k8s cluster:

[root@doc-10 INSTALL]# ip route show
default via 10.1.1.46 dev em1  proto static  metric 100 
default via 128.32.67.1 dev em2  proto static  metric 101 
10.1.1.0/24 dev em1  proto kernel  scope link  src 10.1.1.230  metric 100 
10.244.0.0/24 dev cni0  proto kernel  scope link  src 10.244.0.1 
10.244.0.0/16 dev flannel.1 
10.244.1.0/24 via 10.1.1.231 dev em1 
10.244.2.0/24 via 10.1.1.232 dev em1 
10.244.3.0/24 via 10.1.1.233 dev em1 
10.244.4.0/24 via 10.1.1.234 dev em1 
10.244.5.0/24 via 10.1.1.235 dev em1 
10.244.6.0/24 via 128.32.67.62 dev em2 
128.32.67.0/24 dev em2  proto kernel  scope link  src 128.32.67.159  metric 100 
172.30.6.0/24 dev docker0  proto kernel  scope link  src 172.30.6.1 
blackhole 192.168.229.0/26  proto bird

You can see the /24s in the middle: those are the flannel pod subnets, one per node, each routed via that node's own IP. This has made my life *much* easier!
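Since the routes are plain text, they are also easy to inspect with a script. The following is a rough Python sketch that pulls the per-node flannel routes out of `ip route show` output. The function name `flannel_routes` is hypothetical, and the default pod network `10.244.0.0/16` is assumed from the output above.

```python
import ipaddress

# A few lines copied from the route table above.
SAMPLE = """\
default via 10.1.1.46 dev em1  proto static  metric 100
10.244.0.0/16 dev flannel.1
10.244.1.0/24 via 10.1.1.231 dev em1
10.244.6.0/24 via 128.32.67.62 dev em2
blackhole 192.168.229.0/26  proto bird
"""


def flannel_routes(route_output, pod_network="10.244.0.0/16"):
    """Map each routed pod subnet to the node IP used as its next hop."""
    pod_net = ipaddress.ip_network(pod_network)
    found = {}
    for line in route_output.splitlines():
        fields = line.split()
        # Only "<dest> via <gateway> ..." lines are next-hop routes.
        if len(fields) < 3 or fields[1] != "via":
            continue
        try:
            dest = ipaddress.ip_network(fields[0])
        except ValueError:
            continue  # e.g. "default", which is not a CIDR
        if dest.subnet_of(pod_net):
            found[str(dest)] = fields[2]
    return found


if __name__ == "__main__":
    for subnet, gateway in flannel_routes(SAMPLE).items():
        print(f"{subnet} is reached via node {gateway}")
```

Doing the same with vxlan would mean digging through the forwarding database and neighbour entries instead of a single routing table.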

If Neutron changes the network topology in a way that makes this impossible, then it doesn't make sense to move to this. Otherwise, it's a great benefit IMO.

Note that this isn't applicable to prod k8s at all.

Event Timeline

Restricted Application added a subscriber: Aklapper. Jun 6 2017, 2:08 AM
bd808 triaged this task as Medium priority. Jun 6 2017, 3:14 PM
bd808 closed this task as Declined. Tue, Mar 3, 6:19 PM
bd808 added a subscriber: bd808.

We went with Calico, HAProxy, and nginx-ingress to create the bare metal ingress and east-west overlay network in the 2020 Kubernetes cluster. See https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Networking_and_ingress