Neutron implements virtual routers using Linux network namespaces.
Failover between routers can be improved by setting some nf_conntrack sysctl parameters, specifically:
- nf_conntrack_tcp_loose (lets conntrack pick up already established connections; enabled by default)
- nf_conntrack_tcp_be_liberal (disables most TCP checks, improving the chances that conntrack treats an already established connection as valid)
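To check what a given namespace currently has, the two parameters can be read from /proc/sys. This is a minimal Python sketch, not a definitive implementation: the helper names and the proc_sys argument are hypothetical, and it assumes both parameters live under net.netfilter, as they do on current kernels.

```python
from pathlib import Path

# The two parameters from the list above, with the value we want (1 = enabled).
CONNTRACK_SYSCTLS = {
    "net.netfilter.nf_conntrack_tcp_loose": "1",
    "net.netfilter.nf_conntrack_tcp_be_liberal": "1",
}


def sysctl_path(name, proc_sys="/proc/sys"):
    """Map a dotted sysctl name to its file under proc_sys (hypothetical helper)."""
    return Path(proc_sys) / name.replace(".", "/")


def read_sysctl(name, proc_sys="/proc/sys"):
    """Return the current value as a string, or None if the file does not exist
    (for example when the nf_conntrack module is not loaded)."""
    try:
        return sysctl_path(name, proc_sys).read_text().strip()
    except FileNotFoundError:
        return None
```

Calling read_sysctl(name) for each key of CONNTRACK_SYSCTLS from inside a namespace shows whether that namespace already has the wanted values.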
I ran some experiments, and these settings directly affect whether NATed TCP connections recover on the other Neutron router after a failover.
However, because the Neutron virtual router runs in a netns, it does not share the host system's sysctl configuration. We need a way to ensure these sysctl parameters are always set up correctly in the auto-generated Neutron netns.
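Setting a parameter inside one namespace can be done by running sysctl through ip netns exec. The sketch below only illustrates that step: the function names are hypothetical, the injectable run argument exists just to allow dry runs, and actually executing the commands requires root.

```python
import subprocess


def netns_sysctl_cmd(netns, name, value):
    """Build the command: ip netns exec <netns> sysctl -w <name>=<value>"""
    return ["ip", "netns", "exec", netns, "sysctl", "-w", f"{name}={value}"]


def apply_in_netns(netns, settings, run=subprocess.run):
    """Apply every name/value pair in `settings` inside the namespace `netns`.

    `run` defaults to subprocess.run; check=True makes a failing sysctl call
    raise instead of being silently ignored.
    """
    for name, value in settings.items():
        run(netns_sysctl_cmd(netns, name, value), check=True)
```

For example, apply_in_netns("qrouter-<router-id>", {"net.netfilter.nf_conntrack_tcp_be_liberal": "1"}) would enable the liberal mode in that router's namespace, where the namespace name is a placeholder.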
My plan is to introduce a small Python daemon that watches netns creation events and sets the proper sysctl parameters, hopefully before traffic starts flowing.
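A rough sketch of such a watcher is shown below. It rests on several assumptions that are not part of the plan above: it polls a directory instead of reacting to real creation events (so there is a short window before a new namespace is configured), it assumes named namespaces appear as entries under /var/run/netns, and it assumes router namespaces use the qrouter- prefix. The function names, the configure callback, and max_polls are hypothetical; an event-based mechanism such as inotify could replace the polling loop.

```python
import os
import time


def list_namespaces(netns_dir="/var/run/netns", prefix="qrouter-"):
    """Return the set of namespace names in netns_dir that start with prefix."""
    try:
        names = os.listdir(netns_dir)
    except FileNotFoundError:
        # The directory does not exist until the first named netns is created.
        names = []
    return {n for n in names if n.startswith(prefix)}


def watch(configure, netns_dir="/var/run/netns", prefix="qrouter-",
          interval=0.5, max_polls=None):
    """Call configure(name) once for each namespace that newly appears.

    max_polls=None loops forever; a finite value is useful for testing.
    """
    seen = set()
    polls = 0
    while max_polls is None or polls < max_polls:
        current = list_namespaces(netns_dir, prefix)
        for name in sorted(current - seen):
            configure(name)
        # Forget deleted namespaces so a re-created one is configured again.
        seen = current
        polls += 1
        time.sleep(interval)
```

Here configure would be whatever function sets the two conntrack parameters inside the given namespace. An exception raised by configure stops the loop, so a real daemon would need to log and retry instead.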