
Forward our neutron-l3-agent routing hacks to Openstack Newton
Closed, Resolved · Public

Description

We're currently running a substantially customized router_info.py -- we need the same behavior in Newton, so that diff needs to be re-applied to the Newton files. Unfortunately there has been a lot of change to the base file between Mitaka and Newton, so it's not an easy patch.

Context for this is in T167357 and T168580; the actual changes for Mitaka are https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/422474/4/modules/openstack/files/mitaka/neutron/l3/router_info.py and https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/425252/

Event Timeline

Change 538707 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] neutron-l3-agent: forward our routing hacks to Newton

https://gerrit.wikimedia.org/r/538707

It looks like the purpose of the routing hacks is to disable NAT between each source:destination pair listed in dmz_cidr (see the sketch after the config below):

profile::openstack::eqiad1::neutron::dmz_cidr:
 - 172.16.0.0/21:91.198.174.0/24
 - 172.16.0.0/21:198.35.26.0/23
 - 172.16.0.0/21:10.0.0.0/8
 - 172.16.0.0/21:208.80.152.0/22
 - 172.16.0.0/21:103.102.166.0/24
 - 172.16.0.0/21:172.16.0.0/21
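
For illustration, here is a minimal Python sketch (not the actual WMF patch) of how each dmz_cidr source:destination pair could be turned into a NAT-exemption rule. The function name is made up; the rule shape matches the ('POSTROUTING', '-s ... -d ... -j ACCEPT') tuples the patched router_info.py logs further down in this task:

# Illustrative sketch only, not the real router_info.py code.
def dmz_cidr_to_rules(dmz_cidr):
    """Turn 'src1:dst1,src2:dst2,...' into NAT-exemption rule tuples."""
    rules = []
    for pair in dmz_cidr.split(','):
        src, dst = pair.split(':')
        # ACCEPT in the NAT POSTROUTING chain means "stop here, do not SNAT"
        rules.append(('POSTROUTING', '-s %s -d %s -j ACCEPT' % (src, dst)))
    return rules

for chain, rule in dmz_cidr_to_rules(
        '172.16.0.0/21:91.198.174.0/24,172.16.0.0/21:10.0.0.0/8'):
    print(chain, rule)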

Typically I'd use a provider network for this, but since we're using flat networks I wonder if we could avoid the local neutron customizations and use address scopes.

When a router connects to an external network with matching address scopes, network traffic routes between them without network address translation (NAT).

https://docs.openstack.org/newton/networking-guide/config-address-scopes.html

Some additional context here related to the customizations: https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Neutron#Neutron_customizations

Apart from the dmz_cidr, we also use this for our general egress setup (routing_source_ip). All network traffic exiting Neutron to the internet uses the very same source NAT IP address (more info here: https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Neutron#Ingress_&_Egress).
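
For the egress side, a similarly rough sketch of what routing_source_ip implies: a single SNAT rule on the external gateway device, so every flow that was not already exempted by a dmz_cidr ACCEPT rule leaves with the same source address. The device name and IP below are placeholders, not the real deployment values:

# Illustrative sketch only; 203.0.113.1 and the qg- device name are placeholders.
def egress_snat_rule(routing_source_ip, external_device):
    return ('POSTROUTING',
            '-o %s -j SNAT --to-source %s' % (external_device, routing_source_ip))

print(egress_snat_rule('203.0.113.1', 'qg-abcdef01-23'))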

I agree 100% with replacing this customization with a native mechanism from Neutron, instead of patching it.

If someone (arturo?) knows how to reliably forward the patch, I'm inclined to go with that for now and then refactor to other mechanisms post-upgrade, just in the interest of changing fewer things at a time. I don't know if forwarding the patch is something we can do without introducing unknowns though.

If the patch doesn't apply cleanly to OpenStack Newton, I don't feel exactly confident rewriting it, especially because we probably won't allocate a one-month period for me to study and test a rewrite of the patch.

Let me add another point of view: the patch was introduced just to ease the nova-network to Neutron migration, which was coupled with the liberty -> mitaka and ubuntu -> debian upgrades.
There is no point in continuing to support the hack going forward. It may be easier for us to migrate to a native Neutron solution rather than carrying our own obscure code.

I think this is technical debt waving at us from the puppet repo heh :-P perhaps we should be kind to our future selves and clean it up!

Change 538707 abandoned by Andrew Bogott:
neutron-l3-agent: forward our routing hacks to Newton

Reason:
Probably we'll avoid this hack entirely, as per T233665

https://gerrit.wikimedia.org/r/538707

Subnet pools are a base requirement of address scopes, and unfortunately we cannot update existing subnets to use subnet pools on our version of OpenStack.

The code for on-boarding subnet pools into existing subnets was added in the Stein release https://github.com/openstack/neutron/commit/d5896025b78cfc1e4783c0c5231b9b39266ebf58

The other question I have about this hack is... do we need it? The issue I ran into that caused me to notice it was the dns-recursors not recognizing the source IPs, but that's quite easy for me to work around.

I suspect that for database and/or NFS access we need to know actual source IPs though... is that right?

(btw, I bet that exposing these IPs to production hosts breaks a lot of our 'future ideal model' rules, so if we can move towards total-outside-world-natting it might be considered forward progress in some circles)

> I suspect that for database and/or NFS access we need to know actual source IPs though... is that right?

Yeah, we'll need to know the IPs / subnet for NFS clients.

Change 539853 had a related patch set uploaded (by Arturo Borrero Gonzalez; owner: Arturo Borrero Gonzalez):
[operations/puppet@production] openstack: neutron: newton: refresh original files

https://gerrit.wikimedia.org/r/539853

Change 539856 had a related patch set uploaded (by Arturo Borrero Gonzalez; owner: Arturo Borrero Gonzalez):
[operations/puppet@production] openstack: newton: neutron: introduce WMF patches

https://gerrit.wikimedia.org/r/539856

Change 539853 merged by Andrew Bogott:
[operations/puppet@production] openstack: neutron: newton: refresh original files

https://gerrit.wikimedia.org/r/539853

Change 539856 merged by Andrew Bogott:
[operations/puppet@production] openstack: newton: neutron: introduce WMF patches

https://gerrit.wikimedia.org/r/539856

I noticed this issue because of the source IP that was detected by the dns recursor on cloudservices2002-dev. After this change, things are slightly worse:

# telnet 208.80.153.78 53
Trying 208.80.153.78...
telnet: Unable to connect to remote host: No route to host

I haven't done comprehensive restarts, though.

Things look better after that last patch.

I'm reviewing the neutron setup in codfw1dev following this checklist: https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Deployment_sanity_checklist#General_networking_&_neutron
I noticed something weird (possibly a bug somewhere) when checking the dmz_cidr setting and how that translates to iptables rules.

This is the setting as we have it in puppet. It should generate an iptables rule for each pair, in order to skip the NAT.

user@cloudnet2003-dev:~ $ sudo grep dmz_cidr /etc/neutron/l3_agent.ini 
dmz_cidr = 172.16.128.0/24:10.0.0.0/8,172.16.128.0/24:208.80.155.0/22

Checking the resulting iptables rules, something is wrong for the last pair:

user@cloudnet2002-dev:~ $ sudo ip netns exec qrouter-5712e22e-134a-40d3-a75a-1c9b441717ad iptables-save -c | grep 172.16.128.0 | grep 10.0.0.0
[0:0] -A neutron-l3-agent-POSTROUTING -s 172.16.128.0/24 -d 10.0.0.0/8 -j ACCEPT

user@cloudnet2002-dev:~ $ sudo ip netns exec qrouter-5712e22e-134a-40d3-a75a-1c9b441717ad iptables-save -c | grep 172.16.128.0 | grep 208.80.155.0/22
[..nothing..]

user@cloudnet2002-dev:~ $ sudo ip netns exec qrouter-5712e22e-134a-40d3-a75a-1c9b441717ad iptables-save -c | grep 172.16.128.0
[0:0] -A neutron-l3-agent-POSTROUTING -s 172.16.128.0/24 -d 10.0.0.0/8 -j ACCEPT
[0:0] -A neutron-l3-agent-POSTROUTING -s 172.16.128.0/24 -d 208.80.152.0/22 -j ACCEPT

It seems the rule for 208.80.155.0/22 is installed as 208.80.152.0/22. I tried wiping the ruleset, restarting the agents, etc. The rules are re-created the same way, i.e., this is happening reproducibly.

Mentioned in SAL (#wikimedia-cloud) [2019-10-02T11:08:42Z] <arturo> codfw1dev rebooting cloudnet2002-dev and cloudnet2003-dev for testing T233665

I put the l3-agent in debug mode and it reports creating the rule correctly; there is no mention of the wrong rule in the logs:

aborrero@cloudnet2002-dev:~ $ sudo grep 208.80.155.0 /var/log/neutron/neutron-l3-agent.log 
2019-10-02 11:36:33.993 8177 DEBUG oslo_service.service [req-0b8539eb-04e3-4ff6-8add-45549b86dc08 - - - - -] dmz_cidr                       = 172.16.128.0/24:10.0.0.0/8,172.16.128.0/24:208.80.155.0/22 log_opt_values /usr/lib/python2.7/dist-packages/oslo_config/cfg.py:2618
2019-10-02 11:36:34.082 8177 DEBUG neutron.wsgi [-] dmz_cidr                       = 172.16.128.0/24:10.0.0.0/8,172.16.128.0/24:208.80.155.0/22 log_opt_values /usr/lib/python2.7/dist-packages/oslo_config/cfg.py:2618
2019-10-02 11:36:44.225 8177 DEBUG neutron.agent.l3.router_info [-] foo self.external_gateway_nat_fip_rules rule: ('POSTROUTING', '-s 172.16.128.0/24 -d 208.80.155.0/22 -j ACCEPT') _add_snat_rules /usr/lib/python2.7/dist-packages/neutron/agent/l3/router_info.py:828
aborrero@cloudnet2002-dev:~ $ sudo grep 208.80.152.0 /var/log/neutron/neutron-l3-agent.log 
[..nothing..] 

Honestly, I have no idea where that 208.80.152.0 is coming from.

Mentioned in SAL (#wikimedia-cloud) [2019-10-02T12:47:50Z] <arturo> codfw1dev deleting all VMs in the deployment for mangling the network config for testing T233665

Mentioned in SAL (#wikimedia-cloud) [2019-10-02T12:49:20Z] <arturo> codfw1dev delete all floating ip allocations in the deployment for mangling the network config for testing T233665

> Honestly, I have no idea where that 208.80.152.0 is coming from.

208.80.152.0/22 is the network address for the CIDR 208.80.155.0/22. Technically they describe the same network, but 208.80.152.0/22 is the canonical form and easier to read; iptables normalizes the prefix to its network address when installing the rule, which is why the ruleset shows 208.80.152.0/22.

Address:   208.80.155.0         11010000.01010000.100110 11.00000000
Netmask:   255.255.252.0 = 22   11111111.11111111.111111 00.00000000
Wildcard:  0.0.3.255            00000000.00000000.000000 11.11111111
=>
Network:   208.80.152.0/22      11010000.01010000.100110 00.00000000
HostMin:   208.80.152.1         11010000.01010000.100110 00.00000001
HostMax:   208.80.155.254       11010000.01010000.100110 11.11111110
Broadcast: 208.80.155.255       11010000.01010000.100110 11.11111111
Hosts/Net: 1022                  Class C
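
The same normalization is easy to double-check with Python's ipaddress module (just a verification snippet, unrelated to the patch itself):

import ipaddress

# 208.80.155.0/22 has host bits set; the canonical network address is
# 208.80.152.0/22, which is what iptables prints back in the ruleset.
net = ipaddress.ip_network('208.80.155.0/22', strict=False)
print(net)                                          # 208.80.152.0/22
print(ipaddress.ip_address('208.80.155.0') in net)  # True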

Thanks @JHedden. This helped me review every bit of the setup. I cleaned up a lot of stuff and documented a bunch of missing bits about the network setup for codfw1dev in https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Neutron

The neutron DB was migrated from a previous deployment, and there are plenty of leftovers, so I think I will delete all the objects (router, subnet, net) and create them again with current names etc., just for sanity.

Mentioned in SAL (#wikimedia-cloud) [2019-10-02T15:23:14Z] <arturo> codfw1dev renaming net/subnet objects to a more modern naming scheme T233665

I created a VM and tried both dmz_cidr and routing_source_ip:

Trying routing_source_ip:

root@aborrero-test:~# ping 8.8.8.8
PING 8.8.8.8 (8.8.8.8) 56(84) bytes of data.
[..packet is filtered in core routers, but it still traverses the neutron router..]

aborrero@cloudnet2002-dev:~ $ sudo conntrack -E -j
    [NEW] icmp     1 30 src=172.16.128.20 dst=8.8.8.8 type=8 code=0 id=1312 [UNREPLIED] src=8.8.8.8 dst=172.16.128.20 type=0 code=0 id=1312
    [NEW] icmp     1 30 src=172.16.129.254 dst=8.8.8.8 type=8 code=0 id=1312 [UNREPLIED] src=8.8.8.8 dst=172.16.129.254 type=0 code=0 id=1312

[..packet gets the routing_source_ip NAT applied ..]

Trying dmz_cidr:

root@aborrero-test:~# ping cloudcontrol2001-dev.wikimedia.org
PING cloudcontrol2001-dev.wikimedia.org (208.80.153.59) 56(84) bytes of data.
64 bytes from cloudcontrol2001-dev.wikimedia.org (208.80.153.59): icmp_seq=13 ttl=62 time=1.36 ms
[..]

root@cloudcontrol2001-dev:~# tcpdump -i any icmp
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on any, link-type LINUX_SLL (Linux cooked), capture size 262144 bytes
15:54:40.727444 IP 172.16.128.20 > cloudcontrol2001-dev.wikimedia.org: ICMP echo request, id 1304, seq 13, length 64
15:54:40.727482 IP cloudcontrol2001-dev.wikimedia.org > 172.16.128.20: ICMP echo reply, id 1304, seq 13, length 64

[..destination server sees the VM address..]