
Modify Bird module to allow source IP to be passed to template
Closed, Resolved, Public

Description

To progress the work on the cloudlb POC we need to adjust the existing puppet bird module so that the source IP for the BGP session on the server side is not fixed to the "ipaddress" puppet fact.

The cloudlb hosts have a separate logical/vlan interface for connection to the cloud-vrf/realm:

cmooney@cloudlb2001-dev:~$ ip -br addr show | grep eno1
eno1             UP             10.192.20.8/24 2620:0:860:118:10:192:20:8/64 fe80::32e1:71ff:fe60:e97c/64 
vlan2151@eno1    UP             172.20.5.2/24 fe80::32e1:71ff:fe60:e97c/64

They need to use the IP address from the vlan interface to establish the BGP session, but it's getting set to the other one:

cmooney@cloudlb2001-dev:~$ sudo grep local /etc/bird/bird.conf
    local 10.192.20.8 as 64605;
    local 10.192.20.8 as 64605;

Creating this task to track progress here. The obvious solution is to allow passing of the specific IP to use, and default to $facts['ipaddress'] if that's not present. Happy to discuss, however. If that's cumbersome there could be other approaches we could try.
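
As a rough illustration of the proposal above (the real change would live in the Puppet bird module, not in Python), the fallback logic could look something like the sketch below. The function name `bgp_source_ip` and the shape of the `facts` dictionary are assumptions made up for this example; only the `ipaddress` fact name and the two addresses come from this task.

```python
import ipaddress


def bgp_source_ip(facts, override=None):
    """Pick the BGP source IP: the explicit override if given, else the fact.

    `facts` is a hypothetical dict standing in for Puppet's $facts hash.
    """
    chosen = override if override is not None else facts["ipaddress"]
    # Fail early on anything that is not a valid IP address, since bird's
    # `local` parameter only accepts an IP.
    return str(ipaddress.ip_address(chosen))


facts = {"ipaddress": "10.192.20.8"}
print(bgp_source_ip(facts))                # falls back to the fact
print(bgp_source_ip(facts, "172.20.5.2"))  # explicit vlan2151 address
```

Hosts that never pass an override keep their current behaviour, so only the cloudlb hosts would need the extra parameter.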

Event Timeline

cmooney triaged this task as Medium priority. May 2 2023, 11:05 AM
cmooney created this task.
Restricted Application added a subscriber: Aklapper.

I plan to do T335759: cloud-private subnet: introduce new domain, after which we can specify the FQDN to use for the bird config. Otherwise I think we would need to hardcode the raw IP address somewhere in the puppet tree, which I'd rather avoid.

@aborrero hey.

Yeah I can understand why having to hardcode the IPs in the puppet tree is not a great option.

Unfortunately I don't think we can use an FQDN in the bird.conf file though; the local parameter in the Bird BGP protocol only allows an IP to be specified. It's optional, however, so another approach might be to simply not specify the local source IP in the config. The system will default to using the correct one as it's the only IP on that interface.

@ayounsi not sure on your thoughts on that? I appreciate we need to specify the local IP on hosts other than cloudlb that use this, so perhaps a new boolean option to include it or not makes sense?
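
To make the "optional local IP" idea concrete, here is a minimal sketch of how a template could emit the stanza with or without the address. It is Python rather than the module's actual template, `render_bgp_protocol` is a hypothetical helper, and the stanza is cut down to the `local` and `neighbor` lines seen in this task; it relies on Bird accepting `local as <asn>;` without an address.

```python
def render_bgp_protocol(neighbor, local_asn, local_ip=None):
    """Render a simplified bird `protocol bgp` stanza.

    When local_ip is None the address is left out, so bird picks the
    source address itself.
    """
    lines = ["protocol bgp {"]
    if local_ip is not None:
        lines.append(f"    local {local_ip} as {local_asn};")
    else:
        lines.append(f"    local as {local_asn};")
    lines.append(f"    neighbor {neighbor} external;")
    lines.append("}")
    return "\n".join(lines)


# Host that pins the source address, as the existing hosts do today.
print(render_bgp_protocol("208.80.153.192", 64605, local_ip="10.192.20.8"))
# Host that leaves the choice of source address to bird.
print(render_bgp_protocol("208.80.153.192", 64605))
```

A single optional parameter covers both cases here, which may be simpler than a separate boolean alongside the IP.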

yeah I'm thinking about doing something like resolve_ipv4(whateverserver.codfw.hw.wikimedia.cloud), so basically let puppet resolve the address. The bird config will still get the raw IPv4 but we code the FQDN in puppet.
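
For illustration only, the resolve-at-compile-time idea can be sketched in Python as below. This is not the Puppet function mentioned above; `resolve_bgp_source` and its injectable `lookup` parameter are assumptions for this example, and the hostname in the usage is a placeholder.

```python
import ipaddress
import socket


def resolve_bgp_source(host, lookup=socket.getaddrinfo):
    """Return an IPv4 address for `host`, which may be an FQDN or a raw IP.

    `lookup` defaults to the system resolver and can be swapped out,
    for example in tests.
    """
    try:
        # A raw IPv4 address passes straight through, with no DNS query.
        return str(ipaddress.IPv4Address(host))
    except ipaddress.AddressValueError:
        pass
    infos = lookup(host, None, socket.AF_INET, socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, (address, port)).
    addresses = sorted({info[4][0] for info in infos})
    if not addresses:
        raise LookupError(f"no IPv4 address found for {host}")
    # Lexical sort: only there to make the choice deterministic when a
    # name has several A records.
    return addresses[0]


def fake_lookup(host, port, family, socktype):
    # Stand-in resolver so the example runs without real DNS.
    return [(family, socktype, 6, "", ("172.20.5.2", 0))]


print(resolve_bgp_source("cloudlb.example", lookup=fake_lookup))
print(resolve_bgp_source("10.192.20.8"))
```

Either way the bird config only ever receives the raw IPv4 address, so the FQDN stays a puppet-side detail.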

@aborrero yep that should work.

Potentially a race condition there if we drive the DNS from Netbox, which will only get the IPs created after the first puppet run (and import of puppet networking facts back to Netbox, after which we could set the DNS on them and run the dns cookbook).

Change 914317 had a related patch set uploaded (by Arturo Borrero Gonzalez; author: Arturo Borrero Gonzalez):

[operations/puppet@production] cloud_private_subnet: support using FQDNs instead of hardcoded IP addresses

https://gerrit.wikimedia.org/r/914317

The obvious solution is to allow passing of the specific IP to use, and default to $facts['ipaddress'] if that's not present

That sounds like a good approach to me.

I prefer to stick to IPs at the Bird classes level so as not to add an extra dependency. If DNS resolution is preferred, it must be implemented earlier (e.g. in the cloudlb class).

Potentially a race condition there if we drive the DNS from Netbox, which will only get the IPs created after the first puppet run (and import of puppet networking facts back to Netbox, after which we could set the DNS on them and run the dns cookbook).

If I understand correctly, CR914317 is a clever way to get IP/network data from Netbox to the host using DNS, making Netbox the source of truth for those additional interfaces, which is great.
The I/F approach for this is to use the mechanism defined by @jbond in T229397: Puppet: get data (row, rack, site, and other information) from Netbox. The future-proof approach here would be to extend it to support this use case, but time constraints might not make that possible in the short run. In that case going through DNS seems OK.

Regardless of the method, I don't think there is a risk of race condition as (long as) the source of truth is Netbox and the IPs are configured in Netbox ahead of the first Puppet run. By that time the "puppetDB import script" will be a NOOP.

Change 914772 had a related patch set uploaded (by Arturo Borrero Gonzalez; author: Arturo Borrero Gonzalez):

[operations/puppet@production] cloudlb: fix BGP IP address

https://gerrit.wikimedia.org/r/914772

Change 914317 merged by Arturo Borrero Gonzalez:

[operations/puppet@production] cloud_private_subnet: support using FQDNs instead of hardcoded IP addresses

https://gerrit.wikimedia.org/r/914317

Change 915476 had a related patch set uploaded (by Arturo Borrero Gonzalez; author: Arturo Borrero Gonzalez):

[operations/puppet@production] cloudlb: fix BGP IP address

https://gerrit.wikimedia.org/r/915476

Change 914772 merged by Ayounsi:

[operations/puppet@production] profile::bird::anycast: allow setting the BGP IP address from the profile

https://gerrit.wikimedia.org/r/914772

Change 915476 merged by Arturo Borrero Gonzalez:

[operations/puppet@production] cloudlb: fix BGP IP address

https://gerrit.wikimedia.org/r/915476

aborrero claimed this task.

all 3 cloudlb servers should be using the right IP for BGP now:

aborrero@cloudlb2001-dev:~ $ sudo run-puppet-agent
Info: Using configured environment 'production'
Info: Retrieving pluginfacts
Info: Retrieving plugin
Info: Retrieving locales
Info: Loading facts
Info: Caching catalog for cloudlb2001-dev.codfw.wmnet
Info: Applying configuration version '(3b9dc78a77) Arturo Borrero Gonzalez - cloudlb: fix BGP IP address'
Notice: /Stage[main]/Bird/File[/etc/bird/bird.conf]/content: 
--- /etc/bird/bird.conf	2023-04-05 14:36:19.921350945 +0000
+++ /tmp/puppet-file20230505-639486-1tbo8mo	2023-05-05 09:45:24.517297618 +0000
@@ -1,6 +1,6 @@
 include "/etc/bird/anycast-prefixes.conf";
 
-router id 10.192.20.8;
+router id 172.20.5.2;
 
 protocol direct {
     interface "*";
@@ -50,7 +50,7 @@
         import none;
         export filter vips_filter;
     };
-    local 10.192.20.8 as 64605;
+    local 172.20.5.2 as 64605;
     neighbor 208.80.153.192 external;
 }
 protocol bgp {
@@ -60,7 +60,7 @@
         import none;
         export filter vips_filter;
     };
-    local 10.192.20.8 as 64605;
+    local 172.20.5.2 as 64605;
     neighbor 208.80.153.193 external;
 }
 

Info: Computing checksum on file /etc/bird/bird.conf
Info: /Stage[main]/Bird/File[/etc/bird/bird.conf]: Filebucketed /etc/bird/bird.conf to puppet with sum 3741c5e02d980142eaf8ea4ea5a4c24a
Notice: /Stage[main]/Bird/File[/etc/bird/bird.conf]/content: content changed '{md5}3741c5e02d980142eaf8ea4ea5a4c24a' to '{md5}922773087a89af71d9bf008a54698446'
Info: /Stage[main]/Bird/File[/etc/bird/bird.conf]: Scheduling refresh of Systemd::Service[bird]
Info: Systemd::Service[bird]: Scheduling refresh of Service[bird]
Info: Systemd::Service[bird]: Scheduling refresh of Systemd::Unit[bird]
Info: Systemd::Unit[bird]: Scheduling refresh of Exec[systemd daemon-reload for bird.service (bird)]
Notice: /Stage[main]/Bird/Systemd::Service[bird]/Systemd::Unit[bird]/Exec[systemd daemon-reload for bird.service (bird)]: Triggered 'refresh' from 1 event
Info: /Stage[main]/Bird/Systemd::Service[bird]/Systemd::Unit[bird]/Exec[systemd daemon-reload for bird.service (bird)]: Scheduling refresh of Service[bird]
Notice: /Stage[main]/Bird/Systemd::Service[bird]/Service[bird]: Triggered 'refresh' from 2 events
Notice: Applied catalog in 23.24 seconds