The parent task for this is T347054.
As discussed above, the idea is to replace the current static routes for ns[01] on the core routers and announce the respective IPs via bird instead, using the existing, already established BGP session on the DNS hosts. This addresses several problems with static routes: they have to be updated manually, they must be kept in sync whenever host IPs change (commissioning and decommissioning), and changes to them are not peer reviewed. It also paves the way for automatic management of BGP announcements and easier depooling during maintenance of the DNS hosts. ns2 is anycasted and is not part of this discussion; no change is required there.
The current static routes (using ns0 on cr1-eqiad as an example) look like:
/* ns0 */
route 208.80.154.238/32 {
    next-hop [ 208.80.154.6 208.80.154.153 208.80.154.77 ];
    readvertise;
    no-resolve;
}
The hosts above are dns100[4-6].
Automation
To automate this, we need to announce these IPs via bird instead. Currently, bird.conf includes:
include "/etc/bird/anycast-prefixes.conf";
So far we have only used bird to announce the anycast addresses, not unicast ones such as 208.80.154.238/32 (ns0), so all our Puppetization and tooling is built around that expectation.
root@dns1004:~# cat /etc/bird/anycast-prefixes.conf
# Generated 2023-10-02 15:31:42.704581 by anycast-healthchecker (pid=1815037)
# 203.0.113.1/32 is a dummy IP Prefix. It should NOT be used and REMOVED from the constant.
define ACAST_PS_ADVERTISE =
[
203.0.113.1/32,
198.35.27.27/32,
10.3.0.1/32,
10.3.0.2/32
];
Looking at bird.conf again:
function match_route()
{
    return net ~ ACAST_PS_ADVERTISE;
}

filter vips_filter {
    if ( net.len = 32 && net !~ 203.0.113.1/32 && match_route() ) then {
        accept;
    }
    else {
        reject;
    }
}
In theory, adding the ns[01] IPs to ACAST_PS_ADVERTISE above (automatically, via profile::bird::advertise_vips and the associated healthcheck changes) should be enough. Note that we have to customize the Puppetization so that the ns0 IP is only added on the eqiad DNS hosts and the ns1 IP only on the codfw DNS hosts, as we cannot and should not announce them from everywhere. Otherwise it should be fairly straightforward: we add the VIP the same way we do for the anycast IPs, restricted to the specific sites. Does that sound right, or am I missing something?
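To make the intended behaviour concrete, here is a rough Python model of the filter logic with per-site additions. It is only a sketch for discussion: `UNICAST_BY_SITE`, `advertised_prefixes` and the Python `vips_filter` are hypothetical names and not part of our Puppet code or of bird, and the real list would still be generated by anycast-healthchecker.

```python
import ipaddress

# Dummy prefix that anycast-healthchecker keeps in the list; never advertised.
DUMMY = ipaddress.ip_network("203.0.113.1/32")

# Prefixes currently in ACAST_PS_ADVERTISE (taken from the file shown earlier).
ANYCAST = ["198.35.27.27/32", "10.3.0.1/32", "10.3.0.2/32"]

# Hypothetical per-site unicast additions: ns0 only from eqiad, ns1 only from codfw.
UNICAST_BY_SITE = {
    "eqiad": ["208.80.154.238/32"],  # ns0
    "codfw": ["208.80.153.231/32"],  # ns1
}


def advertised_prefixes(site):
    """Build the prefix set a host in `site` would have in ACAST_PS_ADVERTISE."""
    entries = [str(DUMMY)] + ANYCAST + UNICAST_BY_SITE.get(site, [])
    return {ipaddress.ip_network(p) for p in entries}


def vips_filter(net, site):
    """Model of vips_filter: accept /32s present in the list, except the dummy."""
    net = ipaddress.ip_network(net)
    return (
        net.prefixlen == 32
        and net != DUMMY
        and net in advertised_prefixes(site)
    )


if __name__ == "__main__":
    for site in ("eqiad", "codfw"):
        for prefix in ("208.80.154.238/32", "208.80.153.231/32", "203.0.113.1/32"):
            print(site, prefix, vips_filter(prefix, site))
```

In this model an eqiad host accepts the ns0 /32 but rejects the ns1 /32, and the reverse holds for codfw, which is the site restriction we need from the Puppetization.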
There is one more change required, in hieradata/common.yaml:
authdns_addrs:
  ns0-v4:
    address: '208.80.154.238'
  ns1-v4:
    address: '208.80.153.231'
  ns2-v4:
    address: '198.35.27.27'
    skip_loopback: true # bird::anycast takes care of this one
We will need to set skip_loopback on ns0-v4 and ns1-v4 as well, since bird will create the loopback IPs. (Or, since they already exist, can we skip that? I will confirm when we work on it.)
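As a quick illustration of what that hiera change would mean, the Python sketch below compares the current and proposed data. It assumes that skip_loopback: true simply means "the authdns side does not configure this loopback"; that reading comes from the key name and the comment above and has not been checked against the Puppet code. `loopback_addresses` is a hypothetical helper, not an existing function.

```python
# Current hieradata/common.yaml values, expressed as a Python dict.
current = {
    "ns0-v4": {"address": "208.80.154.238"},
    "ns1-v4": {"address": "208.80.153.231"},
    "ns2-v4": {"address": "198.35.27.27", "skip_loopback": True},
}

# Proposed values: skip_loopback also set on ns0-v4 and ns1-v4 (assumption).
proposed = {
    name: {**entry, "skip_loopback": True} for name, entry in current.items()
}


def loopback_addresses(addrs):
    """Addresses whose loopback would still be configured outside bird."""
    return [
        entry["address"]
        for entry in addrs.values()
        if not entry.get("skip_loopback", False)
    ]


if __name__ == "__main__":
    print("current: ", loopback_addresses(current))
    print("proposed:", loopback_addresses(proposed))
```

Under that assumption, today the ns0 and ns1 loopbacks are handled outside bird, and after the change none of the three would be.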
I am not sure whether we anticipate any additional unicast announcements via bird. If not, the above will work and we can simply customize ACAST_PS_ADVERTISE per site (eqiad/codfw). The cleaner approach would probably be to change all the tooling to distinguish between the advertisement of unicast and anycast addresses, but I am going to leave that for discussion.
Adding @ayounsi and @cmooney for comments and discussion; thanks!