
IPv6 for cloud-realm services
Closed, Resolved · Public

Description

The cloud-realm services should be exposed over IPv6. Here "cloud-realm services" means services in the 185.15.56.160/28 and 172.20.255.0/24 networks, i.e. DNS and the things behind cloudlb.

Event Timeline

Restricted Application added a subscriber: Aklapper.
aborrero added a project: User-aborrero.
aborrero moved this task from Backlog to Radar/observer on the User-aborrero board.

Ok so the IPv6 routing is in place in both eqiad and codfw. I just added the gw IP on the switch for cloud-private-b1-codfw a moment ago - 2a02:ec80:a100:205::1/64.

The following networks are allocated for host routes announced in BGP:

| Site | Type | IPv4 Range | IPv6 Range |
| ----- | ----- | ----- | ----- |
| eqiad | Public service VIPs | 185.15.56.160/28 | 2a02:ec80:a000:4000::/64 |
| eqiad | Private service VIPs | 172.20.255.0/24 | 2a02:ec80:a000:2ff::/64 |
| codfw | Public service VIPs | 185.15.57.24/29 | 2a02:ec80:a100:4000::/64 |
| codfw | Private service VIPs | 172.20.254.0/24 | 2a02:ec80:a100:2ff::/64 |
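
As a quick sanity check of the allocations above, the following sketch uses Python's standard `ipaddress` module to map an address to its site and VIP type. The ranges are copied from this task; the `VIP_RANGES` table and the `classify_vip` helper are hypothetical names used only for illustration.

```python
import ipaddress

# Allocated service VIP ranges, copied from the allocations in this task.
VIP_RANGES = {
    ("eqiad", "public"): ["185.15.56.160/28", "2a02:ec80:a000:4000::/64"],
    ("eqiad", "private"): ["172.20.255.0/24", "2a02:ec80:a000:2ff::/64"],
    ("codfw", "public"): ["185.15.57.24/29", "2a02:ec80:a100:4000::/64"],
    ("codfw", "private"): ["172.20.254.0/24", "2a02:ec80:a100:2ff::/64"],
}


def classify_vip(address):
    """Return (site, type) for a VIP, or None if it is outside all ranges."""
    ip = ipaddress.ip_address(address)
    for key, prefixes in VIP_RANGES.items():
        for prefix in prefixes:
            network = ipaddress.ip_network(prefix)
            # Only compare addresses and networks of the same IP version.
            if ip.version == network.version and ip in network:
                return key
    return None


if __name__ == "__main__":
    print(classify_vip("2a02:ec80:a100:4000::1"))
    print(classify_vip("172.20.255.1"))
```

A check like this could be used to confirm that a host route a server intends to announce falls inside one of the ranges the switches are expected to accept.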

We will need to create an IPv6 equivalent of the following BGP group on each cloudsw to enable hosts to announce routes from these ranges:

cmooney@cloudsw1-b1-codfw> show configuration routing-instances cloud protocols bgp group cloud_host | display set 
set routing-instances cloud protocols bgp group cloud_host import cloud_server_bgp
set routing-instances cloud protocols bgp group cloud_host family inet unicast prefix-limit maximum 100
set routing-instances cloud protocols bgp group cloud_host export NONE
set routing-instances cloud protocols bgp group cloud_host peer-as 64605
set routing-instances cloud protocols bgp group cloud_host bfd-liveness-detection minimum-interval 300
set routing-instances cloud protocols bgp group cloud_host neighbor 172.20.5.2 description cloudlb2001-dev
set routing-instances cloud protocols bgp group cloud_host neighbor 172.20.5.8 description cloudservices2004-dev
set routing-instances cloud protocols bgp group cloud_host neighbor 172.20.5.9 description cloudservices2005-dev
set routing-instances cloud protocols bgp group cloud_host neighbor 172.20.5.3 description cloudlb2002-dev
set routing-instances cloud protocols bgp group cloud_host neighbor 172.20.5.4 description cloudlb2003-dev

Should be fairly simple to do. Let's get the host side ready first to avoid generating BGP alerts for down neighbors though.
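
To illustrate what an IPv6 equivalent of that group might look like, here is a minimal sketch that renders Junos `set` lines mirroring the v4 group with `family inet6`. The group name `cloud_host6` and the policy name `cloud_server_bgp6` are assumptions, not the names actually deployed, and the only neighbor listed is the cloudlb2002-dev address that shows up later in this task.

```python
import ipaddress

# Hypothetical names: the real IPv6 group and import policy may differ.
GROUP = "cloud_host6"
IMPORT_POLICY = "cloud_server_bgp6"

# Only cloudlb2002-dev's v6 address appears in this task; the other
# neighbors would need their real cloud-private IPv6 addresses.
NEIGHBORS = {"2a02:ec80:a100:205::3": "cloudlb2002-dev"}


def render_group(group, import_policy, neighbors, peer_as=64605):
    """Render Junos `set` lines mirroring the v4 cloud_host group for inet6."""
    base = f"set routing-instances cloud protocols bgp group {group}"
    lines = [
        f"{base} import {import_policy}",
        f"{base} family inet6 unicast prefix-limit maximum 100",
        f"{base} export NONE",
        f"{base} peer-as {peer_as}",
        f"{base} bfd-liveness-detection minimum-interval 300",
    ]
    for address, description in neighbors.items():
        # Raises ValueError if the neighbor is not a valid IPv6 address.
        ipaddress.IPv6Address(address)
        lines.append(f"{base} neighbor {address} description {description}")
    return lines


if __name__ == "__main__":
    print("\n".join(render_group(GROUP, IMPORT_POLICY, NEIGHBORS)))
```

The values for `peer-as`, the prefix limit and the BFD interval are simply carried over from the v4 group shown above; whether the v6 group should use the same values is a decision for whoever writes the real configuration.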

Change #1134694 had a related patch set uploaded (by Majavah; author: Majavah):

[operations/puppet@production] network: Add v6 cloud-private addresses

https://gerrit.wikimedia.org/r/1134694

@taavi also bear in mind that on the eqiad side we need an 'aggregate' route similar to this:

cmooney@cloudservices1005:~$ ip route get fibmatch 172.20.4.1
172.20.0.0/16 via 172.20.2.1 dev vlan1152

We also need "ip rules" added for any VIPs we announce, similar to these:

cmooney@cloudservices1005:~$ ip -br -4 addr show dev lo
lo               UNKNOWN        127.0.0.1/8 172.20.255.1/32 185.15.56.162/32 
cmooney@cloudservices1005:~$ 
cmooney@cloudservices1005:~$ ip rule show 
0:	from all lookup local
32764:	from 185.15.56.162 lookup cloud-private
32765:	from 172.20.255.1 lookup cloud-private
32766:	from all lookup main
32767:	from all lookup default

And finally another route table added, named 'cloud-private', which those rules direct lookups to:

cmooney@cloudservices1005:~$ ip route show table cloud-private 
default via 172.20.2.1 dev vlan1152

For the Puppetization, I guess we can just copy & paste from the v4 stuff.
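
To make the source-based routing above concrete, here is a simplified model of how the rules select a routing table from the packet's source address. It is only a sketch: the `local` rule at priority 0 is left out, and real Linux policy routing continues to the next rule when the selected table has no matching route, which this model does not do. The `select_table` helper is a hypothetical name.

```python
import ipaddress

# Rules as shown by `ip rule show` on cloudservices1005. The `local` rule
# (priority 0) is omitted to keep the model simple.
RULES = [
    (32764, "185.15.56.162/32", "cloud-private"),
    (32765, "172.20.255.1/32", "cloud-private"),
    (32766, "0.0.0.0/0", "main"),
    (32767, "0.0.0.0/0", "default"),
]


def select_table(source, rules=RULES):
    """Return the table of the lowest-priority-number rule matching source."""
    src = ipaddress.ip_address(source)
    for _priority, selector, table in sorted(rules):
        if src in ipaddress.ip_network(selector):
            return table
    return None


if __name__ == "__main__":
    for address in ("185.15.56.162", "172.20.255.1", "10.64.0.1"):
        print(address, "->", select_table(address))
```

Replies sourced from a VIP therefore use the `cloud-private` table and its default route, while everything else keeps using `main`. The v6 setup needs the same kind of rules for the IPv6 VIPs.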

Change #1134694 merged by Majavah:

[operations/puppet@production] network: Add v6 cloud-private addresses

https://gerrit.wikimedia.org/r/1134694

Change #1134699 had a related patch set uploaded (by Majavah; author: Majavah):

[operations/puppet@production] P:bird: Allow enabling IPv6 without enabling all services on it

https://gerrit.wikimedia.org/r/1134699

Change #1134700 had a related patch set uploaded (by Majavah; author: Majavah):

[operations/puppet@production] hieradata: Announce OpenStack API over v6 from cloudlb2002-dev

https://gerrit.wikimedia.org/r/1134700

What is the expected outcome here in terms of BGP? I think it is better to have a BGP session over the IPv6 addresses of the hosts and switches to announce the IPv6 IPs. I should know what this puppetization will create; I guess it's a rinse-and-repeat of what we do for Anycast in prod? If it works, great, but we may need to think about a different profile in that puppet class or something if it doesn't do what we need.

Change #1134699 merged by Majavah:

[operations/puppet@production] P:bird: Allow enabling IPv6 without enabling all services on it

https://gerrit.wikimedia.org/r/1134699

> What is the expected outcome here in terms of BGP? I think it is better to have a BGP session over the IPv6 addresses of the hosts and switches to announce the IPv6 IPs. I should know what this puppetization will create; I guess it's a rinse-and-repeat of what we do for Anycast in prod? If it works, great, but we may need to think about a different profile in that puppet class or something if it doesn't do what we need.

Once https://gerrit.wikimedia.org/r/c/operations/puppet/+/1134700 is merged, cloudlb2002-dev will try to establish a separate BGP session with cloudsw1-b1-codfw over v6 to announce the v6 OpenStack API VIP.

Change 1134699 is mostly meant for eqiad: it allows us to bring the BGP session up and start announcing the first VIPs (the OpenStack API ones) without immediately having everything over v6 as well, since those hosts also announce VIPs for wiki replicas and such.

Change #1134700 merged by Majavah:

[operations/puppet@production] hieradata: Announce OpenStack API over v6 from cloudlb2002-dev

https://gerrit.wikimedia.org/r/1134700

Change #1135018 had a related patch set uploaded (by Majavah; author: Majavah):

[operations/puppet@production] bird: Ensure anycast_healthchecker service is restarted before bird

https://gerrit.wikimedia.org/r/1135018

Change #1135023 had a related patch set uploaded (by Majavah; author: Majavah):

[operations/puppet@production] P:wmcs::cloud_private_subnet: Set correct v6 BGP local address

https://gerrit.wikimedia.org/r/1135023

Change #1135023 merged by Majavah:

[operations/puppet@production] P:wmcs::cloud_private_subnet: Set correct v6 BGP local address

https://gerrit.wikimedia.org/r/1135023

Change #1135018 merged by Majavah:

[operations/puppet@production] bird: Ensure anycast_healthchecker service is restarted before bird

https://gerrit.wikimedia.org/r/1135018

Change #1135455 had a related patch set uploaded (by Majavah; author: Majavah):

[operations/puppet@production] bird: Only specify interface for link-local peerings

https://gerrit.wikimedia.org/r/1135455

Change #1135455 merged by Majavah:

[operations/puppet@production] bird: Only specify interface for link-local peerings

https://gerrit.wikimedia.org/r/1135455

Change #1138850 had a related patch set uploaded (by Cathal Mooney; author: Cathal Mooney):

[operations/homer/public@master] WMCS: Add policy for IPv6 ranges assigned for server BGP announcement

https://gerrit.wikimedia.org/r/1138850

Change #1138850 merged by jenkins-bot:

[operations/homer/public@master] WMCS: Add policy for IPv6 ranges assigned for server BGP announcement

https://gerrit.wikimedia.org/r/1138850

@taavi we are almost there with this one. Right now the BGP session is up to the cloudsw and it's getting the route you're sending:

cmooney@cloudsw1-b1-codfw> show route receive-protocol bgp 2a02:ec80:a100:205::3 table cloud.inet6.0                                  

cloud.inet6.0: 18 destinations, 20 routes (18 active, 0 holddown, 0 hidden)
  Prefix		  Nexthop	       MED     Lclpref    AS path
  2a02:ec80:a100:4000::1/128
*                         2a02:ec80:a100:205::3                   64605 I

Traffic from elsewhere gets to the cloudsw and is sent to cloudlb2002:

cmooney@cloudlb2002-dev:~$ sudo tcpdump -i vlan2151 -l -p -nn host 2a02:ec80:a100:4000::1
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on vlan2151, link-type EN10MB (Ethernet), snapshot length 262144 bytes
17:39:29.949023 IP6 2001:bb6:8b70:9e00::187 > 2a02:ec80:a100:4000::1: ICMP6, echo request, id 34079, seq 8, length 64
17:39:30.972807 IP6 2001:bb6:8b70:9e00::187 > 2a02:ec80:a100:4000::1: ICMP6, echo request, id 34079, seq 9, length 64

The problem is the host is trying to send the return traffic via the production realm using the default route, so responses aren't getting back. What we need are some 'ip rules' to tell the system to do routing lookups in the cloud-private table for traffic from those ranges, similar to this for v4:

cmooney@cloudlb2002-dev:~$ ip rule show 
0:	from all lookup local
32764:	from 185.15.57.24 lookup cloud-private
32765:	from 185.15.57.24/29 lookup cloud-private
32766:	from all lookup main
32767:	from all lookup default

There is a default IPv6 route in the cloud-private table so that bit is ok; we just need the rule(s).

cmooney@cloudlb2002-dev:~$ ip -6 route show table cloud-private 
default via 2a02:ec80:a100:205::1 dev vlan2151 metric 1024 pref medium

At minimum we need a rule for 2a02:ec80:a100:4000::/64, though I wonder if we shouldn't just have 2a02:ec80:a100::/48 there.
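
The trade-off between the /64 and the /48 can be checked with a short sketch. It verifies that the wider prefix covers both codfw VIP ranges from this task and renders the corresponding `ip -6 rule` command. The helper names are hypothetical, and how the rule actually gets installed on the hosts is up to the Puppet code, not this snippet.

```python
import ipaddress

# Ranges taken from this task.
CODFW_PUBLIC_VIPS = ipaddress.ip_network("2a02:ec80:a100:4000::/64")
CODFW_PRIVATE_VIPS = ipaddress.ip_network("2a02:ec80:a100:2ff::/64")
CANDIDATE_AGGREGATE = ipaddress.ip_network("2a02:ec80:a100::/48")


def covers_all(aggregate, ranges):
    """True if every range is contained in the aggregate prefix."""
    return all(r.subnet_of(aggregate) for r in ranges)


def rule_command(prefix, table="cloud-private"):
    """Render an iproute2 command adding a source-based rule for prefix."""
    return f"ip -6 rule add from {prefix} lookup {table}"


if __name__ == "__main__":
    print(covers_all(CANDIDATE_AGGREGATE, [CODFW_PUBLIC_VIPS, CODFW_PRIVATE_VIPS]))
    print(rule_command(CODFW_PUBLIC_VIPS))
    print(rule_command(CANDIDATE_AGGREGATE))
```

A single /48 rule would match traffic sourced from either VIP range, at the cost of also matching any other source address inside 2a02:ec80:a100::/48 that the host might hold.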

Thank you! The IP rule was already in /etc/network/interfaces (/e/n/i); the puppetization is just done in a way that required a reboot to apply it.

Now the next problem is that HAProxy is only listening on IPv4. That's an easy fix however.

Change #1138960 had a related patch set uploaded (by Majavah; author: Majavah):

[operations/puppet@production] cloudlb: Bind on IPv6 too when no address has been specified

https://gerrit.wikimedia.org/r/1138960

Change #1138960 merged by Majavah:

[operations/puppet@production] cloudlb: Bind on IPv6 too when no address has been specified

https://gerrit.wikimedia.org/r/1138960

Looks like it works now:

taavi@runko:~ $ curl --connect-to ::[2a02:ec80:a100:4000::1] https://keystone.openstack.codfw1dev.wikimediacloud.org/v3
{"version": {"id": "v3.14", "status": "stable", "updated": "2020-04-07T00:00:00Z", "links": [{"rel": "self", "href": "https://keystone.openstack.codfw1dev.wikimediacloud.org/v3/"}], "media-types": [{"base": "application/json", "type": "application/vnd.openstack.identity-v3+json"}]}}

I've also reserved the IP address in Netbox, but I'm holding off until all cloud-private connected hosts have v6 connectivity before adding the DNS record, in order to ensure those hosts don't use their cloud-hosts v6 connectivity to try to reach that VIP.

Change #1139033 had a related patch set uploaded (by Cathal Mooney; author: Cathal Mooney):

[operations/dns@master] Add include statement for WMCS service VIP reverse IPv6

https://gerrit.wikimedia.org/r/1139033

> I've also reserved the IP address in Netbox, but I'm holding off until all cloud-private connected hosts have v6 connectivity before adding the DNS record,

Ok yeah that makes sense. Yeah adding the DNS record is the only way we can really introduce it. We'll need to merge the above gerrit patch when we add the dns_name in Netbox.

> in order to ensure those hosts don't use their cloud-hosts v6 connectivity to try to reach that VIP.

Yeah that would happen for sure if we added it earlier. I guess we should go ahead and assign the IP addresses for all the cloud-private hosts in codfw? How are they typically configured on the end device? Is it from puppet?

>> in order to ensure those hosts don't use their cloud-hosts v6 connectivity to try to reach that VIP.
>
> Yeah that would happen for sure if we added it earlier. I guess we should go ahead and assign the IP addresses for all the cloud-private hosts in codfw? How are they typically configured on the end device? Is it from puppet?

Yes. The profile::wmcs::cloud_private_subnet profile looks up the cloud-private IP addresses of that host from DNS and assigns them to the VLAN interface if the records are found. Unfortunately that DNS-based lookup makes it rather difficult to do this gracefully. I see two options:

  1. Assign IP addresses and DNS names all in one go and try to have it all applied quickly enough that there's no time for issues to come up.
  2. Implement an alternative way to tell Puppet about the per-host address (Hiera most likely) and use that to add the addresses beforehand.
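
Option 2 could look roughly like the following lookup order, sketched here in Python rather than Puppet. Everything in it is an assumption for illustration: `hiera` is a plain dict standing in for per-host Hiera data, `dns_lookup` is an injected callable standing in for the existing DNS-based lookup, and the example address is made up.

```python
import ipaddress


def cloud_private_v6(hostname, hiera, dns_lookup):
    """Return the host's cloud-private IPv6 address, or None if unknown.

    An explicit per-host entry (standing in for Hiera) wins; otherwise the
    existing DNS-based lookup is used, so new hosts keep working unchanged.
    """
    explicit = hiera.get(hostname)
    if explicit is not None:
        # Normalise and validate; raises ValueError on a malformed address.
        return str(ipaddress.IPv6Address(explicit))
    return dns_lookup(hostname)


if __name__ == "__main__":
    # Hypothetical data: the address below is only an example value.
    hiera = {"cloudlb2003-dev": "2a02:ec80:a100:205::4"}

    def no_aaaa_yet(name):
        """Pretend DNS has no AAAA record published yet."""
        return None

    print(cloud_private_v6("cloudlb2003-dev", hiera, no_aaaa_yet))
    print(cloud_private_v6("cloudservices2004-dev", hiera, no_aaaa_yet))
```

With an override like this, addresses could be configured on the interfaces before any AAAA record is published, which separates the two steps that are currently coupled.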

> Yes. The profile::wmcs::cloud_private_subnet profile looks up the cloud-private IP addresses of that host from DNS and assigns them to the VLAN interface if the records are found. Unfortunately that DNS-based lookup makes it rather difficult to do this gracefully. I see two options:
>
> 1. Assign IP addresses and DNS names all in one go and try to have it all applied quickly enough that there's no time for issues to come up.

Is there an issue with a given host having an IPv6 address on that interface? I would have thought that until an AAAA record is returned in DNS the fact there is an address on their interface won't cause a problem?

So we could add all the IPs / DNS names, wait until the hosts all have the address (plus static route) and then publish the new DNS records?

Maybe I'm missing something though.

>> Yes. The profile::wmcs::cloud_private_subnet profile looks up the cloud-private IP addresses of that host from DNS and assigns them to the VLAN interface if the records are found. Unfortunately that DNS-based lookup makes it rather difficult to do this gracefully. I see two options:
>>
>> 1. Assign IP addresses and DNS names all in one go and try to have it all applied quickly enough that there's no time for issues to come up.
>
> Is there an issue with a given host having an IPv6 address on that interface? I would have thought that until an AAAA record is returned in DNS the fact there is an address on their interface won't cause a problem?

Probably not. We have a bit of a pattern for provisioning firewall rules based on IP addresses looked up on DNS. So if something doesn't implement happy eyeballs properly then we could have some issues when some hosts have AAAA records and others don't have them yet.

> So we could add all the IPs / DNS names, wait until the hosts all have the address (plus static route) and then publish the new DNS records?

As I said, right now the IP address is added to the interface by Puppet after we publish the DNS record. So we need to do a bit more work if we want to separate those two steps.

> Probably not. We have a bit of a pattern for provisioning firewall rules based on IP addresses looked up on DNS. So if something doesn't implement happy eyeballs properly then we could have some issues when some hosts have AAAA records and others don't have them yet.

Ah ok. The bit I hadn't accounted for is the fact the hosts talk to each other already over that interface, so obviously they will try to use the AAAA record when available.

Could we add the IPs manually (using cumin or something) before the DNS goes live, so that all hosts have the IPs set before the DNS records exist?

I think this chicken-and-egg problem with the automation should only affect us now, when adding addresses to already-working hosts; for new hosts I think the current mechanism should be ok.
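
The "addresses first, DNS second" ordering discussed above amounts to a readiness gate before publishing the AAAA records. A minimal sketch of that gate follows. It assumes the per-host address lists have already been collected somehow (for example with cumin, which is not shown), and the `hosts_missing_v6` helper and the sample data are hypothetical. The default prefix is the cloud-private-b1-codfw subnet whose gateway is mentioned earlier in this task.

```python
import ipaddress


def hosts_missing_v6(assigned, expected_prefix="2a02:ec80:a100:205::/64"):
    """Return hosts that have no address inside the expected IPv6 prefix."""
    network = ipaddress.ip_network(expected_prefix)
    missing = []
    for host, addresses in sorted(assigned.items()):
        # An IPv4 address is never "in" an IPv6 network, so it is ignored.
        if not any(ipaddress.ip_address(a) in network for a in addresses):
            missing.append(host)
    return missing


if __name__ == "__main__":
    # Hypothetical inventory of addresses configured on each host.
    assigned = {
        "cloudlb2002-dev": ["172.20.5.3", "2a02:ec80:a100:205::3"],
        "cloudservices2004-dev": ["172.20.5.8"],
    }
    missing = hosts_missing_v6(assigned)
    if missing:
        print("Not ready to publish AAAA records; missing v6 on:", missing)
    else:
        print("All hosts have a cloud-private v6 address")
```

Only once the returned list is empty would it be safe, under this approach, to publish the DNS records.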

Change #1154040 had a related patch set uploaded (by Majavah; author: Majavah):

[operations/puppet@production] hieradata: Enable BGP on IPv6 for all cloudlb2* hosts

https://gerrit.wikimedia.org/r/1154040

Change #1154040 merged by Majavah:

[operations/puppet@production] hieradata: Enable BGP on IPv6 for all cloudlb2* hosts

https://gerrit.wikimedia.org/r/1154040

Change #1154050 had a related patch set uploaded (by Majavah; author: Majavah):

[operations/puppet@production] hieradata: Add missing cloud-realm public VIP v6 ranges

https://gerrit.wikimedia.org/r/1154050

Change #1154050 merged by Majavah:

[operations/puppet@production] hieradata: Add missing cloud-realm public VIP v6 ranges

https://gerrit.wikimedia.org/r/1154050

Mentioned in SAL (#wikimedia-cloud) [2025-06-09T13:14:14Z] <taavi> add AAAA record to openstack.codfw1dev.wikimediacloud.org T379282

Change #1139033 merged by Majavah:

[operations/dns@master] Add include statement for WMCS service VIP reverse IPv6

https://gerrit.wikimedia.org/r/1139033

Closing. The infrastructure for these is in place and I've filed separate parent tasks for rolling out v6 support for individual services.

Change #1155181 had a related patch set uploaded (by Majavah; author: Majavah):

[operations/puppet@production] hieradata: Announce eqiad1 OpenStack API VIP on IPv6

https://gerrit.wikimedia.org/r/1155181

Change #1155181 merged by Majavah:

[operations/puppet@production] hieradata: Announce eqiad1 OpenStack API VIP on IPv6

https://gerrit.wikimedia.org/r/1155181

Change #1155194 had a related patch set uploaded (by Majavah; author: Majavah):

[operations/dns@master] Add include for WMCS eqiad1 service VIP reverse IPv6

https://gerrit.wikimedia.org/r/1155194

Change #1155194 merged by Majavah:

[operations/dns@master] Add include for WMCS eqiad1 service VIP reverse IPv6

https://gerrit.wikimedia.org/r/1155194

Change #1155209 had a related patch set uploaded (by Majavah; author: Majavah):

[operations/puppet@production] hieradata: cloudlb: Add IPv6 listeners for wiki replica endpoints

https://gerrit.wikimedia.org/r/1155209

Change #1155256 had a related patch set uploaded (by Majavah; author: Majavah):

[operations/dns@master] Add include for WMCS eqiad1 private service VIP reverse IPv6

https://gerrit.wikimedia.org/r/1155256

Change #1155256 merged by Majavah:

[operations/dns@master] Add include for WMCS eqiad1 private service VIP reverse IPv6

https://gerrit.wikimedia.org/r/1155256

Change #1164983 had a related patch set uploaded (by Majavah; author: Majavah):

[operations/dns@master] Add include for WMCS codfw private service VIP reverse IPv6

https://gerrit.wikimedia.org/r/1164983

Change #1164983 merged by Majavah:

[operations/dns@master] Add include for WMCS codfw private service VIP reverse IPv6

https://gerrit.wikimedia.org/r/1164983