I discovered an issue yesterday when attempting to migrate the IP gateway for analytics1-d-eqiad from our core routers to the Nokia leaf switches in row D.
Anycast GW
The switches in those rows are running EVPN/VXLAN, and use the 'Anycast GW' function. This means that every switch in the row has the same IP configured on its irb0.1023 interface, which is what hosts on the vlan use as their default gateway.
To make this work the Anycast GW IRB interface needs to use the same MAC address for the anycast IP on every switch. That allows for things like VM mobility, and ensures any ARP/ND response from any switch will be consistent.
Problem
This gets slightly more complicated when IPv6 router advertisements (RAs) are used. RAs can carry an optional "Source link-layer address" option. If it is not present, a host that receives an RA uses normal neighbor discovery to find the MAC of the IP that sent the RA. For normal neighbor discovery the Nokia switches do what they should, and the anycast MAC is returned every time:
set / interface irb0 subinterface 1023 anycast-gw anycast-gw-mac 12:00:00:00:10:23
root@sretest1006:~# ip -6 route show default
default via fe80::1000:ff:fe00:1023 dev ens2f0np0 proto ra metric 1024 expires 598sec hoplimit 64 pref medium
root@sretest1006:~# while true; do ndisc6 fe80::1000:ff:fe00:1023 ens2f0np0 | grep Target; sleep 1; done
Target link-layer address: 12:00:00:00:10:23
Target link-layer address: 12:00:00:00:10:23
Target link-layer address: 12:00:00:00:10:23
Target link-layer address: 12:00:00:00:10:23
Target link-layer address: 12:00:00:00:10:23
Target link-layer address: 12:00:00:00:10:23
Target link-layer address: 12:00:00:00:10:23
Target link-layer address: 12:00:00:00:10:23
Target link-layer address: 12:00:00:00:10:23
Target link-layer address: 12:00:00:00:10:23
The issue is that the MAC in the "Source link-layer address" option of the RAs they send is different:
ICMPv6 Option (Source link-layer address : a8:e5:ec:78:4f:3c)
Type: Source link-layer address (1)
Length: 1 (8 bytes)
Link-layer address: a8:e5:ec:78:4f:3c (a8:e5:ec:78:4f:3c)
When a host receives one of these RAs it will update its ND cache with the MAC address from this field (normal behaviour as per RFC4861 6.3.4). Given there are 8 switches with this vlan configured, all sending RAs, that means hosts are constantly updating the MAC for their gateway IP with different values:
cmooney@an-druid1005:~$ sudo ip -ts monitor | grep fe80::1000:ff:fe00:1023 | grep REACHABLE
[2026-03-19T21:55:16.267418] fe80::1000:ff:fe00:1023 dev eno1 lladdr 12:00:00:00:10:23 router REACHABLE
[2026-03-19T21:55:44.427402] fe80::1000:ff:fe00:1023 dev eno1 lladdr a8:e5:ec:78:4f:3c router REACHABLE
[2026-03-19T21:56:13.355421] fe80::1000:ff:fe00:1023 dev eno1 lladdr 12:00:00:00:10:23 router REACHABLE
[2026-03-19T21:56:37.419366] fe80::1000:ff:fe00:1023 dev eno1 lladdr a8:e5:ec:78:69:3c router REACHABLE
[2026-03-19T21:57:03.275407] fe80::1000:ff:fe00:1023 dev eno1 lladdr a8:e5:ec:78:57:3c router REACHABLE
[2026-03-19T21:57:12.235372] fe80::1000:ff:fe00:1023 dev eno1 lladdr 12:00:00:00:10:23 router REACHABLE
[2026-03-19T21:57:34.251409] fe80::1000:ff:fe00:1023 dev eno1 lladdr a8:e5:ec:78:59:3c router REACHABLE
[2026-03-19T21:58:02.667361] fe80::1000:ff:fe00:1023 dev eno1 lladdr a8:e5:ec:78:73:3c router REACHABLE
[2026-03-19T21:58:17.771407] fe80::1000:ff:fe00:1023 dev eno1 lladdr a8:e5:ec:78:59:3c router REACHABLE
[2026-03-19T21:58:41.323341] fe80::1000:ff:fe00:1023 dev eno1 lladdr 12:00:00:00:10:23 router REACHABLE
[2026-03-19T21:58:56.939420] fe80::1000:ff:fe00:1023 dev eno1 lladdr a8:e5:ec:78:69:3c router REACHABLE
[2026-03-19T21:59:11.275415] fe80::1000:ff:fe00:1023 dev eno1 lladdr a8:e5:ec:78:41:3c router REACHABLE
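To put a number on the flapping, the neighbour-cache events captured above can be tallied per MAC. The following is only a rough sketch: the function name tally_lladdrs and the regular expression are my own, and it assumes the "lladdr <mac>" layout seen in the ip monitor output here.

```python
import re
from collections import Counter

# Neighbour-cache events copied from the `ip -ts monitor` capture above.
SAMPLE = """\
[2026-03-19T21:55:16.267418] fe80::1000:ff:fe00:1023 dev eno1 lladdr 12:00:00:00:10:23 router REACHABLE
[2026-03-19T21:55:44.427402] fe80::1000:ff:fe00:1023 dev eno1 lladdr a8:e5:ec:78:4f:3c router REACHABLE
[2026-03-19T21:56:13.355421] fe80::1000:ff:fe00:1023 dev eno1 lladdr 12:00:00:00:10:23 router REACHABLE
[2026-03-19T21:56:37.419366] fe80::1000:ff:fe00:1023 dev eno1 lladdr a8:e5:ec:78:69:3c router REACHABLE
[2026-03-19T21:57:03.275407] fe80::1000:ff:fe00:1023 dev eno1 lladdr a8:e5:ec:78:57:3c router REACHABLE
[2026-03-19T21:57:12.235372] fe80::1000:ff:fe00:1023 dev eno1 lladdr 12:00:00:00:10:23 router REACHABLE
[2026-03-19T21:57:34.251409] fe80::1000:ff:fe00:1023 dev eno1 lladdr a8:e5:ec:78:59:3c router REACHABLE
[2026-03-19T21:58:02.667361] fe80::1000:ff:fe00:1023 dev eno1 lladdr a8:e5:ec:78:73:3c router REACHABLE
[2026-03-19T21:58:17.771407] fe80::1000:ff:fe00:1023 dev eno1 lladdr a8:e5:ec:78:59:3c router REACHABLE
[2026-03-19T21:58:41.323341] fe80::1000:ff:fe00:1023 dev eno1 lladdr 12:00:00:00:10:23 router REACHABLE
[2026-03-19T21:58:56.939420] fe80::1000:ff:fe00:1023 dev eno1 lladdr a8:e5:ec:78:69:3c router REACHABLE
[2026-03-19T21:59:11.275415] fe80::1000:ff:fe00:1023 dev eno1 lladdr a8:e5:ec:78:41:3c router REACHABLE
"""

ANYCAST_MAC = "12:00:00:00:10:23"

# Assumed line layout: "... lladdr <mac> ..." as printed by `ip monitor`.
LLADDR_RE = re.compile(r"lladdr\s+([0-9a-f]{2}(?::[0-9a-f]{2}){5})", re.IGNORECASE)


def tally_lladdrs(text):
    """Count how often each link-layer address appears in the monitor output."""
    counts = Counter()
    for line in text.splitlines():
        match = LLADDR_RE.search(line)
        if match:
            counts[match.group(1).lower()] += 1
    return counts


counts = tally_lladdrs(SAMPLE)
non_anycast = sorted(mac for mac in counts if mac != ANYCAST_MAC)

print(f"events: {sum(counts.values())}")
print(f"anycast MAC seen: {counts[ANYCAST_MAC]} times")
print(f"distinct non-anycast MACs: {len(non_anycast)}")
for mac in non_anycast:
    print(f"  {mac}: {counts[mac]}")
```

In this short capture the anycast MAC accounts for only 4 of the 12 updates; the rest are per-switch MACs taken from RAs.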
Wonky routing
The result is that hosts keep changing which switch they use as their default gateway. It seems to be the top-of-rack switch ~65% of the time, but it keeps moving between them:
cmooney@an-druid1005:~$ mtr -b -w -c 1000 -6 cr1-eqiad.wikimedia.org
Start: 2026-03-19T19:54:02+0000
HOST: an-druid1005 Loss% Snt Last Avg Best Wrst StDev
1.|-- irb0-1023.lsw1-d3-eqiad.eqiad.wmnet (2620:0:861:108::4) 0.1% 1000 0.3 0.2 0.2 1.0 0.1
irb0-1023.lsw1-d8-eqiad.eqiad.wmnet (2620:0:861:108::8)
irb0-1023.lsw1-d7-eqiad.eqiad.wmnet (2620:0:861:108::7)
irb0-1023.lsw1-d4-eqiad.eqiad.wmnet (2620:0:861:108::5)
irb0-1023.lsw1-d6-eqiad.eqiad.wmnet (2620:0:861:108::6)
irb0-1023.lsw1-d2-eqiad.eqiad.wmnet (2620:0:861:108::3)
irb0-1023.lsw1-d1-eqiad.eqiad.wmnet (2620:0:861:108::2)
2.|-- lo50.ssw1-d1-eqiad.eqiad.wmnet (2620:0:861:130::1) 4.8% 1000 0.5 0.3 0.3 1.2 0.1
3.|-- cr1-eqiad.wikimedia.org (2620:0:861:ffff::1) 0.0% 1000 0.8 1.1 0.5 34.5 3.3
Fix
I couldn't find any configuration option that would let us change this behaviour. Tbh we should have spotted it earlier during testing, but we had fewer switches in the test setup (only two) and must have missed it.
I will open a ticket with Nokia about it, but it looks like they don't support using IPv6 RAs with Anycast GW. Hopefully they can make a change so that, if Anycast GW is configured, they either omit the "Source link-layer address" option from RAs or send the anycast MAC in it.
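If Nokia do provide a fix, either outcome could be checked against captured RAs. Below is a minimal sketch of such a check, assuming the raw ICMPv6 payload of the RA is already available as bytes (how it is captured is left out). The function names and the build_ra helper are hypothetical and only exist for illustration; the option layout (type 1, length in units of 8 bytes) follows RFC 4861.

```python
import struct

RA_TYPE = 134           # ICMPv6 type for Router Advertisement
RA_HEADER_LEN = 16      # fixed RA fields before the options start
OPT_SOURCE_LLADDR = 1   # "Source link-layer address" option type


def source_lladdr(icmpv6):
    """Return the MAC from the Source link-layer address option, or None if absent."""
    if len(icmpv6) < RA_HEADER_LEN or icmpv6[0] != RA_TYPE:
        raise ValueError("not a Router Advertisement")
    offset = RA_HEADER_LEN
    while offset + 2 <= len(icmpv6):
        opt_type, opt_len = icmpv6[offset], icmpv6[offset + 1]
        if opt_len == 0:
            raise ValueError("option with zero length")
        size = opt_len * 8  # option length is counted in 8-byte units
        if offset + size > len(icmpv6):
            raise ValueError("truncated option")
        if opt_type == OPT_SOURCE_LLADDR:
            mac = icmpv6[offset + 2:offset + 8]
            return ":".join(f"{b:02x}" for b in mac)
        offset += size
    return None


def ra_is_anycast_safe(icmpv6, anycast_mac):
    """True if the RA omits the option or carries the anycast MAC in it."""
    mac = source_lladdr(icmpv6)
    return mac is None or mac == anycast_mac.lower()


def build_ra(mac=None):
    """Hypothetical helper: build a bare RA for testing (checksum left at 0,
    router lifetime of 1800s chosen arbitrarily)."""
    header = struct.pack("!BBHBBHII", RA_TYPE, 0, 0, 64, 0, 1800, 0, 0)
    if mac is None:
        return header
    option = bytes([OPT_SOURCE_LLADDR, 1]) + bytes(int(p, 16) for p in mac.split(":"))
    return header + option


ANYCAST_MAC = "12:00:00:00:10:23"
print(ra_is_anycast_safe(build_ra("a8:e5:ec:78:4f:3c"), ANYCAST_MAC))  # current behaviour
print(ra_is_anycast_safe(build_ra(None), ANYCAST_MAC))                 # option omitted
print(ra_is_anycast_safe(build_ra(ANYCAST_MAC), ANYCAST_MAC))          # anycast MAC sent
```

The first call models what the switches send today and fails the check; the other two model the two possible fixes and pass.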