The planned move from 4x10G links to a single 40G link between cr1-eqiad and asw2-d-eqiad has now been attempted twice, and problems occurred on both occasions.
The first attempt is documented in the incident report. A second attempt was made today, Oct 11th, taking greater care to double-check interface status, ensure a rollback was ready, etc.
Once again the same issue was encountered. The physical interfaces were up and looked healthy on both sides, and MAC addresses were being learnt on the switch, but problems were observed once the IP addressing was added to each vlan sub-interface on the router.
Specifically, it seems that traffic destined for directly connected hosts on the row D vlans was not making it from CR1 to the hosts. Outbound traffic from hosts in row D still worked, as cr2-eqiad remained VRRP master / gateway for each subnet. But traffic destined for the hosts did not arrive if it routed via cr1-eqiad (due to how the routing works, some traffic for the subnets routed via cr2-eqiad, and thus also worked in that direction).
A few initial checks were done this evening to try to isolate where the problem was.
Test 1: cr1-eqiad to row D host using test subnet
We configured IP 198.18.0.1/30 (from the RFC 2544 test range) on cr1-eqiad ae4.1004 (public1-d-eqiad), then added 198.18.0.2/30 as a secondary IP on virtual machine doh1002, which is on that vlan (without interfering with its primary IP, of course).
With this in place we could ping fine from cr1-eqiad to doh1002 on the test IPs. This tells us that the new 40G link, optics and switch config etc. are all fine and traffic should be able to flow over the interface to hosts.
Test 2: cr1-eqiad to host check
As things appeared to work with a test / new subnet we wanted to use IP addressing that was part of our production range, to see if the specific ranges in question were part of the problem.
To do this we assigned unused IP address 10.64.48.4/31 to ae4.1020 of cr1-eqiad. This is a smaller network that falls within the 10.64.48.0/22 range of the private1-d-eqiad subnet. Neither of the IPs in the /31 was already in use, making it safe to apply.
With this in place 10.64.48.5/31 was configured on sretest1001. With both of these up, comms between the devices worked fine: we could ping 10.64.48.5 from 10.64.48.4 on cr1.
After this was done the 10.64.48.5 IP was added as a secondary IP on cr2-eqiad ae4.1020. With this IP on cr2-eqiad, pings could be made between cr1-eqiad (10.64.48.4) and cr2-eqiad (10.64.48.5) without problem.
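The relationship between the test /31 and the production /22 can be checked with a short Python sketch using the standard ipaddress module. This is only an illustration of the addressing used in Test 2; the variable names are made up for the example.

```python
import ipaddress

# Ranges taken from Test 2 above; variable names are illustrative only.
production = ipaddress.ip_network("10.64.48.0/22")  # private1-d-eqiad
test_link = ipaddress.ip_network("10.64.48.4/31")   # temporary test range

# The /31 sits entirely inside the production /22.
inside = test_link.subnet_of(production)

# A /31 contains exactly two addresses, one for each end of the test.
addresses = [str(ip) for ip in test_link]

print(inside)
print(addresses)
```

Both test addresses are therefore part of the production range, which is why this test is a step closer to the real configuration than Test 1.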
Test 3: add unicast IP from private1-d-eqiad to cr1-eqiad, but don't enable VRRP
To check whether the issue was related to participation in VRRP (cr1 would not have taken over as master regardless, but in case this was somehow happening), the normal unicast IP of cr1 on this subnet, 10.64.48.2/22, was added to ae4.1020.
As soon as this config was applied problems again became apparent, so a quick rollback was done. Anticipating potential problems, however, several pre-prepared checks were run while the config was on cr1, and these shed some light on the problem state.
- Observation 1: CR1 appears to send ARP requests for hosts on the subnet, but does not report any responses from those hosts.
When the connected subnet is added to cr1 we expect it will send ARP requests for host IPs on the same subnet. Running a monitor traffic command while the problem was present showed that cr1 was apparently generating such requests:
cmooney@re0.cr1-eqiad> monitor traffic interface ae4.1020 no-resolve layer2-headers matching arp
Oct 11 17:50:36
verbose output suppressed, use <detail> or <extensive> for full protocol decode
Address resolution is OFF.
Listening on ae4.1020, capture size 96 bytes
17:50:36.098478 In 84:18:88:0d:df:c9 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 60: vlan 1020, p 0, ethertype ARP, arp who-has 10.64.48.41 tell 10.64.48.3
17:50:39.481942 Out 5c:5e:ab:3d:87:c4 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 1020, p 0, ethertype ARP, arp who-has 10.64.48.2 tell 10.64.48.2
17:50:39.489991 Out 5c:5e:ab:3d:87:c4 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 1020, p 0, ethertype ARP, arp who-has 10.64.48.105 tell 10.64.48.2
17:50:39.490115 Out 5c:5e:ab:3d:87:c4 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 1020, p 0, ethertype ARP, arp who-has 10.64.48.230 tell 10.64.48.2
17:50:39.490198 Out 5c:5e:ab:3d:87:c4 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 1020, p 0, ethertype ARP, arp who-has 10.64.48.229 tell 10.64.48.2
17:50:39.490269 Out 5c:5e:ab:3d:87:c4 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 1020, p 0, ethertype ARP, arp who-has 10.64.48.64 tell 10.64.48.2
17:50:39.490394 Out 5c:5e:ab:3d:87:c4 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 1020, p 0, ethertype ARP, arp who-has 10.64.48.21 tell 10.64.48.2
17:50:39.490532 Out 5c:5e:ab:3d:87:c4 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 1020, p 0, ethertype ARP, arp who-has 10.64.48.52 tell 10.64.48.2
17:50:39.490710 Out 5c:5e:ab:3d:87:c4 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 1020, p 0, ethertype ARP, arp who-has 10.64.48.106 tell 10.64.48.2
17:50:39.490801 Out 5c:5e:ab:3d:87:c4 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 1020, p 0, ethertype ARP, arp who-has 10.64.48.50 tell 10.64.48.2
17:50:39.490868 Out 5c:5e:ab:3d:87:c4 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 1020, p 0, ethertype ARP, arp who-has 10.64.48.113 tell 10.64.48.2
17:50:39.491016 Out 5c:5e:ab:3d:87:c4 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 1020, p 0, ethertype ARP, arp who-has 10.64.48.95 tell 10.64.48.2
17:50:39.510238 Out 5c:5e:ab:3d:87:c4 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 1020, p 0, ethertype ARP, arp who-has 10.64.48.131 tell 10.64.48.2
17:50:39.510351 Out 5c:5e:ab:3d:87:c4 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 1020, p 0, ethertype ARP, arp who-has 10.64.48.225 tell 10.64.48.2
17:50:39.510374 Out 5c:5e:ab:3d:87:c4 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 1020, p 0, ethertype ARP, arp who-has 10.64.48.210 tell 10.64.48.2
17:50:39.510393 Out 5c:5e:ab:3d:87:c4 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 1020, p 0, ethertype ARP, arp who-has 10.64.48.83 tell 10.64.48.2
17:50:39.510411 Out 5c:5e:ab:3d:87:c4 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 1020, p 0, ethertype ARP, arp who-has 10.64.48.96 tell 10.64.48.2
17:50:39.510429 Out 5c:5e:ab:3d:87:c4 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 1020, p 0, ethertype ARP, arp who-has 10.64.48.14 tell 10.64.48.2
17:50:39.510446 Out 5c:5e:ab:3d:87:c4 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 1020, p 0, ethertype ARP, arp who-has 10.64.48.82 tell 10.64.48.2
17:50:39.510464 Out 5c:5e:ab:3d:87:c4 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 1020, p 0, ethertype ARP, arp who-has 10.64.48.78 tell 10.64.48.2
17:50:39.510481 Out 5c:5e:ab:3d:87:c4 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 1020, p 0, ethertype ARP, arp who-has 10.64.48.132 tell 10.64.48.2
17:50:39.510577 Out 5c:5e:ab:3d:87:c4 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 1020, p 0, ethertype ARP, arp who-has 10.64.48.216 tell 10.64.48.2
17:50:39.510673 Out 5c:5e:ab:3d:87:c4 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 1020, p 0, ethertype ARP, arp who-has 10.64.48.84 tell 10.64.48.2
17:50:39.510977 Out 5c:5e:ab:3d:87:c4 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 1020, p 0, ethertype ARP, arp who-has 10.64.48.217 tell 10.64.48.2
17:50:39.511000 Out 5c:5e:ab:3d:87:c4 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 1020, p 0, ethertype ARP, arp who-has 10.64.48.13 tell 10.64.48.2
17:50:39.511112 Out 5c:5e:ab:3d:87:c4 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 1020, p 0, ethertype ARP, arp who-has 10.64.48.220 tell 10.64.48.2
17:50:39.511133 Out 5c:5e:ab:3d:87:c4 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 1020, p 0, ethertype ARP, arp who-has 10.64.48.219 tell 10.64.48.2
17:50:39.511150 Out 5c:5e:ab:3d:87:c4 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 1020, p 0, ethertype ARP, arp who-has 10.64.48.213 tell 10.64.48.2
17:50:39.511168 Out 5c:5e:ab:3d:87:c4 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 1020, p 0, ethertype ARP, arp who-has 10.64.48.223 tell 10.64.48.2
17:50:39.511185 Out 5c:5e:ab:3d:87:c4 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 1020, p 0, ethertype ARP, arp who-has 10.64.48.221 tell 10.64.48.2
17:50:39.511202 Out 5c:5e:ab:3d:87:c4 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 1020, p 0, ethertype ARP, arp who-has 10.64.48.202 tell 10.64.48.2
17:50:39.511219 Out 5c:5e:ab:3d:87:c4 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 1020, p 0, ethertype ARP, arp who-has 10.64.48.199 tell 10.64.48.2
17:50:39.511352 Out 5c:5e:ab:3d:87:c4 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 1020, p 0, ethertype ARP, arp who-has 10.64.48.195 tell 10.64.48.2
17:50:39.511463 Out 5c:5e:ab:3d:87:c4 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 1020, p 0, ethertype ARP, arp who-has 10.64.48.193 tell 10.64.48.2
17:50:39.511492 Out 5c:5e:ab:3d:87:c4 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 1020, p 0, ethertype ARP, arp who-has 10.64.48.103 tell 10.64.48.2
17:50:39.511673 Out 5c:5e:ab:3d:87:c4 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 1020, p 0, ethertype ARP, arp who-has 10.64.48.109 tell 10.64.48.2
17:50:39.511704 Out 5c:5e:ab:3d:87:c4 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 1020, p 0, ethertype ARP, arp who-has 10.64.48.206 tell 10.64.48.2
17:50:39.511843 Out 5c:5e:ab:3d:87:c4 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 1020, p 0, ethertype ARP, arp who-has 10.64.48.25 tell 10.64.48.2
17:50:39.511888 Out 5c:5e:ab:3d:87:c4 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 1020, p 0, ethertype ARP, arp who-has 10.64.48.228 tell 10.64.48.2
17:50:39.511908 Out 5c:5e:ab:3d:87:c4 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 1020, p 0, ethertype ARP, arp who-has 10.64.48.66 tell 10.64.48.2
17:50:39.512081 Out 5c:5e:ab:3d:87:c4 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 1020, p 0, ethertype ARP, arp who-has 10.64.48.129 tell 10.64.48.2
17:50:39.512104 Out 5c:5e:ab:3d:87:c4 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 1020, p 0, ethertype ARP, arp who-has 10.64.48.19 tell 10.64.48.2
17:50:39.512122 Out 5c:5e:ab:3d:87:c4 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 1020, p 0, ethertype ARP, arp who-has 10.64.48.31 tell 10.64.48.2
17:50:39.512215 Out 5c:5e:ab:3d:87:c4 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 1020, p 0, ethertype ARP, arp who-has 10.64.48.226 tell 10.64.48.2
17:50:39.512238 Out 5c:5e:ab:3d:87:c4 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 1020, p 0, ethertype ARP, arp who-has 10.64.48.214 tell 10.64.48.2
17:50:39.512288 Out 5c:5e:ab:3d:87:c4 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 1020, p 0, ethertype ARP, arp who-has 10.64.48.222 tell 10.64.48.2
17:50:39.512372 Out 5c:5e:ab:3d:87:c4 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 1020, p 0, ethertype ARP, arp who-has 10.64.48.91 tell 10.64.48.2
17:50:39.512456 Out 5c:5e:ab:3d:87:c4 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 1020, p 0, ethertype ARP, arp who-has 10.64.48.176 tell 10.64.48.2
17:50:39.512479 Out 5c:5e:ab:3d:87:c4 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 1020, p 0, ethertype ARP, arp who-has 10.64.48.137 tell 10.64.48.2
17:50:39.512496 Out 5c:5e:ab:3d:87:c4 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 1020, p 0, ethertype ARP, arp who-has 10.64.48.204 tell 10.64.48.2
17:50:39.512777 Out 5c:5e:ab:3d:87:c4 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 1020, p 0, ethertype ARP, arp who-has 10.64.48.104 tell 10.64.48.2
17:50:39.513019 Out 5c:5e:ab:3d:87:c4 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 1020, p 0, ethertype ARP, arp who-has 10.64.48.33 tell 10.64.48.2
17:50:39.514103 Out 5c:5e:ab:3d:87:c4 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 1020, p 0, ethertype ARP, arp who-has 10.64.48.92 tell 10.64.48.2
17:50:39.514237 Out 5c:5e:ab:3d:87:c4 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 1020, p 0, ethertype ARP, arp who-has 10.64.48.154 tell 10.64.48.2
17:50:39.514257 Out 5c:5e:ab:3d:87:c4 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 1020, p 0, ethertype ARP, arp who-has 10.64.48.101 tell 10.64.48.2
17:50:39.514275 Out 5c:5e:ab:3d:87:c4 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 1020, p 0, ethertype ARP, arp who-has 10.64.48.85 tell 10.64.48.2
17:50:39.514292 Out 5c:5e:ab:3d:87:c4 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 1020, p 0, ethertype ARP, arp who-has 10.64.48.93 tell 10.64.48.2
17:50:39.514314 Out 5c:5e:ab:3d:87:c4 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 1020, p 0, ethertype ARP, arp who-has 10.64.48.70 tell 10.64.48.2
17:50:39.514407 Out 5c:5e:ab:3d:87:c4 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 1020, p 0, ethertype ARP, arp who-has 10.64.48.35 tell 10.64.48.2
17:50:39.514430 Out 5c:5e:ab:3d:87:c4 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 1020, p 0, ethertype ARP, arp who-has 10.64.48.86 tell 10.64.48.2
17:50:39.514460 Out 5c:5e:ab:3d:87:c4 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 1020, p 0, ethertype ARP, arp who-has 10.64.48.90 tell 10.64.48.2
17:50:39.514544 Out 5c:5e:ab:3d:87:c4 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 1020, p 0, ethertype ARP, arp who-has 10.64.48.56 tell 10.64.48.2
17:50:39.514566 Out 5c:5e:ab:3d:87:c4 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 1020, p 0, ethertype ARP, arp who-has 10.64.48.201 tell 10.64.48.2
17:50:39.514588 Out 5c:5e:ab:3d:87:c4 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 1020, p 0, ethertype ARP, arp who-has 10.64.48.215 tell 10.64.48.2
<-- rest omitted -->
What is interesting here is that there are no "In" packets carrying ARP replies from any of these hosts. As a result the ARP table for the interface is mostly empty:
cmooney@re0.cr1-eqiad# run show arp no-resolve interface ae4.1020
MAC Address       Address         Interface     Flags
84:18:88:0d:df:c9 10.64.48.3      ae4.1020      none
bc:97:e1:c0:17:30 10.64.48.66     ae4.1020      none
Ideally a packet capture / tcpdump would have been done on a connected host to see if the ARPs cr1 reports sending actually made it to end hosts or not. This should definitely be checked in any further tests.
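For a repeat of this test, a saved capture (from cr1 or from a connected host) could be summarised by direction and ARP type, which makes a missing class of packets obvious at a glance. The following is a minimal Python sketch and was not part of the checks run here. The function name tally_arp is hypothetical, and the pattern assumes replies are printed in the usual tcpdump style ("arp reply ... is-at ..."), which does not appear in the capture above.

```python
import re
from collections import Counter

# Matches one packet line in the capture format shown above.
# Assumption: ARP replies are printed as "arp reply <ip> is-at <mac>".
PACKET = re.compile(
    r"^(?P<time>\d{2}:\d{2}:\d{2}\.\d+)\s+(?P<direction>In|Out)\s+"
    r"(?P<src>[0-9a-f:]{17}) > (?P<dst>[0-9a-f:]{17}),.*"
    r"arp (?P<kind>who-has|reply) "
)


def tally_arp(lines):
    """Count ARP packets by (direction, kind); non-packet lines are skipped."""
    counts = Counter()
    for line in lines:
        match = PACKET.match(line.strip())
        if match:
            counts[(match["direction"], match["kind"])] += 1
    return counts


# Three packet lines copied from the capture above, plus one header line.
sample = [
    "Listening on ae4.1020, capture size 96 bytes",
    "17:50:36.098478 In 84:18:88:0d:df:c9 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), "
    "length 60: vlan 1020, p 0, ethertype ARP, arp who-has 10.64.48.41 tell 10.64.48.3",
    "17:50:39.481942 Out 5c:5e:ab:3d:87:c4 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), "
    "length 46: vlan 1020, p 0, ethertype ARP, arp who-has 10.64.48.2 tell 10.64.48.2",
    "17:50:39.489991 Out 5c:5e:ab:3d:87:c4 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), "
    "length 46: vlan 1020, p 0, ethertype ARP, arp who-has 10.64.48.105 tell 10.64.48.2",
]

counts = tally_arp(sample)
for key, value in sorted(counts.items()):
    print(key, value)
```

A healthy interface would be expected to show a non-zero count for ("In", "reply"). Running the same tally over a capture taken on an end host would show whether the requests from cr1 arrive there at all.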
- Observation 2: CR1 responded to ARP requests and pings to 10.64.48.2 when it was configured
For instance, when arping was run from sretest1001, responses were received from cr1-eqiad while it had 10.64.48.2/22 configured on ae4.1020:
ARPING 10.64.48.2 from 10.64.48.138 eno1
Unicast reply from 10.64.48.2 [5c:5e:ab:3d:87:c4] 0.872ms
Unicast reply from 10.64.48.2 [5c:5e:ab:3d:87:c4] 0.845ms
Sent 2 probe(s) (0 broadcast(s))
Received 2 response(s) (0 request(s), 0 broadcast(s))
- Observation 3: Routing from CR2 to hosts on these subnets was not affected
Testing from lvs1016, on the private1-d-eqiad vlan, showed that traffic to remote hosts on vlans using cr2-eqiad as gateway (i.e. where cr2 was VRRP master) was not affected. Traffic to remote subnets that used cr1-eqiad as gateway, however, did not get any response.
cmooney@lvs1016:~$ sudo traceroute -I -w 1 netmon1002.wikimedia.org
traceroute to netmon1002.wikimedia.org (208.80.154.5), 30 hops max, 60 byte packets
 1 ae4-1020.cr2-eqiad.wikimedia.org (10.64.48.3) 0.275 ms 0.271 ms 0.302 ms
 2 * * *
 3 * * *
 4 * * *
cmooney@lvs1016:~$
cmooney@lvs1016:~$ sudo traceroute -I -w 1 208.80.154.30
traceroute to alert1001.wikimedia.org (208.80.154.88), 30 hops max, 60 byte packets
 1 ae4-1020.cr2-eqiad.wikimedia.org (10.64.48.3) 0.512 ms 0.509 ms 0.554 ms
 2 alert1001.wikimedia.org (208.80.154.88) 0.189 ms 0.199 ms 0.197 ms
The explanation here is that return traffic for the 10.64.48.0/22 subnet reaches the host when it routes back via cr2, but not when it routes back via cr1. It also confirms that the issue is with cr1 forwarding to hosts on its directly connected interface, and is not related to any routing changes that occur when CR1 announces the ranges via OSPF after they are applied.
Further validating that the issue is within CR1, pings from lvs1016 to CR1's loopback interface were observed to stop when the range was applied on cr1-eqiad ae4.1020, i.e. the issue is that cr1-eqiad cannot transmit traffic out to hosts directly on the subnet.
Further Tests
As observed, cr1-eqiad does not seem to be able to ARP for hosts on the directly connected networks when the physical links in ae4 are changed from the 4x10G bundle to 1x40G. However, it does respond to hosts which send ARP requests to it during this time.
As could be seen above, there were certain entries in the ae4.1020 ARP table while the problem was occurring. One theory is that CR1 does properly add entries to the ARP table when it receives an ARP request from a host on the subnet, but for whatever reason fails to do so when an end host sends it an ARP response following requests it generated itself.
Some of the tests, like pinging with the smaller subnet or test networks, did not appear to work at first for us, but then did. We weren't paying attention to which side initiated pings etc. (and thus ARP request vs ARP response), but this may be a factor. As cr2 is VRRP master on this subnet most hosts will have no reason to send ARP requests for cr1's IP (10.64.48.2), so to build the ARP table for them cr1 needs to process responses from end hosts. The few devices it could ping may have worked because those devices had first sent an ARP request for it. To be tested further.
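The theory above can be written down as a toy model to make its prediction concrete. This is a Python sketch of the hypothesis only, not of how the router actually implements ARP. The class and function names are invented for the example, the MAC for 10.64.48.105 is made up, and which packets cr1 really received is an assumption.

```python
class ArpTable:
    """Toy ARP table: always learns from requests, optionally from replies."""

    def __init__(self, learn_from_replies):
        self.learn_from_replies = learn_from_replies
        self.entries = {}

    def receive(self, kind, sender_ip, sender_mac):
        # kind is "request" (peer asked first) or "reply" (answer to our request)
        if kind == "request" or self.learn_from_replies:
            self.entries[sender_ip] = sender_mac


def simulate(learn_from_replies):
    table = ArpTable(learn_from_replies)
    # Assumption: these two peers sent ARP requests that cr1 processed.
    table.receive("request", "10.64.48.3", "84:18:88:0d:df:c9")
    table.receive("request", "10.64.48.66", "bc:97:e1:c0:17:30")
    # Assumption: this host only ever answers requests that cr1 generated.
    table.receive("reply", "10.64.48.105", "02:00:00:00:00:01")  # made-up MAC
    return set(table.entries)


healthy = simulate(learn_from_replies=True)
suspected = simulate(learn_from_replies=False)
print(sorted(healthy - suspected))
```

Under the suspected behaviour only the peers that sent requests first end up in the table, which would be consistent with the two entries seen on ae4.1020. A host-side capture is still needed to tell this apart from the requests never reaching the hosts.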