AS7843 connectivity is broken
Closed, Resolved (Public)

Authored By

	faidon
	Nov 14 2021, 11:35 AM

Description

I was notified that a user in #debian-mirrors reported a connectivity issue to our ftp.us.debian.org mirror (2620:0:861:1:208:80:154:15 aka sodium), for "about a week now".

They provided an mtr but not their IP address. I did not catch them in time to share the "reporting a connectivity issue" page with them.

However, the information that we have is already enough to pinpoint at least one issue:

The route for the first hop is 2603:6080::/28 and the route for the subsequent four is 2606:a000::/32. Both are fairly broad, so that customer of theirs is probably in there as well.
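
To put "fairly broad" in numbers, here is a minimal Python sketch using the standard `ipaddress` module. The two prefixes are the ones quoted above; the choice of /48 as a unit of comparison and the helper name `count_subnets` are assumptions made purely for illustration.

```python
import ipaddress

# Prefixes quoted in the task description.
PREFIXES = ["2603:6080::/28", "2606:a000::/32"]


def count_subnets(prefix: str, new_prefix_len: int = 48) -> int:
    """Return how many /new_prefix_len networks fit inside prefix."""
    net = ipaddress.ip_network(prefix)
    return 2 ** (new_prefix_len - net.prefixlen)


for prefix in PREFIXES:
    print(prefix, "contains", count_subnets(prefix), "/48 networks")
```

Under that assumption the /28 covers roughly a million /48s and the /32 covers 65536, which is consistent with a large part of the operator's customer base sitting behind the same two routes.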

Both of those routes have 2001:504:0:2::7843:1 as the next-hop, i.e. Charter's router on the Equinix IXP. The routes are learned through the peering that cr2-eqiad (and only cr2-eqiad) has with that IP. So for cr1-eqiad, the source of the route is cr2-eqiad; the 2001:504:0:2::/64 destination, however, is direct, through its own IXP port, xe-3/0/6.

But:

faidon@re0.cr2-eqiad> show ipv6 neighbors |match 2001:504:0:2::7843:1                         
2001:504:0:2::7843:1         2e:21:31:00:2f:9c  reachable   4   yes no      xe-3/3/3.0  
faidon@re0.cr1-eqiad> show ipv6 neighbors |match 2001:504:0:2::7843:1                  
2001:504:0:2::7843:1         none               unreachable 1   no  no      xe-3/0/6.0

sodium's active VRRP gateway is cr1-eqiad.
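
The two neighbor-table lines above can be compared mechanically. This is a rough sketch rather than an established tool: the field names (`address`, `lladdr`, `state`, `interface`) are labels assumed here for the whitespace-separated columns, and `routers_without_neighbor` is a hypothetical helper.

```python
# Lines copied from the 'show ipv6 neighbors' output above, keyed by router.
NEIGHBOR_OUTPUT = {
    "cr2-eqiad": "2001:504:0:2::7843:1  2e:21:31:00:2f:9c  reachable   4  yes no  xe-3/3/3.0",
    "cr1-eqiad": "2001:504:0:2::7843:1  none               unreachable 1  no  no  xe-3/0/6.0",
}


def parse_neighbor(line: str) -> dict:
    """Split one neighbor line into the fields this sketch cares about."""
    fields = line.split()
    return {
        "address": fields[0],
        "lladdr": fields[1],     # link-layer (MAC) address, or 'none'
        "state": fields[2],
        "interface": fields[-1],
    }


def routers_without_neighbor(output: dict) -> list:
    """Return the routers whose entry for the neighbor is not 'reachable'."""
    return [
        router
        for router, line in output.items()
        if parse_neighbor(line)["state"] != "reachable"
    ]


print(routers_without_neighbor(NEIGHBOR_OUTPUT))
```

Only cr1-eqiad is flagged, and that is the router sodium uses as its gateway.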

The report was IPv6-specific and did not mention IPv4. However:

faidon@re0.cr1-eqiad> ping count 2 206.126.238.34 
PING 206.126.238.34 (206.126.238.34): 56 data bytes

--- 206.126.238.34 ping statistics ---
2 packets transmitted, 0 packets received, 100% packet loss

faidon@re0.cr2-eqiad> ping count 2 206.126.238.34    
PING 206.126.238.34 (206.126.238.34): 56 data bytes
64 bytes from 206.126.238.34: icmp_seq=0 ttl=64 time=1.308 ms
64 bytes from 206.126.238.34: icmp_seq=1 ttl=64 time=0.828 ms

--- 206.126.238.34 ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max/stddev = 0.828/1.068/1.308/0.240 ms

(206.126.238.34 being 7843's IPv4 on the IXP)

My guess would be that this is Charter filtering traffic on their IXP port to only routers they have peerings with, for security/anti-DDoS reasons.

I'm not sure if this is because we gave them our router's MAC address when we peered, or if they're doing that by means of ARP/NDP with the IP of the router they peer with. More broadly, our setup right now is "cr2-eqiad has the peering but cr1-eqiad can and will send you traffic", which is probably unusual and breaks network ingress assumptions that exist out there.

Details

	Subject	Repo	Branch	Lines +/-
	Change MAC address in DHCP config for rpki1001	operations/puppet	production	+1 -1

Related Objects

Mentioned In: rOHPUe35f6e8db0e2: Add policy-statement to CRs which sets next-hop self in iBGP.
T295672: Use next-hop-self for iBGP sessions
Mentioned Here: T295672: Use next-hop-self for iBGP sessions

Event Timeline

faidon triaged this task as High priority. Nov 14 2021, 11:35 AM

faidon created this task.

Restricted Application added a subscriber: Aklapper. Nov 14 2021, 11:35 AM

Maintenance_bot added a project: SRE. Nov 14 2021, 11:45 AM

Mentioned in SAL (#wikimedia-operations) [2021-11-14T11:48:42Z] <paravoid> disable cr1-eqiad:xe-3/0/6 (IXP port) to mitigate T295650

I disabled the Equinix IXP port on cr1-eqiad, xe-3/0/6, just a few moments ago, in order to mitigate this issue. Checked with @ayounsi on IRC first, who is now aware of this task.

Connectivity from sodium to hops 1-5 of their mtr seems to have been restored (previously "address unreachable").

Update 13:50 UTC: the reporting user confirmed over IRC that connectivity has been restored and that they can now access our Debian mirror.

There is definitely a noticeable difference in traffic patterns from Nov 4th or so:

Screenshot 2021-11-14 at 13-55-31 Turnilo (1 29 0).png (684×992 px, 57 KB)

IPv4 seems to show the opposite pattern (ramping up on Nov 4th), so the combined IPv4+IPv6 traffic looks seemingly unaffected/"normal". This is probably Happy Eyeballs, i.e. users falling back to IPv4 when IPv6 was unreachable (at a performance penalty). So IPv4 traffic was likely unaffected, despite the ping evidence in the task description above, for reasons that are not entirely clear to me yet.
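
As a simplified illustration of the fallback behaviour described above, the sketch below orders a host's addresses IPv6-first with families alternating, then picks the first one that answers. It is a sequential approximation only: real Happy Eyeballs clients race connection attempts with a short stagger. The function names, the `is_reachable` callback and the documentation-range addresses are all assumptions, not anything taken from the affected clients.

```python
import ipaddress
from itertools import zip_longest


def interleave_by_family(addresses):
    """Order addresses IPv6-first, alternating between address families."""
    v6 = [a for a in addresses if ipaddress.ip_address(a).version == 6]
    v4 = [a for a in addresses if ipaddress.ip_address(a).version == 4]
    ordered = []
    for six, four in zip_longest(v6, v4):
        if six is not None:
            ordered.append(six)
        if four is not None:
            ordered.append(four)
    return ordered


def first_reachable(addresses, is_reachable):
    """Return the first address (in interleaved order) that is reachable."""
    for addr in interleave_by_family(addresses):
        if is_reachable(addr):
            return addr
    return None


# Hypothetical mirror with one address per family (documentation ranges).
mirror = ["192.0.2.15", "2001:db8::15"]

# IPv6 path broken: the client ends up on IPv4.
print(first_reachable(mirror, lambda a: ipaddress.ip_address(a).version == 4))
# Both paths working: IPv6 is preferred.
print(first_reachable(mirror, lambda a: True))
```

A shift of this kind would move traffic from the IPv6 graph to the IPv4 graph while leaving the total roughly flat, matching the pattern in the screenshot.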

RhinosF1 subscribed. Nov 14 2021, 12:49 PM

Thanks for taking care of it. Proper fix is most likely T295672.

Change 738873 had a related patch set uploaded (by Cathal Mooney; author: Cathal Mooney):

[operations/puppet@production] Updating MAC address for install server DHCP config for rpki1001 as it is being rebuilt to provide more disk space and has a new MAC.

https://gerrit.wikimedia.org/r/738873

gerritbot added a project: Patch-For-Review. Nov 15 2021, 11:19 AM

Change 738873 merged by Cathal Mooney:

[operations/puppet@production] Change MAC address in DHCP config for rpki1001

https://gerrit.wikimedia.org/r/738873

Please ignore the above, unrelated CRs. I pasted the wrong task ID when doing the commit.

Maintenance_bot removed a project: Patch-For-Review. Nov 15 2021, 12:10 PM

cmooney claimed this task. Nov 15 2021, 12:24 PM

> My guess would be that this is Charter filtering traffic on their IXP port to only routers they have peerings with, for security/anti-DDoS reasons.
>
> I'm not sure if this is because we gave them our router's MAC address when we peered, or if they're doing that by means of ARP/NDP with the IP of the router they peer with. More broadly, our setup right now is "cr2-eqiad has the peering but cr1-eqiad can and will send you traffic", which is probably unusual and breaks network ingress assumptions that exist out there.

100% right, I think. I did not expect to see that; as you allude to, configuring this filtering based on the MAC address learnt via ARP/ND would not be trivial, but perhaps some vendor does this automatically, or with uRPF enabled or something.

Either way, I think the best solution is to enable the "next-hop self" option as @ayounsi says. It should be relatively straightforward to implement and should avoid any such edge cases caused by both CRs being connected to the same peering LAN in future. I will get that rolled out as part of the dedicated task, then we can revisit this one to re-enable the second IXP port and test the status of connectivity to Charter.
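
The effect of "next-hop self" on cr1-eqiad's forwarding decision can be sketched as a toy model. This is an illustration under stated assumptions, not router behaviour verified against Junos: the function `egress_from_cr1` and its parameters are hypothetical, and the only inputs taken from this task are the IXP LAN prefix, Charter's IXP address and cr2-eqiad's loopback.

```python
import ipaddress

IXP_LAN = ipaddress.ip_network("2001:504:0:2::/64")
CHARTER_IXP = ipaddress.ip_address("2001:504:0:2::7843:1")
CR2_LOOPBACK = ipaddress.ip_address("2620:0:861:ffff::2")


def egress_from_cr1(bgp_next_hop, ixp_port_up, charter_answers_cr1):
    """Model where cr1-eqiad sends traffic for a Charter prefix learned via cr2-eqiad."""
    if ixp_port_up and bgp_next_hop in IXP_LAN:
        # The next hop sits on a directly connected LAN, so cr1 tries to
        # reach it through its own IXP port instead of going via cr2.
        if charter_answers_cr1:
            return "delivered via cr1 IXP port"
        return "dropped: neighbor unreachable on cr1 IXP port"
    # Otherwise the next hop resolves through the internal link to cr2.
    return "delivered via cr2-eqiad"


# Original state: next hop unchanged, cr1 port up, Charter ignores cr1.
print(egress_from_cr1(CHARTER_IXP, True, False))
# Mitigation: cr1's IXP port disabled.
print(egress_from_cr1(CHARTER_IXP, False, False))
# Proposed fix: cr2 rewrites the next hop to its own loopback.
print(egress_from_cr1(CR2_LOOPBACK, True, False))
```

In this model both the port shutdown and the next-hop rewrite steer traffic through cr2-eqiad, but only the rewrite does so while leaving cr1-eqiad's IXP port usable for its own peerings.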

cmooney mentioned this in rOHPUe35f6e8db0e2: Add policy-statement to CRs which sets next-hop self in iBGP. Nov 15 2021, 1:20 PM

Mentioned in SAL (#wikimedia-operations) [2021-11-18T10:00:06Z] <topranks> Re-enabling Equinix IXP port on cr1-eqiad following iBGP changes to address T295650

The next-hop self policy has been applied on cr1-eqiad and cr2-eqiad, in the Confed_eqiad group, to address this issue.

cr2-eqiad is now announcing all routes learnt on its Equinix IXP port to cr1-eqiad with its own loopback IP as next-hop. For instance:

cmooney@re0.cr1-eqiad> show route table inet6.0 receive-protocol bgp 2620:0:861:ffff::2 aspath-regex ".* 11426$" 

inet6.0: 138571 destinations, 751635 routes (138238 active, 1 holddown, 2076 hidden)
Restart Complete
  Prefix		  Nexthop	       MED     Lclpref    AS path
* 2600:5800::/32          2620:0:861:ffff::2   0       250        7843 11426 ?
* 2603:6080::/28          2620:0:861:ffff::2   0       250        7843 11426 ?
* 2603:60bd::/32          2620:0:861:ffff::2   0       250        7843 11426 ?
* 2603:90bb::/32          2620:0:861:ffff::2   0       250        7843 11426 ?
* 2606:a000::/32          2620:0:861:ffff::2   0       250        7843 11426 ?

This ensures that, even now that cr1-eqiad's local port on the Equinix IXP LAN is up, the route for these prefixes on cr1-eqiad still goes via cr2:

cmooney@re0.cr1-eqiad> show route protocol bgp 2603:6080::/28 

inet6.0: 138569 destinations, 636889 routes (138253 active, 0 holddown, 318 hidden)
Restart Complete
+ = Active Route, - = Last Active, * = Both

2603:6080::/28     *[BGP/170] 01:09:59, MED 0, localpref 250, from 2620:0:861:ffff::2
                      AS path: 7843 11426 ?, validation-state: valid
                    > to fe80::8618:88ff:fe0d:dfc5 via ae0.0
                      to fe80::ee38:7300:ce8:9c56 via xe-4/2/2.12

cmooney@re0.cr1-eqiad> show interfaces descriptions | match ae0      
ae0             up    up   cr2-eqiad:ae0
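
The received-route table earlier in this comment can also be checked programmatically. The sketch below parses the five route lines as pasted and confirms that each one carries cr2-eqiad's loopback as next hop. The column interpretation in `parse_route` is an assumption based on the header row shown (Prefix, Nexthop, MED, Lclpref, AS path), and the helper itself is hypothetical.

```python
import ipaddress

CR2_LOOPBACK = ipaddress.ip_address("2620:0:861:ffff::2")

# Route lines copied from the 'receive-protocol bgp' output above.
ROUTE_LINES = """\
* 2600:5800::/32          2620:0:861:ffff::2   0       250        7843 11426 ?
* 2603:6080::/28          2620:0:861:ffff::2   0       250        7843 11426 ?
* 2603:60bd::/32          2620:0:861:ffff::2   0       250        7843 11426 ?
* 2603:90bb::/32          2620:0:861:ffff::2   0       250        7843 11426 ?
* 2606:a000::/32          2620:0:861:ffff::2   0       250        7843 11426 ?
""".splitlines()


def parse_route(line: str) -> dict:
    """Parse one route line: marker, prefix, next hop, MED, localpref, AS path, origin."""
    fields = line.split()
    return {
        "prefix": ipaddress.ip_network(fields[1]),
        "next_hop": ipaddress.ip_address(fields[2]),
        "as_path": [int(asn) for asn in fields[5:-1]],  # drop trailing origin code
    }


routes = [parse_route(line) for line in ROUTE_LINES]
all_via_cr2 = all(route["next_hop"] == CR2_LOOPBACK for route in routes)
print(len(routes), "routes, all with cr2 loopback as next hop:", all_via_cr2)
```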

Picking a random Charter IPv4 address that I know responds to ping, you can see the traffic routing out via cr2, and the ping is successful:

cmooney@re0.cr1-eqiad> ping 24.25.12.99 source 208.80.154.196 
PING 24.25.12.99 (24.25.12.99): 56 data bytes
64 bytes from 24.25.12.99: icmp_seq=0 ttl=248 time=8.048 ms
64 bytes from 24.25.12.99: icmp_seq=1 ttl=248 time=7.496 ms
64 bytes from 24.25.12.99: icmp_seq=2 ttl=248 time=7.675 ms

cmooney@re0.cr1-eqiad> traceroute 24.25.12.99 source 208.80.154.196 no-resolve wait 1
traceroute to 24.25.12.99 (24.25.12.99) from 208.80.154.196, 30 hops max, 52 byte packets
 1  208.80.154.194  0.440 ms  0.505 ms  0.397 ms
 2  206.126.238.34  0.881 ms  0.720 ms  1.159 ms
 3  209.18.43.58  1.184 ms 66.109.5.116  1.181 ms  1.103 ms
 4  66.109.6.225  7.832 ms  9.061 ms 66.109.6.81  8.243 ms
 5  24.93.64.51  8.391 ms  7.708 ms  7.626 ms
 6  * * *
 7  * * *
 8  * * *

cmooney@re0.cr1-eqiad> show route 208.80.154.194 

inet.0: 860569 destinations, 3781957 routes (860097 active, 0 holddown, 2585 hidden)
Restart Complete
+ = Active Route, - = Last Active, * = Both

208.80.154.192/30  *[Direct/0] 63w0d 23:53:47
                    > via ae0.0
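
The first traceroute hop can be tied to the directly connected /30 with a small check. This is a minimal sketch using Python's standard `ipaddress` module; `is_on_link` is a hypothetical helper name, and the addresses are the ones shown in the output above.

```python
import ipaddress


def is_on_link(address: str, network: str) -> bool:
    """Return True if address falls inside the given connected network."""
    return ipaddress.ip_address(address) in ipaddress.ip_network(network)


DIRECT_ROUTE = "208.80.154.192/30"  # Direct route via ae0.0 on cr1-eqiad

# Hop 1 of the traceroute sits on the cr1 <-> cr2 link.
print(is_on_link("208.80.154.194", DIRECT_ROUTE))
# Hop 2 (Charter's IXP address) is not on that link.
print(is_on_link("206.126.238.34", DIRECT_ROUTE))
```

Hop 1 falling inside the ae0 subnet while hop 2 is Charter's IXP address is what indicates the traffic crossed to cr2-eqiad before reaching the exchange.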

	F34746031: Screenshot 2021-11-14 at 13-55-31 Turnilo (1 29 0).png
	Nov 14 2021, 12:05 PM
