
Peering: prefer primary IXP for directly connected networks
Closed, Resolved · Public

Description

See

In T280054: BGP: prioritize directly connected peers, I introduced a new rule/community to prioritize directly connected networks. However, I didn't take into account the primary IXPs' directly connected networks.

So the current order of priority (see also the doc) is PEERING_ROUTE < PEER_CUSTOMER < PEERING_ROUTE_PRIMARY < PEER_PRIVATE_PEER < DIRECT_PEER

For sites where we have a primary and a secondary IXP, networks that are not directly connected (path length > 1) are properly forced onto the primary IXP when their paths are learned from both IXPs, as they're tagged with PEERING_ROUTE_PRIMARY.
But directly connected networks (path length = 1) get the same DIRECT_PEER tag whichever IXP they're learned from, so the tiebreak is done using other criteria (like peer uptime).

This cancels the effect of PEERING_ROUTE_PRIMARY.
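To make the tie concrete, here is a minimal Python sketch of the current ordering. It is not the router config: the `tag` and `winners` helpers are hypothetical, only IXP-learned routes are modelled, and priority is represented by position in the list rather than by the real local-pref values.

```python
# Current order from this task, lowest to highest priority.
CURRENT_ORDER = [
    "PEERING_ROUTE",
    "PEER_CUSTOMER",
    "PEERING_ROUTE_PRIMARY",
    "PEER_PRIVATE_PEER",
    "DIRECT_PEER",
]


def tag(as_path_length, primary_ixp):
    """Simplified tagging (assumption): only IXP-learned routes are modelled."""
    if as_path_length == 1:
        return "DIRECT_PEER"  # same tag on the primary and the secondary IXP
    return "PEERING_ROUTE_PRIMARY" if primary_ixp else "PEERING_ROUTE"


def winners(paths):
    """Return the IXPs whose paths share the highest priority.

    paths: list of (ixp_name, as_path_length, primary_ixp) tuples.
    More than one winner means the router falls back to other tiebreakers.
    """
    ranks = {name: CURRENT_ORDER.index(tag(length, primary))
             for name, length, primary in paths}
    best = max(ranks.values())
    return sorted(name for name, rank in ranks.items() if rank == best)


# Path length > 1: the primary IXP wins outright.
print(winners([("primary", 2, True), ("secondary", 2, False)]))
# Path length = 1: both IXPs tie on DIRECT_PEER.
print(winners([("primary", 1, True), ("secondary", 1, False)]))
```

The second call returns both IXPs, which is the case where something like peer uptime ends up deciding.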

I see 2 possible fixes:

  1. Roll back the change done in T280054 (remove DIRECT_PEER) - helps keep a leaner config - not ideal as it means sub-optimal routing
  2. Introduce another criterion (DIRECT_PEER_PRIMARY) with a local-pref of 285, applied on prefixes learned from the primary IXPs - preferred option as it solves the issue, at the cost of a slightly more complex config

A last, "3rd" option is to keep the current behavior. Not ideal either.
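As a rough illustration of option 2, the Python sketch below adds DIRECT_PEER_PRIMARY to the ordering. It assumes the new tag ranks just above DIRECT_PEER (the task only gives its local-pref, 285), models priority by list position instead of real local-pref values, and uses hypothetical helper names (`tag_option_2`, `preferred_ixps`).

```python
# Order with option 2 applied, lowest to highest priority.
# Assumption: DIRECT_PEER_PRIMARY sits just above DIRECT_PEER.
OPTION_2_ORDER = [
    "PEERING_ROUTE",
    "PEER_CUSTOMER",
    "PEERING_ROUTE_PRIMARY",
    "PEER_PRIVATE_PEER",
    "DIRECT_PEER",
    "DIRECT_PEER_PRIMARY",
]


def tag_option_2(as_path_length, primary_ixp):
    """Simplified tagging (assumption): only IXP-learned routes are modelled."""
    if as_path_length == 1:
        return "DIRECT_PEER_PRIMARY" if primary_ixp else "DIRECT_PEER"
    return "PEERING_ROUTE_PRIMARY" if primary_ixp else "PEERING_ROUTE"


def preferred_ixps(paths):
    """Return the IXPs whose paths share the highest priority.

    paths: list of (ixp_name, as_path_length, primary_ixp) tuples.
    """
    ranks = {name: OPTION_2_ORDER.index(tag_option_2(length, primary))
             for name, length, primary in paths}
    best = max(ranks.values())
    return sorted(name for name, rank in ranks.items() if rank == best)


# A directly connected network seen on both IXPs now prefers the primary one.
print(preferred_ixps([("primary", 1, True), ("secondary", 1, False)]))
# A direct path on the secondary IXP still beats an indirect path on the primary.
print(preferred_ixps([("primary", 2, True), ("secondary", 1, False)]))
```

The second call shows the trade-off kept from T280054: being directly connected still outranks the primary/secondary distinction.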

If we go with option 2, we need to watch for traffic shifts and the risk of saturation.

We could also use this task to start using as-path-calc-length, as mentioned in {T280054#7018736}, now that the routers are running a more recent code version.

Event Timeline

ayounsi triaged this task as High priority. Jun 6 2023, 6:16 AM
ayounsi created this task.
Restricted Application added a subscriber: Aklapper.

Thanks for this one. While not ideal I think probably option 2 / adding DIRECT_PEER_PRIMARY is gonna be best.

It's getting a little complex, but at least it's consistent everywhere, and it's best to have the most optimal routing.

Change 929301 had a related patch set uploaded (by Ayounsi; author: Ayounsi):

[operations/homer/public@master] Prioritize direct peers connected to primary IXP

https://gerrit.wikimedia.org/r/929301

Change 929301 merged by jenkins-bot:

[operations/homer/public@master] Prioritize direct peers connected to primary IXP

https://gerrit.wikimedia.org/r/929301

ayounsi claimed this task.

Tested in eqsin, traffic is now balanced more equally between all 3 IXPs. Same for ulsfo.