BGP: prioritize directly connected peers
Closed, Resolved (Public)

Description

We prioritize peers in this order: transit < major peering < underdog peering.
As a result, we're running into cases (at least one) where a directly connected network is not preferred, which is suboptimal.
For example in eqsin:

182.79.252.0/24    *[BGP/170] 13w3d 20:56:25, localpref 270
                      AS path: 7473 9498 ?, validation-state: unknown
                    > to 103.102.166.145 via ae2.0
                    [BGP/170] 5w1d 20:38:53, localpref 270, from 103.102.166.130
                      AS path: 3491 9498 ?, validation-state: unknown
                    > to 103.102.166.141 via ae0.0
                    [BGP/170] 6w1d 14:52:09, localpref 250, from 27.111.228.122
                      AS path: 9498 ?, validation-state: unknown
                    > to 27.111.228.40 via xe-0/1/3.0

Airtel (AS9498) is reached through 7473 or 3491 in preference to the direct peering, because both of those routes carry a higher local-pref (270 vs 250).

We could add a term similar to (untested):

+       from {
+           as-path-calc-length 1 equal;
+       }
+       then {
+           local-preference 300;
+       }

So directly connected networks will always be prioritized.
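
To illustrate why the direct route loses today and what the proposed term changes, here is a minimal Python sketch of the selection logic. It is not Junos behaviour in full: it only models "highest local-pref, then shortest AS path" and ignores the later tie-breakers. The names `Route`, `best_route` and `prefer_direct` are hypothetical, and 300 is just the example value from the term above.

```python
from dataclasses import dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class Route:
    as_path: Tuple[int, ...]
    local_pref: int
    next_hop: str


def best_route(routes):
    # Simplified selection: highest local-pref first, then shortest AS path.
    # Real BGP applies further tie-breakers (origin, MED, router ID, ...).
    return min(routes, key=lambda r: (-r.local_pref, len(r.as_path)))


def prefer_direct(route, direct_pref=300):
    # Mirrors the proposed term: an AS path of length 1 gets a higher local-pref.
    if len(route.as_path) == 1:
        return replace(route, local_pref=direct_pref)
    return route


# The three routes for 182.79.252.0/24 from the eqsin output above.
routes = [
    Route((7473, 9498), 270, "103.102.166.145"),
    Route((3491, 9498), 270, "103.102.166.141"),
    Route((9498,), 250, "27.111.228.40"),
]

before = best_route(routes)
after = best_route([prefer_direct(r) for r in routes])

print("before:", before.as_path, before.local_pref)
print("after: ", after.as_path, after.local_pref)
```

Because local-pref is compared before AS path length, the shorter direct path never gets a chance to win until its local-pref is raised.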

Thoughts?

Event Timeline

ayounsi created this task.

The proposal seems fine to me. However, it would put these routes above PEER_INTERNAL, which is probably fine but feels wrong.

That said, I'm also curious why PEERING_ROUTE and PEERING_ROUTE_PRIMARY have different prefs to begin with. Answered (leaving for others): https://phabricator.wikimedia.org/T262517

See all our local-pref in https://wikitech.wikimedia.org/wiki/IP_and_AS_allocations#BGP_communities

I took 300 as an example, even though we don't use PEER_INTERNAL.

Instead I can use 280 for directly connected, and 290 for PEER_INTERNAL.

sounds good to me

Change 680980 had a related patch set uploaded (by Ayounsi; author: Ayounsi):

[operations/homer/public@master] BGP: prioritize directly connected peers

https://gerrit.wikimedia.org/r/680980

Mentioned in SAL (#wikimedia-operations) [2021-04-20T07:38:17Z] <XioNoX> BGP: prioritize directly connected peers - T280054

Pushed to eqsin and confirmed working as expected:

182.79.252.0/24    *[BGP/170] 00:00:56, localpref 280, from 27.111.228.122
                      AS path: 9498 ?, validation-state: unknown
                    > to 27.111.228.40 via xe-0/1/3.0
                    [BGP/170] 00:00:18, localpref 280, from 27.111.228.123
                      AS path: 9498 ?, validation-state: unknown
                    > to 27.111.228.40 via xe-0/1/3.0
                    [BGP/170] 2w2d 15:40:03, MED 100, localpref 250
                      AS path: 4637 9498 ?, validation-state: unknown
                    > to 27.111.228.4 via xe-0/1/3.0
                    [BGP/170] 14w6d 17:45:31, localpref 50
                      AS path: 1299 9498 ?, validation-state: unknown
                    > to 62.115.148.76 via xe-0/1/1.0

Change 680980 merged by Ayounsi:

[operations/homer/public@master] BGP: prioritize directly connected peers

https://gerrit.wikimedia.org/r/680980

Fun fact: as-path-calc-length is not available on Junos 17, while as-path-unique-count is present...

as-path-calc-length would have been the better choice, as it takes AS path prepending into consideration. But the current status quo is to ignore prepending anyway; for example, we prefer peering regardless.
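
A rough Python sketch of the difference between the two match conditions, under a simplifying assumption: the AS path is a flat list of ASNs (no AS sets or confederation segments, which Junos handles specially). The helper names `path_length` and `unique_count` are hypothetical stand-ins for what as-path-calc-length and as-path-unique-count compare against.

```python
def path_length(as_path):
    # Counts every hop, so prepended copies of the same ASN add to the length.
    return len(as_path)


def unique_count(as_path):
    # Counts distinct ASNs only, so prepending does not change the result.
    return len(set(as_path))


direct = [9498]
direct_prepended = [9498, 9498, 9498]  # same peer, prepending its own ASN
via_transit = [7473, 9498]

for path in (direct, direct_prepended, via_transit):
    print(path, "length:", path_length(path), "unique:", unique_count(path))
```

With a "length equals 1" match, a direct peer that prepends would fall out of the term; with a "unique count equals 1" match it still qualifies, which is in line with not taking prepending into account.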

Change 681297 had a related patch set uploaded (by Ayounsi; author: Ayounsi):

[operations/homer/public@master] Replace as-path-calc-length with as-path-unique-count

https://gerrit.wikimedia.org/r/681297

Change 681297 merged by jenkins-bot:

[operations/homer/public@master] Replace as-path-calc-length with as-path-unique-count

https://gerrit.wikimedia.org/r/681297

ayounsi claimed this task.

I explored a bit of the data in Turnilo, as we can now filter on community 14907:12.

The gain is not easy to estimate: it's not zero, but most likely quite small. For example, among the top 10 directly connected eqsin talkers, these 2 networks benefit from the change:

https://www.peeringdb.com/asn/7552
https://www.peeringdb.com/asn/17639

It also has the side advantage of protecting directly connected networks from BGP hijacking.

Overall, I do think it's worth keeping it. But I don't feel strongly about it.