Background
There are several outstanding questions relating to the WAN configuration between WMF POPs.
In the recent past we've made at least 2 changes to our protocol configuration to address particular niggles that cropped up, for instance:
| T295672 | Use next-hop-self on iBGP peerings to deal with the dual Equinix IXP ports in eqiad |
| T283163 | Copied IGP cost to BGP MED to overcome the fact that sub-AS path length is not considered in BGP path selection |
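To illustrate the second change, here is a minimal sketch of why MED ends up as the deciding factor inside a confederation. It is not our router configuration and covers only two steps of BGP best-path selection (AS-path length, then MED). It assumes, per RFC 5065, that confederation segments are not counted towards AS-path length, and it assumes MED is comparable between the two candidates. The ASNs, route names and MED values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    # kind is one of: "AS_SEQUENCE", "AS_SET",
    # "AS_CONFED_SEQUENCE", "AS_CONFED_SET"
    kind: str
    asns: tuple


@dataclass(frozen=True)
class Route:
    name: str
    as_path: tuple  # tuple of Segment
    med: int


def as_path_length(route):
    """AS-path length as used for path comparison.

    Confederation segments contribute nothing (RFC 5065), an AS_SET
    counts as one hop, and an AS_SEQUENCE counts one hop per ASN.
    """
    length = 0
    for seg in route.as_path:
        if seg.kind == "AS_SEQUENCE":
            length += len(seg.asns)
        elif seg.kind == "AS_SET":
            length += 1
    return length


def best_route(routes):
    """Simplified selection: shortest AS path first, then lowest MED.

    Assumption: local-pref, origin and the later tie-breakers are
    ignored, and MED is treated as always comparable.
    """
    return min(routes, key=lambda r: (as_path_length(r), r.med))


# Hypothetical example: the same prefix learned over one sub-AS hop
# and over three sub-AS hops. MED carries the (made-up) IGP cost.
direct = Route("direct", (Segment("AS_CONFED_SEQUENCE", (64601,)),), med=10)
detour = Route(
    "detour", (Segment("AS_CONFED_SEQUENCE", (64603, 64602, 64601)),), med=30
)
```

Both routes have an effective AS-path length of zero, so without the MED values the choice would fall through to later tie-breakers; with IGP cost copied into MED, `best_route([detour, direct])` picks `direct`.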
Having discussed this with @ayounsi, we agree that while changes like the above may be required to address specific operational issues, we need to minimise tweaks made with only one issue, or one site, in mind. It is risky to adjust global protocol parameters without thoroughly considering the wider effects. We also don't want our protocol config/policy to become a complex beast built up through many minor tweaks over time.
Objective
Creating this task to track progress on a holistic review of our WAN configuration, with a goal of producing a new agreed design for the medium/long term.
To be clear, the process may result in very few changes; our current setup might be deemed best. Minimising change and disruption is certainly good, and "change for change's sake" is not what this task is about. Nevertheless, there may be improvements we could make.
Scope
To list off some random ideas to give a sense of what we could do:
- Devise a new methodology for setting OSPF link costs, to aid expressing intent in terms of path selection.
- Remove OSPF across the WAN, and replace the confederation configuration with full eBGP between sites.
  - This would make all inter-site routing decisions a matter of BGP policy.
- Keep OSPF, but still move to eBGP, with next-hops preserved.
  - This would bring AS-path length into consideration when selecting BGP paths.
- Move all networks, apart from link and loopback addresses, from OSPF to BGP, and use loopback addresses only as next-hops.
  - This may warrant introducing BGP multipath to properly load balance towards two remote CRs announcing the same prefix.
- Keep OSPF, but move the WAN transport to MPLS, likely SR-MPLS, to enable traffic-engineering of WAN paths according to policy.
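As a rough illustration of the first idea (a methodology for OSPF link costs), the sketch below derives each link's cost from its round-trip latency and then runs a shortest-path calculation to check which path the costs would actually produce. This is only one possible methodology, not an agreed one. The site names, latency figures, the `cost_from_latency` helper and its scaling factor are all hypothetical.

```python
import heapq


def cost_from_latency(rtt_ms, floor=1):
    """Hypothetical rule: OSPF cost = RTT in ms x 10, with a minimum of `floor`."""
    return max(floor, round(rtt_ms * 10))


def shortest_path(links, src, dst):
    """Dijkstra over undirected links given as {(a, b): cost}.

    Returns (total_cost, [nodes on path]), or None if dst is unreachable.
    """
    graph = {}
    for (a, b), cost in links.items():
        graph.setdefault(a, []).append((b, cost))
        graph.setdefault(b, []).append((a, cost))

    queue = [(0, src, [src])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == dst:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbour, link_cost in graph.get(node, []):
            if neighbour not in visited:
                heapq.heappush(
                    queue, (cost + link_cost, neighbour, path + [neighbour])
                )
    return None


# Made-up latencies between three placeholder sites.
rtt_ms = {("A", "B"): 30.0, ("B", "C"): 35.0, ("A", "C"): 80.0}
links = {pair: cost_from_latency(rtt) for pair, rtt in rtt_ms.items()}

print(shortest_path(links, "A", "C"))
```

With these made-up figures the two-hop path via B (cost 650) beats the direct A to C link (cost 800). A check like this lets us confirm that a proposed set of costs expresses the intended path selection before anything is changed on the routers.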
There are many more things we might consider; the above are only off-the-cuff ideas to give a sense of the scope of the task. We need to arrive at a design that provides deterministic and predictable path selection, and gives us sufficient control. That said, the paramount consideration should remain simplicity: we need to avoid adding layers of complexity just to have more "nerd knobs" to play with.