Our BGP confederation peerings and policies have been a bit inconsistent since we set them up and we haven't invested much into them since the original deployment.
- The first major issue is that we haven't really thought through the tradeoffs between multihop and single-hop BGP peerings (multihop needs an almost arbitrary, hard-to-calculate max hop count), between peering over loopbacks and over directly connected interfaces, between an (almost) full mesh and sessions only between adjacent routers, and between using next-hop-self or not. There are pros and cons to each, and I don't believe we are consistent right now.
- The second issue is that our aggregates between sites need to be cleaned up a little bit to at least establish proper boundaries (e.g. each site's private IP space on cr* only with protocol direct, each site's private mgmt space on mr1* only with protocol direct).
- After that is done, we may or may not want to consider splitting our IGP to one per subAS -- there are pros and cons to each approach.
Fixing the above two issues would help with rerouting, link recovery and packet loss in the case of various fiber cuts across our US-wide network (also see T167306).
- Finally, the more user-visible issue that we have right now is that we're underutilizing eqord: we currently do not announce our supernets from eqord. The reason for this is that I hadn't found an easy way to guarantee that they wouldn't be announced if both eqiad<->eqord and eqord<->codfw were down, but eqord<->ulsfo and ulsfo<->codfw were up. The only solution that I could think of was splitting eqord into its own subAS and then doing a cross-subAS import policy with a ^65001 regexp.
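
To make the intent of the ^65001 idea concrete, here is a rough Python model of the decision (not router configuration). It assumes an AS path can be treated as a list of subAS numbers with the nearest neighbor first, that 65001 is the subAS eqord would peer with directly over its preferred links, and that 65003 stands in for ulsfo. The 65003 number, the function names and the mapping of subAS numbers to sites are all hypothetical and only serve the illustration.

```python
# Simplified model of a cross-subAS import policy using a "^65001" match.
# An AS path is a list of subAS numbers, nearest neighbor first.

DIRECT_SUBAS = 65001  # taken from the ^65001 regexp; which site(s) it covers is an assumption
ULSFO_SUBAS = 65003   # hypothetical number, for illustration only


def matches_caret_65001(as_path):
    """True if the path begins with 65001, i.e. the route was learned
    directly from a neighbor in that subAS."""
    return len(as_path) > 0 and as_path[0] == DIRECT_SUBAS


def should_announce_supernets(received_paths):
    """eqord would announce the supernets only while at least one
    received route passes the import policy."""
    return any(matches_caret_65001(path) for path in received_paths)


# Direct links up: one route arrives straight from 65001, another via ulsfo.
direct_links_up = [[DIRECT_SUBAS], [ULSFO_SUBAS, DIRECT_SUBAS]]

# Both direct links down: the only remaining route comes via ulsfo.
only_via_ulsfo = [[ULSFO_SUBAS, DIRECT_SUBAS]]

print(should_announce_supernets(direct_links_up))  # True
print(should_announce_supernets(only_via_ulsfo))   # False
```

In this model the route that detours through ulsfo never matches, so eqord would stop announcing the supernets as soon as its direct paths are gone, which is the guarantee described above.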