Description
Details
Subject | Repo | Branch | Lines +/-
---|---|---|---
Add looking glass CNAMEs | operations/dns | master | +4 -1
acme_chief: Issue birdlg certificate | operations/puppet | production | +7 -0
Bird-lg | operations/puppet | production | +455 -0
Related Objects
Event Timeline
That would be nice indeed, but needs a bit of work to do it properly:
- Opening up a web interface to our routers directly like many others do is a bad idea IMO, as it opens up an attack vector.
- We could set up a BGP instance on a separate server (e.g. using bird-lg) and peer with our routers. *However*, because we span two ASNs, one of which is a confederation with 2-3 sub-ASes, we'd need to set up multiple instances, which I'm guessing complicates things. It needs further research to see how easy it would be (help welcome :))
In the meantime, unlike most ASNs, we are generally responsive, reachable over IRC (both on our channels and other well-known networker channels) and we have a public bug tracker ;)
After looking at the various looking glasses, bird-lg indeed seems the best option (doesn't need SSH access to the routers, open source, user-friendly, supports multiple regions).
That's why I set up a POC at https://af-lg.wmflabs.org/. It only includes esams.
It consists of 4 main parts:
- cr2-esams peering with bird (with next hop self)
- A bird daemon receiving the full BGP view
- lgproxy.py, which talks to a single bird daemon; each "region" needs its own lgproxy.py. It's also the script that runs the traceroutes.
- lg.py, the web interface that relays queries to lgproxy.py
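To illustrate the last two parts, here is a minimal sketch of how a web frontend could map each region to its own proxy and build the request it relays. The `PROXIES` map, the hostnames, the port, the `/bird` and `/traceroute` paths, and the `proxy_url` helper are all assumptions made for illustration; this is not bird-lg's actual code or configuration.

```python
from urllib.parse import quote

# Hypothetical region -> proxy endpoint map (one lgproxy.py per region).
# Hostnames and port are placeholders, not real deployment values.
PROXIES = {
    "esams": "http://lgproxy-esams.example.org:5000",
    "eqiad": "http://lgproxy-eqiad.example.org:5000",
}

# Assumed command names: one for bird queries, one for traceroutes.
COMMANDS = ("bird", "traceroute")


def proxy_url(region: str, command: str, query: str) -> str:
    """Build the URL the frontend would request from a region's proxy."""
    if region not in PROXIES:
        raise KeyError(f"unknown region: {region}")
    if command not in COMMANDS:
        raise ValueError(f"unsupported command: {command}")
    # Percent-encode the whole query so spaces and slashes survive transport.
    return f"{PROXIES[region]}/{command}?q={quote(query, safe='')}"


if __name__ == "__main__":
    print(proxy_url("esams", "bird", "show route for 91.198.174.0/24"))
```

Keeping the region map in one place means adding a region is a one-line change on the frontend, while each proxy only ever talks to its local bird daemon.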
A few current caveats:
- As bird/bird-lg is running in eqiad, the traceroutes originate from eqiad. For an optimal deployment, we would need to have a bird/lgproxy.py instance in each region. Is there a host we could use for that?
- The routes include the mention "via 10.68.16.1 on eth0", as it's the route bird uses to reach the next hop ("BGP.next_hop: 91.198.174.244"). As it adds confusion, we could remove it in the code, like others do: http://lg.as5580.net/prefix_detail/all/ipv4?q=172.217.3.163
The next steps, if we want to move forward, would be to write init scripts for lg.py and lgproxy.py, puppetize bird/birdlg/apache, and have all routers peer with it.
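Removing the "via ... on ..." mention could be done by filtering the route lines before they are displayed. The sketch below assumes the mention always has the form `via <address> on <interface>`; the `strip_via` helper and the sample route line are hypothetical, and real bird output may differ.

```python
import re

# Matches the local forwarding hint, e.g. " via 10.68.16.1 on eth0".
# Assumption: the address and the interface name contain no whitespace.
VIA_RE = re.compile(r"\s+via\s+\S+\s+on\s+\S+")


def strip_via(line: str) -> str:
    """Drop the first 'via <address> on <interface>' mention from a route line."""
    return VIA_RE.sub("", line, count=1)


if __name__ == "__main__":
    # Hypothetical route line, for illustration only.
    sample = "172.217.3.0/24 via 10.68.16.1 on eth0 [cr2_esams 12:00:00] * (100) [AS15169i]"
    print(strip_via(sample))
```

Lines without the mention, such as the `BGP.next_hop` attribute line, pass through unchanged, so the real next hop stays visible.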
Change 390330 had a related patch set uploaded (by Ayounsi; owner: Ayounsi):
[operations/puppet@production] [WIP] Bird-lg
Gerrit change 390330 is up for review. @faidon? Or anyone else?
It will then need to be deployed on netmon1002/2001
Note that we peer with RIPE RIS collectors in our POPs, so people can use https://stat.ripe.net/widget/looking-glass as a looking glass.
Change 504233 had a related patch set uploaded (by Ayounsi; owner: Ayounsi):
[operations/dns@master] Add looking glass CNAMEs
Change 390330 had a related patch set uploaded (by Ayounsi; owner: Ayounsi):
[operations/puppet@production] Bird-lg
Change 504248 had a related patch set uploaded (by Vgutierrez; owner: Vgutierrez):
[operations/puppet@production] acme_chief: Issue birdlg certificate
Change 504248 abandoned by Ayounsi:
acme_chief: Issue birdlg certificate
Reason:
Not worth pursuing.
Change 504233 abandoned by Ayounsi:
Add looking glass CNAMEs
Reason:
Not worth pursuing.
The amount of work required to properly deploy a (multi-DC) looking glass is, so far, not worth the benefits of having and maintaining one.
- Peering with the RIPE RIS provides a looking glass for 3 of our 5 DCs
- As said by Faidon, we're very quick to reply to NOC email and IRC messages (usually less than 24h)
- No routing issue would have been resolved faster with a looking glass (as far as I know)
Mentioned in SAL (#wikimedia-operations) [2019-11-12T16:28:43Z] <XioNoX> setup bgp session from cr2-codfw to multihop RIS collector - T106056