Today, when an ipsec tunnel goes down, a large number of host alerts fire. This happens because a single per-host icinga check covers multiple tunnels.
We should be able to move this to a prometheus check. At a high level, we would need to:
- Add ipsec tunnel status metrics to prometheus
- Alert on the aggregate ipsec status metrics (per-site)
- Phase out the host-based ipsec checks
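
As a rough illustration of the per-site aggregation idea, here is a minimal Python sketch. It is not the actual exporter or alert rule: the `TunnelSample` fields (`site`, `host`, `peer`, `up`), the function names, and the threshold are all hypothetical, standing in for whatever labels the real tunnel status metrics end up carrying. The point is only that many per-host tunnel failures collapse into one alert per site.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple


class TunnelSample(NamedTuple):
    """One tunnel status sample (hypothetical shape, not a real metric schema)."""
    site: str   # assumed label: the site the reporting host belongs to
    host: str   # host reporting the tunnel
    peer: str   # remote end of the tunnel
    up: bool    # True if the tunnel is established


def down_ratio_by_site(samples: Iterable[TunnelSample]) -> Dict[str, float]:
    """Return the fraction of tunnels that are down, keyed by site."""
    total: Dict[str, int] = defaultdict(int)
    down: Dict[str, int] = defaultdict(int)
    for sample in samples:
        total[sample.site] += 1
        if not sample.up:
            down[sample.site] += 1
    return {site: down[site] / total[site] for site in total}


def sites_to_alert(samples: Iterable[TunnelSample], threshold: float = 0.0) -> List[str]:
    """Return the sorted sites whose down ratio exceeds the (assumed) threshold."""
    ratios = down_ratio_by_site(samples)
    return sorted(site for site, ratio in ratios.items() if ratio > threshold)
```

With this shape, two hosts in the same site that each lose a tunnel produce a single entry for that site rather than one alert per host. The threshold would need tuning against real data; the default of `0.0` simply alerts on any down tunnel.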