The WMCS team wants to be paged for some issues detected by Prometheus instances in the cloud realm. Right now, the Alertmanager instance in the metricsinfra project cannot send pages. That Alertmanager instance is shared by multiple projects, some of which are not maintained by the WMCS team. Options include:
- Give the shared Alertmanager instance VictorOps API keys and let it send pages. Simple, but I guess a bit risky, since this instance is shared with some projects that don't have alerting support?
- Set up a separate Alertmanager instance in the cloud realm and send pages via it. A bit more secure, but this still puts VictorOps keys in the cloud realm. The silencing UX might be a bit annoying?
- Import information about the paging alerts into some Prometheus instance in the production realm, and then alert from there via the production Alertmanagers.
- Let the trusted Prometheus instances in the cloud realm talk to the production Alertmanagers?
- Something else?