Before we release this to "production" for SREs to use during on-call and incident response, we should have essential monitoring, alerting, and metrics in place.
Because this is a new service that is not yet in production, this is probably an excellent introductory task.