This should include:
- Retrieve metrics on the Prometheus side (if anything is missing)
- Add alerts for "down" events, with runbooks
- Add a basic Grafana board with the "up/down" metric, to attach as the 'dashboard' on the alerts
- jobs-api (done)
- jobs-emailer
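
For reference, a "down" alert for one of the services could look roughly like the rule below. This is only a sketch under assumptions: it assumes the scrape job label matches the service name (`job="jobs-emailer"`), and the group name, alert name, `for` duration, severity, and both URLs are placeholders rather than anything already agreed on. `runbook_url` and `dashboard_url` are free-form annotation keys here, not fields Prometheus itself interprets.

```yaml
# Hypothetical Prometheus rule file; names, durations and URLs are placeholders.
groups:
  - name: jobs-availability
    rules:
      - alert: JobsEmailerDown
        # `up` is set to 0 by Prometheus when a scrape of the target fails.
        expr: up{job="jobs-emailer"} == 0
        # Only fire once the target has been down continuously for this long.
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "jobs-emailer is down"
          description: "Prometheus has failed to scrape {{ $labels.instance }} for 5 minutes."
          runbook_url: "https://example.com/runbooks/jobs-emailer-down"
          dashboard_url: "https://example.com/grafana/d/jobs-up-down"
```

The same `up{job="..."}` expression can back the Grafana up/down panel, so the alert and the board stay in sync.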