Common information
- dashboard: https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status
- runbook: https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
- alertname: SystemdUnitFailed
- prometheus: ops
- severity: critical
- source: prometheus
- team: collaboration-services
Firing alerts
- dashboard: https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status
- description: backup-restore.service on gitlab1003:9100
- runbook: https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
- summary: backup-restore.service on gitlab1003:9100
- alertname: SystemdUnitFailed
- instance: gitlab1003:9100
- name: backup-restore.service
- prometheus: ops
- severity: critical
- site: eqiad
- source: prometheus
- team: collaboration-services
- Source
- dashboard: https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status
- description: backup-restore.service on gitlab1004:9100
- runbook: https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
- summary: backup-restore.service on gitlab1004:9100
- alertname: SystemdUnitFailed
- instance: gitlab1004:9100
- name: backup-restore.service
- prometheus: ops
- severity: critical
- site: eqiad
- source: prometheus
- team: collaboration-services
- Source
- dashboard: https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status
- description: discard_held_messages.service on lists2001:9100
- runbook: https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
- summary: discard_held_messages.service on lists2001:9100
- alertname: SystemdUnitFailed
- instance: lists2001:9100
- name: discard_held_messages.service
- prometheus: ops
- severity: critical
- site: codfw
- source: prometheus
- team: collaboration-services
- Source
- dashboard: https://grafana.wikimedia.org/d/g-AaZRFWk/systemd-status
- description: stewards_subscriber_data_sync.service on lists2001:9100
- runbook: https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
- summary: stewards_subscriber_data_sync.service on lists2001:9100
- alertname: SystemdUnitFailed
- instance: lists2001:9100
- name: stewards_subscriber_data_sync.service
- prometheus: ops
- severity: critical
- site: codfw
- source: prometheus
- team: collaboration-services
- Source