Common information
- dashboard: https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudbackup1003
- runbook: https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDownForLong
- alertname: SystemdUnitDownForLong
- cluster: wmcs
- instance: cloudbackup1003:9100
- job: node
- prometheus: ops
- severity: task
- site: eqiad
- source: prometheus
- state: failed
- team: wmcs
Firing alerts
- dashboard: https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudbackup1003
- description: Unit backup_glance_images.service on node cloudbackup1003 has been down for long.
- runbook: https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDownForLong
- summary: The systemd unit backup_glance_images.service on node cloudbackup1003 has been failing for more than two hours.
- alertname: SystemdUnitDownForLong
- cluster: wmcs
- instance: cloudbackup1003:9100
- job: node
- name: backup_glance_images.service
- prometheus: ops
- severity: task
- site: eqiad
- source: prometheus
- state: failed
- team: wmcs
- dashboard: https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudbackup1003
- description: Unit backup_vms.service on node cloudbackup1003 has been down for long.
- runbook: https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDownForLong
- summary: The systemd unit backup_vms.service on node cloudbackup1003 has been failing for more than two hours.
- alertname: SystemdUnitDownForLong
- cluster: wmcs
- instance: cloudbackup1003:9100
- job: node
- name: backup_vms.service
- prometheus: ops
- severity: task
- site: eqiad
- source: prometheus
- state: failed
- team: wmcs