Common information
- dashboard: https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=labstore1007
- runbook: https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDownForLong
- alertname: SystemdUnitDownForLong
- cluster: wmcs
- instance: labstore1007:9100
- job: node
- prometheus: ops
- severity: task
- site: eqiad
- source: prometheus
- state: failed
- team: wmcs
Firing alerts
- dashboard: https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=labstore1007
- description: Unit ferm.service on node labstore1007 has been down for long.
- runbook: https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDownForLong
- summary: The systemd unit ferm.service on node labstore1007 has been failing for more than two hours.
- alertname: SystemdUnitDownForLong
- cluster: wmcs
- instance: labstore1007:9100
- job: node
- name: ferm.service
- prometheus: ops
- severity: task
- site: eqiad
- source: prometheus
- state: failed
- team: wmcs
- dashboard: https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=labstore1007
- description: Unit kiwix-mirror-update.service on node labstore1007 has been down for long.
- runbook: https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDownForLong
- summary: The systemd unit kiwix-mirror-update.service on node labstore1007 has been failing for more than two hours.
- alertname: SystemdUnitDownForLong
- cluster: wmcs
- instance: labstore1007:9100
- job: node
- name: kiwix-mirror-update.service
- prometheus: ops
- severity: task
- site: eqiad
- source: prometheus
- state: failed
- team: wmcs
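
The labels above can be turned into a Prometheus query to check whether a unit is still failed. The following is a minimal sketch, not the alert's actual rule: it assumes the alert is derived from the node_exporter metric `node_systemd_unit_state`, which is not stated in this notification, and `build_selector` is a hypothetical helper name.

```python
def build_selector(metric, labels):
    """Render a PromQL instant-vector selector from a metric name and label matchers."""
    parts = []
    for key in sorted(labels):
        # Escape backslashes and double quotes so the value is a valid PromQL string.
        value = labels[key].replace("\\", "\\\\").replace('"', '\\"')
        parts.append('%s="%s"' % (key, value))
    return metric + "{" + ",".join(parts) + "}"


# Assumption: node_systemd_unit_state is the underlying metric for this alert.
METRIC = "node_systemd_unit_state"

# Label values copied from the two firing alerts in this notification.
FIRING_UNITS = ["ferm.service", "kiwix-mirror-update.service"]

queries = [
    build_selector(
        METRIC,
        {"instance": "labstore1007:9100", "name": unit, "state": "failed"},
    )
    for unit in FIRING_UNITS
]

if __name__ == "__main__":
    for query in queries:
        print(query)
```

Each printed selector can be pasted into the `ops` Prometheus instance for `eqiad`; under the assumption above, a value of 1 would mean the unit is currently in the failed state.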