Common information
- alertname: SystemdUnitDown
- cluster: wmcs
- job: node
- name: prometheus-node-kernel-messages.service
- prometheus: ops
- severity: critical
- site: eqiad
- source: prometheus
- state: failed
- team: wmcs
Firing alerts
- dashboard: https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudcephmon1005
- description: Unit prometheus-node-kernel-messages.service on node cloudcephmon1005 has been down for long.
- runbook: https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown
- summary: The systemd unit prometheus-node-kernel-messages.service on node cloudcephmon1005 has been failing for more than two hours.
- alertname: SystemdUnitDown
- cluster: wmcs
- instance: cloudcephmon1005:9100
- job: node
- name: prometheus-node-kernel-messages.service
- prometheus: ops
- severity: critical
- site: eqiad
- source: prometheus
- state: failed
- team: wmcs
- type: oneshot
- Source
- dashboard: https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudcephosd1001
- description: Unit prometheus-node-kernel-messages.service on node cloudcephosd1001 has been down for long.
- runbook: https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown
- summary: The systemd unit prometheus-node-kernel-messages.service on node cloudcephosd1001 has been failing for more than two hours.
- alertname: SystemdUnitDown
- cluster: wmcs
- instance: cloudcephosd1001:9100
- job: node
- name: prometheus-node-kernel-messages.service
- prometheus: ops
- severity: critical
- site: eqiad
- source: prometheus
- state: failed
- team: wmcs
- type: oneshot
- Source
- dashboard: https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudcephosd1002
- description: Unit prometheus-node-kernel-messages.service on node cloudcephosd1002 has been down for long.
- runbook: https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown
- summary: The systemd unit prometheus-node-kernel-messages.service on node cloudcephosd1002 has been failing for more than two hours.
- alertname: SystemdUnitDown
- cluster: wmcs
- instance: cloudcephosd1002:9100
- job: node
- name: prometheus-node-kernel-messages.service
- prometheus: ops
- severity: critical
- site: eqiad
- source: prometheus
- state: failed
- team: wmcs
- type: oneshot
- Source
- dashboard: https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudcephosd1003
- description: Unit prometheus-node-kernel-messages.service on node cloudcephosd1003 has been down for long.
- runbook: https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown
- summary: The systemd unit prometheus-node-kernel-messages.service on node cloudcephosd1003 has been failing for more than two hours.
- alertname: SystemdUnitDown
- cluster: wmcs
- instance: cloudcephosd1003:9100
- job: node
- name: prometheus-node-kernel-messages.service
- prometheus: ops
- severity: critical
- site: eqiad
- source: prometheus
- state: failed
- team: wmcs
- type: oneshot
- Source
- dashboard: https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudcephosd1004
- description: Unit prometheus-node-kernel-messages.service on node cloudcephosd1004 has been down for long.
- runbook: https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown
- summary: The systemd unit prometheus-node-kernel-messages.service on node cloudcephosd1004 has been failing for more than two hours.
- alertname: SystemdUnitDown
- cluster: wmcs
- instance: cloudcephosd1004:9100
- job: node
- name: prometheus-node-kernel-messages.service
- prometheus: ops
- severity: critical
- site: eqiad
- source: prometheus
- state: failed
- team: wmcs
- type: oneshot
- Source
- dashboard: https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudcephosd1006
- description: Unit prometheus-node-kernel-messages.service on node cloudcephosd1006 has been down for long.
- runbook: https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown
- summary: The systemd unit prometheus-node-kernel-messages.service on node cloudcephosd1006 has been failing for more than two hours.
- alertname: SystemdUnitDown
- cluster: wmcs
- instance: cloudcephosd1006:9100
- job: node
- name: prometheus-node-kernel-messages.service
- prometheus: ops
- severity: critical
- site: eqiad
- source: prometheus
- state: failed
- team: wmcs
- type: oneshot
- Source
- dashboard: https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudcephosd1009
- description: Unit prometheus-node-kernel-messages.service on node cloudcephosd1009 has been down for long.
- runbook: https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown
- summary: The systemd unit prometheus-node-kernel-messages.service on node cloudcephosd1009 has been failing for more than two hours.
- alertname: SystemdUnitDown
- cluster: wmcs
- instance: cloudcephosd1009:9100
- job: node
- name: prometheus-node-kernel-messages.service
- prometheus: ops
- severity: critical
- site: eqiad
- source: prometheus
- state: failed
- team: wmcs
- type: oneshot
- Source
- dashboard: https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudcephosd1010
- description: Unit prometheus-node-kernel-messages.service on node cloudcephosd1010 has been down for long.
- runbook: https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown
- summary: The systemd unit prometheus-node-kernel-messages.service on node cloudcephosd1010 has been failing for more than two hours.
- alertname: SystemdUnitDown
- cluster: wmcs
- instance: cloudcephosd1010:9100
- job: node
- name: prometheus-node-kernel-messages.service
- prometheus: ops
- severity: critical
- site: eqiad
- source: prometheus
- state: failed
- team: wmcs
- type: oneshot
- Source
- dashboard: https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudcephosd1011
- description: Unit prometheus-node-kernel-messages.service on node cloudcephosd1011 has been down for long.
- runbook: https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown
- summary: The systemd unit prometheus-node-kernel-messages.service on node cloudcephosd1011 has been failing for more than two hours.
- alertname: SystemdUnitDown
- cluster: wmcs
- instance: cloudcephosd1011:9100
- job: node
- name: prometheus-node-kernel-messages.service
- prometheus: ops
- severity: critical
- site: eqiad
- source: prometheus
- state: failed
- team: wmcs
- type: oneshot
- Source
- dashboard: https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudcephosd1012
- description: Unit prometheus-node-kernel-messages.service on node cloudcephosd1012 has been down for long.
- runbook: https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown
- summary: The systemd unit prometheus-node-kernel-messages.service on node cloudcephosd1012 has been failing for more than two hours.
- alertname: SystemdUnitDown
- cluster: wmcs
- instance: cloudcephosd1012:9100
- job: node
- name: prometheus-node-kernel-messages.service
- prometheus: ops
- severity: critical
- site: eqiad
- source: prometheus
- state: failed
- team: wmcs
- type: oneshot
- Source
- dashboard: https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudcephosd1013
- description: Unit prometheus-node-kernel-messages.service on node cloudcephosd1013 has been down for long.
- runbook: https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown
- summary: The systemd unit prometheus-node-kernel-messages.service on node cloudcephosd1013 has been failing for more than two hours.
- alertname: SystemdUnitDown
- cluster: wmcs
- instance: cloudcephosd1013:9100
- job: node
- name: prometheus-node-kernel-messages.service
- prometheus: ops
- severity: critical
- site: eqiad
- source: prometheus
- state: failed
- team: wmcs
- type: oneshot
- Source
- dashboard: https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudcephosd1015
- description: Unit prometheus-node-kernel-messages.service on node cloudcephosd1015 has been down for long.
- runbook: https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown
- summary: The systemd unit prometheus-node-kernel-messages.service on node cloudcephosd1015 has been failing for more than two hours.
- alertname: SystemdUnitDown
- cluster: wmcs
- instance: cloudcephosd1015:9100
- job: node
- name: prometheus-node-kernel-messages.service
- prometheus: ops
- severity: critical
- site: eqiad
- source: prometheus
- state: failed
- team: wmcs
- Source
- dashboard: https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudcephosd1016
- description: Unit prometheus-node-kernel-messages.service on node cloudcephosd1016 has been down for long.
- runbook: https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown
- summary: The systemd unit prometheus-node-kernel-messages.service on node cloudcephosd1016 has been failing for more than two hours.
- alertname: SystemdUnitDown
- cluster: wmcs
- instance: cloudcephosd1016:9100
- job: node
- name: prometheus-node-kernel-messages.service
- prometheus: ops
- severity: critical
- site: eqiad
- source: prometheus
- state: failed
- team: wmcs
- Source
- dashboard: https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudcephosd1018
- description: Unit prometheus-node-kernel-messages.service on node cloudcephosd1018 has been down for long.
- runbook: https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown
- summary: The systemd unit prometheus-node-kernel-messages.service on node cloudcephosd1018 has been failing for more than two hours.
- alertname: SystemdUnitDown
- cluster: wmcs
- instance: cloudcephosd1018:9100
- job: node
- name: prometheus-node-kernel-messages.service
- prometheus: ops
- severity: critical
- site: eqiad
- source: prometheus
- state: failed
- team: wmcs
- Source
- dashboard: https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudcephosd1019
- description: Unit prometheus-node-kernel-messages.service on node cloudcephosd1019 has been down for long.
- runbook: https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown
- summary: The systemd unit prometheus-node-kernel-messages.service on node cloudcephosd1019 has been failing for more than two hours.
- alertname: SystemdUnitDown
- cluster: wmcs
- instance: cloudcephosd1019:9100
- job: node
- name: prometheus-node-kernel-messages.service
- prometheus: ops
- severity: critical
- site: eqiad
- source: prometheus
- state: failed
- team: wmcs
- Source
- dashboard: https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudcephosd1020
- description: Unit prometheus-node-kernel-messages.service on node cloudcephosd1020 has been down for long.
- runbook: https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown
- summary: The systemd unit prometheus-node-kernel-messages.service on node cloudcephosd1020 has been failing for more than two hours.
- alertname: SystemdUnitDown
- cluster: wmcs
- instance: cloudcephosd1020:9100
- job: node
- name: prometheus-node-kernel-messages.service
- prometheus: ops
- severity: critical
- site: eqiad
- source: prometheus
- state: failed
- team: wmcs
- Source
- dashboard: https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudcephosd1021
- description: Unit prometheus-node-kernel-messages.service on node cloudcephosd1021 has been down for long.
- runbook: https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown
- summary: The systemd unit prometheus-node-kernel-messages.service on node cloudcephosd1021 has been failing for more than two hours.
- alertname: SystemdUnitDown
- cluster: wmcs
- instance: cloudcephosd1021:9100
- job: node
- name: prometheus-node-kernel-messages.service
- prometheus: ops
- severity: critical
- site: eqiad
- source: prometheus
- state: failed
- team: wmcs
- type: oneshot
- Source
- dashboard: https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudcephosd1022
- description: Unit prometheus-node-kernel-messages.service on node cloudcephosd1022 has been down for long.
- runbook: https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown
- summary: The systemd unit prometheus-node-kernel-messages.service on node cloudcephosd1022 has been failing for more than two hours.
- alertname: SystemdUnitDown
- cluster: wmcs
- instance: cloudcephosd1022:9100
- job: node
- name: prometheus-node-kernel-messages.service
- prometheus: ops
- severity: critical
- site: eqiad
- source: prometheus
- state: failed
- team: wmcs
- Source
- dashboard: https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudcephosd1024
- description: Unit prometheus-node-kernel-messages.service on node cloudcephosd1024 has been down for long.
- runbook: https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown
- summary: The systemd unit prometheus-node-kernel-messages.service on node cloudcephosd1024 has been failing for more than two hours.
- alertname: SystemdUnitDown
- cluster: wmcs
- instance: cloudcephosd1024:9100
- job: node
- name: prometheus-node-kernel-messages.service
- prometheus: ops
- severity: critical
- site: eqiad
- source: prometheus
- state: failed
- team: wmcs
- Source
- dashboard: https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudcephosd1026
- description: Unit prometheus-node-kernel-messages.service on node cloudcephosd1026 has been down for long.
- runbook: https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown
- summary: The systemd unit prometheus-node-kernel-messages.service on node cloudcephosd1026 has been failing for more than two hours.
- alertname: SystemdUnitDown
- cluster: wmcs
- instance: cloudcephosd1026:9100
- job: node
- name: prometheus-node-kernel-messages.service
- prometheus: ops
- severity: critical
- site: eqiad
- source: prometheus
- state: failed
- team: wmcs
- type: oneshot
- Source
- dashboard: https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudcephosd1029
- description: Unit prometheus-node-kernel-messages.service on node cloudcephosd1029 has been down for long.
- runbook: https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown
- summary: The systemd unit prometheus-node-kernel-messages.service on node cloudcephosd1029 has been failing for more than two hours.
- alertname: SystemdUnitDown
- cluster: wmcs
- instance: cloudcephosd1029:9100
- job: node
- name: prometheus-node-kernel-messages.service
- prometheus: ops
- severity: critical
- site: eqiad
- source: prometheus
- state: failed
- team: wmcs
- type: oneshot
- Source
- dashboard: https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudcephosd1031
- description: Unit prometheus-node-kernel-messages.service on node cloudcephosd1031 has been down for long.
- runbook: https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown
- summary: The systemd unit prometheus-node-kernel-messages.service on node cloudcephosd1031 has been failing for more than two hours.
- alertname: SystemdUnitDown
- cluster: wmcs
- instance: cloudcephosd1031:9100
- job: node
- name: prometheus-node-kernel-messages.service
- prometheus: ops
- severity: critical
- site: eqiad
- source: prometheus
- state: failed
- team: wmcs
- type: oneshot
- Source
- dashboard: https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudcephosd1032
- description: Unit prometheus-node-kernel-messages.service on node cloudcephosd1032 has been down for long.
- runbook: https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown
- summary: The systemd unit prometheus-node-kernel-messages.service on node cloudcephosd1032 has been failing for more than two hours.
- alertname: SystemdUnitDown
- cluster: wmcs
- instance: cloudcephosd1032:9100
- job: node
- name: prometheus-node-kernel-messages.service
- prometheus: ops
- severity: critical
- site: eqiad
- source: prometheus
- state: failed
- team: wmcs
- type: oneshot
- Source
- dashboard: https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudcephosd1033
- description: Unit prometheus-node-kernel-messages.service on node cloudcephosd1033 has been down for long.
- runbook: https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown
- summary: The systemd unit prometheus-node-kernel-messages.service on node cloudcephosd1033 has been failing for more than two hours.
- alertname: SystemdUnitDown
- cluster: wmcs
- instance: cloudcephosd1033:9100
- job: node
- name: prometheus-node-kernel-messages.service
- prometheus: ops
- severity: critical
- site: eqiad
- source: prometheus
- state: failed
- team: wmcs
- type: oneshot
- Source
- dashboard: https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudcephosd1034
- description: Unit prometheus-node-kernel-messages.service on node cloudcephosd1034 has been down for long.
- runbook: https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown
- summary: The systemd unit prometheus-node-kernel-messages.service on node cloudcephosd1034 has been failing for more than two hours.
- alertname: SystemdUnitDown
- cluster: wmcs
- instance: cloudcephosd1034:9100
- job: node
- name: prometheus-node-kernel-messages.service
- prometheus: ops
- severity: critical
- site: eqiad
- source: prometheus
- state: failed
- team: wmcs
- type: oneshot
- Source
- dashboard: https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudcephosd1035
- description: Unit prometheus-node-kernel-messages.service on node cloudcephosd1035 has been down for long.
- runbook: https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown
- summary: The systemd unit prometheus-node-kernel-messages.service on node cloudcephosd1035 has been failing for more than two hours.
- alertname: SystemdUnitDown
- cluster: wmcs
- instance: cloudcephosd1035:9100
- job: node
- name: prometheus-node-kernel-messages.service
- prometheus: ops
- severity: critical
- site: eqiad
- source: prometheus
- state: failed
- team: wmcs
- type: oneshot
- Source
- dashboard: https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudcephosd1037
- description: Unit prometheus-node-kernel-messages.service on node cloudcephosd1037 has been down for long.
- runbook: https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown
- summary: The systemd unit prometheus-node-kernel-messages.service on node cloudcephosd1037 has been failing for more than two hours.
- alertname: SystemdUnitDown
- cluster: wmcs
- instance: cloudcephosd1037:9100
- job: node
- name: prometheus-node-kernel-messages.service
- prometheus: ops
- severity: critical
- site: eqiad
- source: prometheus
- state: failed
- team: wmcs
- type: oneshot
- Source
- dashboard: https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudcephosd1038
- description: Unit prometheus-node-kernel-messages.service on node cloudcephosd1038 has been down for long.
- runbook: https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown
- summary: The systemd unit prometheus-node-kernel-messages.service on node cloudcephosd1038 has been failing for more than two hours.
- alertname: SystemdUnitDown
- cluster: wmcs
- instance: cloudcephosd1038:9100
- job: node
- name: prometheus-node-kernel-messages.service
- prometheus: ops
- severity: critical
- site: eqiad
- source: prometheus
- state: failed
- team: wmcs
- type: oneshot
- Source
- dashboard: https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudcephosd1039
- description: Unit prometheus-node-kernel-messages.service on node cloudcephosd1039 has been down for long.
- runbook: https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown
- summary: The systemd unit prometheus-node-kernel-messages.service on node cloudcephosd1039 has been failing for more than two hours.
- alertname: SystemdUnitDown
- cluster: wmcs
- instance: cloudcephosd1039:9100
- job: node
- name: prometheus-node-kernel-messages.service
- prometheus: ops
- severity: critical
- site: eqiad
- source: prometheus
- state: failed
- team: wmcs
- type: oneshot
- Source
- dashboard: https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudcephosd1040
- description: Unit prometheus-node-kernel-messages.service on node cloudcephosd1040 has been down for long.
- runbook: https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown
- summary: The systemd unit prometheus-node-kernel-messages.service on node cloudcephosd1040 has been failing for more than two hours.
- alertname: SystemdUnitDown
- cluster: wmcs
- instance: cloudcephosd1040:9100
- job: node
- name: prometheus-node-kernel-messages.service
- prometheus: ops
- severity: critical
- site: eqiad
- source: prometheus
- state: failed
- team: wmcs
- type: oneshot
- Source
- dashboard: https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudcephosd1041
- description: Unit prometheus-node-kernel-messages.service on node cloudcephosd1041 has been down for long.
- runbook: https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown
- summary: The systemd unit prometheus-node-kernel-messages.service on node cloudcephosd1041 has been failing for more than two hours.
- alertname: SystemdUnitDown
- cluster: wmcs
- instance: cloudcephosd1041:9100
- job: node
- name: prometheus-node-kernel-messages.service
- prometheus: ops
- severity: critical
- site: eqiad
- source: prometheus
- state: failed
- team: wmcs
- type: oneshot
- Source
- dashboard: https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudgw1001
- description: Unit prometheus-node-kernel-messages.service on node cloudgw1001 has been down for long.
- runbook: https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown
- summary: The systemd unit prometheus-node-kernel-messages.service on node cloudgw1001 has been failing for more than two hours.
- alertname: SystemdUnitDown
- cluster: wmcs
- instance: cloudgw1001:9100
- job: node
- name: prometheus-node-kernel-messages.service
- prometheus: ops
- severity: critical
- site: eqiad
- source: prometheus
- state: failed
- team: wmcs
- type: oneshot
- Source
- dashboard: https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudgw1002
- description: Unit prometheus-node-kernel-messages.service on node cloudgw1002 has been down for long.
- runbook: https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown
- summary: The systemd unit prometheus-node-kernel-messages.service on node cloudgw1002 has been failing for more than two hours.
- alertname: SystemdUnitDown
- cluster: wmcs
- instance: cloudgw1002:9100
- job: node
- name: prometheus-node-kernel-messages.service
- prometheus: ops
- severity: critical
- site: eqiad
- source: prometheus
- state: failed
- team: wmcs
- type: oneshot
- Source
- dashboard: https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudlb1002
- description: Unit prometheus-node-kernel-messages.service on node cloudlb1002 has been down for long.
- runbook: https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown
- summary: The systemd unit prometheus-node-kernel-messages.service on node cloudlb1002 has been failing for more than two hours.
- alertname: SystemdUnitDown
- cluster: wmcs
- instance: cloudlb1002:9100
- job: node
- name: prometheus-node-kernel-messages.service
- prometheus: ops
- severity: critical
- site: eqiad
- source: prometheus
- state: failed
- team: wmcs
- type: oneshot
- Source
- dashboard: https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudrabbit1001
- description: Unit prometheus-node-kernel-messages.service on node cloudrabbit1001 has been down for long.
- runbook: https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown
- summary: The systemd unit prometheus-node-kernel-messages.service on node cloudrabbit1001 has been failing for more than two hours.
- alertname: SystemdUnitDown
- cluster: wmcs
- instance: cloudrabbit1001:9100
- job: node
- name: prometheus-node-kernel-messages.service
- prometheus: ops
- severity: critical
- site: eqiad
- source: prometheus
- state: failed
- team: wmcs
- type: oneshot
- Source
- dashboard: https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudrabbit1003
- description: Unit prometheus-node-kernel-messages.service on node cloudrabbit1003 has been down for long.
- runbook: https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown
- summary: The systemd unit prometheus-node-kernel-messages.service on node cloudrabbit1003 has been failing for more than two hours.
- alertname: SystemdUnitDown
- cluster: wmcs
- instance: cloudrabbit1003:9100
- job: node
- name: prometheus-node-kernel-messages.service
- prometheus: ops
- severity: critical
- site: eqiad
- source: prometheus
- state: failed
- team: wmcs
- type: oneshot
- Source
- dashboard: https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudservices1005
- description: Unit prometheus-node-kernel-messages.service on node cloudservices1005 has been down for long.
- runbook: https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown
- summary: The systemd unit prometheus-node-kernel-messages.service on node cloudservices1005 has been failing for more than two hours.
- alertname: SystemdUnitDown
- cluster: wmcs
- instance: cloudservices1005:9100
- job: node
- name: prometheus-node-kernel-messages.service
- prometheus: ops
- severity: critical
- site: eqiad
- source: prometheus
- state: failed
- team: wmcs
- type: oneshot
- Source
- dashboard: https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudvirt1031
- description: Unit prometheus-node-kernel-messages.service on node cloudvirt1031 has been down for long.
- runbook: https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown
- summary: The systemd unit prometheus-node-kernel-messages.service on node cloudvirt1031 has been failing for more than two hours.
- alertname: SystemdUnitDown
- cluster: wmcs
- instance: cloudvirt1031:9100
- job: node
- name: prometheus-node-kernel-messages.service
- prometheus: ops
- severity: critical
- site: eqiad
- source: prometheus
- state: failed
- team: wmcs
- type: oneshot
- Source
- dashboard: https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudvirt1033
- description: Unit prometheus-node-kernel-messages.service on node cloudvirt1033 has been down for long.
- runbook: https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown
- summary: The systemd unit prometheus-node-kernel-messages.service on node cloudvirt1033 has been failing for more than two hours.
- alertname: SystemdUnitDown
- cluster: wmcs
- instance: cloudvirt1033:9100
- job: node
- name: prometheus-node-kernel-messages.service
- prometheus: ops
- severity: critical
- site: eqiad
- source: prometheus
- state: failed
- team: wmcs
- type: oneshot
- Source
- dashboard: https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudvirt1035
- description: Unit prometheus-node-kernel-messages.service on node cloudvirt1035 has been down for long.
- runbook: https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown
- summary: The systemd unit prometheus-node-kernel-messages.service on node cloudvirt1035 has been failing for more than two hours.
- alertname: SystemdUnitDown
- cluster: wmcs
- instance: cloudvirt1035:9100
- job: node
- name: prometheus-node-kernel-messages.service
- prometheus: ops
- severity: critical
- site: eqiad
- source: prometheus
- state: failed
- team: wmcs
- type: oneshot
- Source
- dashboard: https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudvirt1042
- description: Unit prometheus-node-kernel-messages.service on node cloudvirt1042 has been down for long.
- runbook: https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown
- summary: The systemd unit prometheus-node-kernel-messages.service on node cloudvirt1042 has been failing for more than two hours.
- alertname: SystemdUnitDown
- cluster: wmcs
- instance: cloudvirt1042:9100
- job: node
- name: prometheus-node-kernel-messages.service
- prometheus: ops
- severity: critical
- site: eqiad
- source: prometheus
- state: failed
- team: wmcs
- type: oneshot
- Source
- dashboard: https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudvirt1048
- description: Unit prometheus-node-kernel-messages.service on node cloudvirt1048 has been down for long.
- runbook: https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown
- summary: The systemd unit prometheus-node-kernel-messages.service on node cloudvirt1048 has been failing for more than two hours.
- alertname: SystemdUnitDown
- cluster: wmcs
- instance: cloudvirt1048:9100
- job: node
- name: prometheus-node-kernel-messages.service
- prometheus: ops
- severity: critical
- site: eqiad
- source: prometheus
- state: failed
- team: wmcs
- type: oneshot
- Source
- dashboard: https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudvirt1052
- description: Unit prometheus-node-kernel-messages.service on node cloudvirt1052 has been down for long.
- runbook: https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown
- summary: The systemd unit prometheus-node-kernel-messages.service on node cloudvirt1052 has been failing for more than two hours.
- alertname: SystemdUnitDown
- cluster: wmcs
- instance: cloudvirt1052:9100
- job: node
- name: prometheus-node-kernel-messages.service
- prometheus: ops
- severity: critical
- site: eqiad
- source: prometheus
- state: failed
- team: wmcs
- type: oneshot
- Source
- dashboard: https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudvirtlocal1002
- description: Unit prometheus-node-kernel-messages.service on node cloudvirtlocal1002 has been down for long.
- runbook: https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown
- summary: The systemd unit prometheus-node-kernel-messages.service on node cloudvirtlocal1002 has been failing for more than two hours.
- alertname: SystemdUnitDown
- cluster: wmcs
- instance: cloudvirtlocal1002:9100
- job: node
- name: prometheus-node-kernel-messages.service
- prometheus: ops
- severity: critical
- site: eqiad
- source: prometheus
- state: failed
- team: wmcs
- type: oneshot
- Source
- dashboard: https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudvirtlocal1003
- description: Unit prometheus-node-kernel-messages.service on node cloudvirtlocal1003 has been down for long.
- runbook: https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown
- summary: The systemd unit prometheus-node-kernel-messages.service on node cloudvirtlocal1003 has been failing for more than two hours.
- alertname: SystemdUnitDown
- cluster: wmcs
- instance: cloudvirtlocal1003:9100
- job: node
- name: prometheus-node-kernel-messages.service
- prometheus: ops
- severity: critical
- site: eqiad
- source: prometheus
- state: failed
- team: wmcs
- type: oneshot
- Source
- dashboard: https://grafana.wikimedia.org/d/000000377/host-overview?orgId=1&var-server=cloudweb1003
- description: Unit prometheus-node-kernel-messages.service on node cloudweb1003 has been down for long.
- runbook: https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/Runbooks/SystemdUnitDown
- summary: The systemd unit prometheus-node-kernel-messages.service on node cloudweb1003 has been failing for more than two hours.
- alertname: SystemdUnitDown
- cluster: wmcs
- instance: cloudweb1003:9100
- job: node
- name: prometheus-node-kernel-messages.service
- prometheus: ops
- severity: critical
- site: eqiad
- source: prometheus
- state: failed
- team: wmcs
- type: oneshot
- Source
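With this many firing alerts for the same unit, a list of affected hosts grouped by role may be easier to scan than the raw entries. The following is a minimal sketch, not part of the alerting tooling: the function names `affected_hosts` and `group_by_role` are hypothetical, and it assumes every alert carries an `- instance: host:port` line shaped like the ones above. The sample holds only a few lines copied from this notification.

```python
import re
from collections import defaultdict

# A few lines copied from the notification above; in practice the whole
# notification text would be passed in instead.
SAMPLE = """\
- instance: cloudcephmon1005:9100
- state: failed
- instance: cloudcephosd1001:9100
- instance: cloudcephosd1002:9100
- instance: cloudgw1001:9100
- Source
"""


def affected_hosts(text):
    """Return hostnames from '- instance: host:port' lines, in first-seen order."""
    hosts = []
    for line in text.splitlines():
        match = re.match(r"^\s*-\s*instance:\s*([^:\s]+)(?::\d+)?\s*$", line)
        if match and match.group(1) not in hosts:
            hosts.append(match.group(1))
    return hosts


def group_by_role(hosts):
    """Group hostnames by their leading non-digit prefix, e.g. 'cloudcephosd'."""
    groups = defaultdict(list)
    for host in hosts:
        prefix = re.match(r"^\D+", host)
        groups[prefix.group(0) if prefix else host].append(host)
    return dict(groups)


if __name__ == "__main__":
    for role, members in group_by_role(affected_hosts(SAMPLE)).items():
        print(f"{role}: {len(members)} host(s): {', '.join(members)}")
```

Grouping by the hostname prefix is only a convenience for reading the list; it says nothing about why the unit failed on each host, which the linked runbook covers.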