Common information
- dashboard: TODO
- description: The management interface at maps2009.mgmt:22 has been unresponsive for multiple hours.
- runbook: https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook
- summary: Unresponsive management for maps2009.mgmt:22
- alertname: ManagementSSHDown
- instance: maps2009.mgmt:22
- job: probes/mgmt
- module: ssh_banner
- prometheus: ops
- rack: B6
- severity: task
- site: codfw
- source: prometheus
- team: dcops
Firing alerts
- dashboard: TODO
- description: The management interface at maps2009.mgmt:22 has been unresponsive for multiple hours.
- runbook: https://wikitech.wikimedia.org/wiki/Dc-operations/Hardware_Troubleshooting_Runbook
- summary: Unresponsive management for maps2009.mgmt:22
- alertname: ManagementSSHDown
- instance: maps2009.mgmt:22
- job: probes/mgmt
- module: ssh_banner
- prometheus: ops
- rack: B6
- severity: task
- site: codfw
- source: prometheus
- team: dcops
- Source
Impact
Stale maps (codfw) data: This is the primary PostgreSQL server of our codfw maps infrastructure. The current side effect is that our maps data in codfw is not being updated. In other words, the service is functional but its data is stale.
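
To check by hand whether the management interface named in the alert is answering again, a minimal sketch of a banner probe follows. It assumes the ssh_banner module only verifies that the port returns an SSH identification string (a line starting with "SSH-"); the actual probe configuration is not shown in this task. The function name read_ssh_banner and its parameters are hypothetical, and the short hostname maps2009.mgmt may need a fully qualified name depending on where the check is run.

```python
import socket
from typing import Optional


def read_ssh_banner(
    host: str,
    port: int = 22,
    timeout: float = 5.0,
    max_lines: int = 10,
) -> Optional[str]:
    """Return the SSH identification string sent by host:port, or None.

    Hypothetical helper: it only reads what the server sends on connect
    and never attempts to authenticate.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with sock.makefile("rb") as stream:
                # A server may send a few other lines before its
                # identification string, so scan a bounded number of lines.
                for _ in range(max_lines):
                    line = stream.readline(256)
                    if not line:
                        break  # peer closed the connection
                    text = line.decode("ascii", errors="replace").strip()
                    if text.startswith("SSH-"):
                        return text
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return None
    return None


# Example call (assumes the name resolves from the machine running it):
# read_ssh_banner("maps2009.mgmt", 22)
```

A return value of None means the port did not produce a banner within the timeout, which matches the condition described in the alert. A returned string only shows that the interface is reachable over SSH; it does not by itself confirm that the interface is otherwise healthy.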