It could be useful to proactively alert when a Linux server has a kernel panic. We could generate a simple Prometheus metric and then use Alertmanager to alert on it.
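As a rough illustration of the idea, the sketch below scans kernel log lines for panic-like messages and renders a gauge in the Prometheus text exposition format. It is not the script that was eventually merged: the metric name `node_kernel_panic_detected`, the function names, and the list of patterns are all assumptions made for this example.

```python
import re

# Hypothetical patterns; a real detector may match different kernel messages.
PANIC_PATTERNS = [
    re.compile(r"Kernel panic - not syncing"),
    re.compile(r"\bOops: "),
    re.compile(r"BUG: unable to handle"),
]

# Hypothetical metric name, chosen only for this sketch.
METRIC_NAME = "node_kernel_panic_detected"


def count_panic_lines(log_lines):
    """Return how many log lines match at least one panic pattern."""
    return sum(
        1 for line in log_lines
        if any(pattern.search(line) for pattern in PANIC_PATTERNS)
    )


def render_metric(log_lines):
    """Render a 0/1 gauge in the Prometheus text exposition format."""
    value = 1 if count_panic_lines(log_lines) > 0 else 0
    return (
        f"# HELP {METRIC_NAME} Whether panic messages were found in the kernel log.\n"
        f"# TYPE {METRIC_NAME} gauge\n"
        f"{METRIC_NAME} {value}\n"
    )
```

One possible way to use this would be to feed it the kernel log of the previous boot (for example the output of `journalctl -k -b -1`) and write the result to a `.prom` file in the directory read by the node_exporter textfile collector. An alerting rule could then fire whenever the gauge equals 1.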
Description
Details
| Status | Assigned | Task |
|---|---|---|
| Resolved | None | T209460 CloudVPS: network architecture |
| Resolved | aborrero | T270704 cloud: introduce new edge network architecture for eqiad1 and codfw1dev |
| Unknown Object (Task) | | |
| Resolved | aborrero | T376589 cloudgw1002: network interface problem |
| Resolved | aborrero | T376719 alerting: detect if a kernel had a panic |
| Resolved | aborrero | T380960 kernel error detector: have a way to ignore certain messages |
Event Timeline
Change #1078684 had a related patch set uploaded (by Arturo Borrero Gonzalez; author: Arturo Borrero Gonzalez):
[operations/puppet@production] prometheus: add kernel-panic detector
Change #1078684 merged by Arturo Borrero Gonzalez:
[operations/puppet@production] prometheus: add kernel-panic detector
Change #1078922 had a related patch set uploaded (by Arturo Borrero Gonzalez; author: Arturo Borrero Gonzalez):
[operations/alerts@master] team-wmcs: add kernel panic alerts
Change #1078922 merged by Arturo Borrero Gonzalez:
[operations/alerts@master] team-wmcs: add kernel panic alerts
Change #1078954 had a related patch set uploaded (by Arturo Borrero Gonzalez; author: Arturo Borrero Gonzalez):
[operations/puppet@production] wmcs: declare prometheus::node_kernel_panic in profile::base::cloud_production
Change #1078954 merged by Arturo Borrero Gonzalez:
[operations/puppet@production] wmcs: declare prometheus::node_kernel_panic in profile::base::cloud_production
Change #1079480 had a related patch set uploaded (by Arturo Borrero Gonzalez; author: Arturo Borrero Gonzalez):
[operations/puppet@production] prometheus-node-kernel-panic.sh: account for buster servers
Change #1079480 merged by Arturo Borrero Gonzalez:
[operations/puppet@production] prometheus-node-kernel-panic.sh: account for buster servers