I noticed a PrometheusRuleEvaluationSlow alert for the prometheus-k8s envoy rule group.
Sure enough, this group's evaluation time exceeds one minute: https://grafana.wikimedia.org/goto/X98HYLpNR?orgId=1
| Author | Date |
|---|---|
| fgiunchedi | Mar 5 2025, 10:18 AM |

| File | Date |
|---|---|
| F58619817: 2025-03-06-155002_1886x1743_scrot.png | Mar 6 2025, 2:50 PM |
| F58608816: 2025-03-05-111834_3776x1772_scrot.png | Mar 5 2025, 10:18 AM |
| Subject | Repo | Branch | Lines +/- |
|---|---|---|---|
| prometheus: split envoy rules into separate groups | operations/puppet | production | +9 -1 |
| Status | Assigned | Task |
|---|---|---|
| Resolved | fgiunchedi | T385693 thanos-query overload due to heavy queries |
| Resolved | fgiunchedi | T387965 prometheus-k8s envoy rules slow evaluation |
Change #1124743 had a related patch set uploaded (by Filippo Giunchedi; author: Filippo Giunchedi):
[operations/puppet@production] prometheus: split envoy rules into separate groups
Change #1124743 merged by Filippo Giunchedi:
[operations/puppet@production] prometheus: split envoy rules into separate groups
After splitting into four groups, three of them take about 20s each to evaluate (the remaining one takes 1ms).
Not a "nail in the coffin" solution, but good enough for now.
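
For double-checking per-group evaluation times without the dashboard, here is a minimal sketch. It assumes the standard Prometheus HTTP API response shape for `/api/v1/rules`, where each group carries `name`, `interval` and `evaluationTime` (both in seconds). The helper name `slow_groups`, the `threshold_ratio` parameter, and the sample payload (group names and timings) are made up for illustration and are not taken from the patch or from production data.

```python
def slow_groups(rules_response, threshold_ratio=1.0):
    """Return (name, evaluation_time, interval) tuples for rule groups whose
    last evaluation took at least threshold_ratio * interval seconds,
    slowest first."""
    slow = []
    for group in rules_response["data"]["groups"]:
        eval_time = float(group.get("evaluationTime", 0.0))
        interval = float(group.get("interval", 0.0))
        # Skip groups without a usable interval rather than guessing one.
        if interval > 0 and eval_time >= threshold_ratio * interval:
            slow.append((group["name"], eval_time, interval))
    return sorted(slow, key=lambda entry: entry[1], reverse=True)


# Hypothetical payload: names and numbers are invented for illustration only.
SAMPLE = {
    "status": "success",
    "data": {
        "groups": [
            {"name": "envoy", "interval": 60, "evaluationTime": 65.0},
            {"name": "envoy_part1", "interval": 60, "evaluationTime": 20.0},
            {"name": "envoy_part2", "interval": 60, "evaluationTime": 0.001},
        ]
    },
}

if __name__ == "__main__":
    for name, took, interval in slow_groups(SAMPLE):
        print(f"{name}: last evaluation {took:.3f}s, interval {interval:.0f}s")
```

To run this against live data, fetch `/api/v1/rules` from the Prometheus instance in question and pass the parsed JSON to `slow_groups`. A lower `threshold_ratio` (say 0.25) flags groups that are getting close to their interval before the alert fires.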