Common information
- alertname: PrometheusSeriesCountAnomalyHigh
- recorder: thanos-rule@main
- severity: task
- source: thanos
- team: o11y
Firing alerts
- dashboard: https://grafana.wikimedia.org/d/taff979/prometheus-tsdb-cardinality-monitoring?orgId=1&from=now-14d&to=now&timezone=utc&var-prometheus=analytics&var-site=codfw
- description: The total number of active series on analytics / codfw has been statistically anomalous for more than 2 hours (z-score: 23.13, threshold: 5). This is a sustained upward deviation from the 3-week baseline, suggesting a cardinality explosion rather than a transient spike. Investigate recently added or modified scrape targets and look for high-cardinality labels.
- runbook: https://wikitech.wikimedia.org/wiki/Prometheus#Runbooks
- summary: Anomalous increase in number of active series on analytics (codfw)
- alertname: PrometheusSeriesCountAnomalyHigh
- prometheus: analytics
- recorder: thanos-rule@main
- severity: task
- site: codfw
- source: thanos
- team: o11y
- dashboard: https://grafana.wikimedia.org/d/taff979/prometheus-tsdb-cardinality-monitoring?orgId=1&from=now-14d&to=now&timezone=utc&var-prometheus=ops&var-site=eqiad
- description: The total number of active series on ops / eqiad has been statistically anomalous for more than 2 hours (z-score: 8.898, threshold: 5). This is a sustained upward deviation from the 3-week baseline, suggesting a cardinality explosion rather than a transient spike. Investigate recently added or modified scrape targets and look for high-cardinality labels.
- runbook: https://wikitech.wikimedia.org/wiki/Prometheus#Runbooks
- summary: Anomalous increase in number of active series on ops (eqiad)
- alertname: PrometheusSeriesCountAnomalyHigh
- prometheus: ops
- recorder: thanos-rule@main
- severity: task
- site: eqiad
- source: thanos
- team: o11y
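
The descriptions above report a z-score against a threshold of 5, sustained for more than 2 hours over a 3-week baseline. The alerting rule's actual expression is not included in this notification, so the following is only a minimal sketch of how such a check could be reasoned about, assuming a plain z-score (current value minus baseline mean, divided by baseline standard deviation). The function names and the sample numbers are hypothetical and are not taken from the rule.

```python
from statistics import mean, pstdev


def z_score(current, baseline):
    """Return how many standard deviations `current` is above the baseline mean.

    `baseline` stands in for series-count samples from the lookback window
    (3 weeks in the alert description). Assumption: a plain population
    z-score; the real rule may compute it differently.
    """
    mu = mean(baseline)
    sigma = pstdev(baseline)
    if sigma == 0:
        # A perfectly flat baseline has no spread; avoid dividing by zero.
        return 0.0
    return (current - mu) / sigma


def is_sustained_anomaly(z_scores, threshold=5.0):
    """True only if every z-score in the window exceeds the threshold.

    `z_scores` stands in for the evaluations covering the "for" duration
    (more than 2 hours in the alert description), so a single transient
    spike does not qualify.
    """
    return len(z_scores) > 0 and all(z > threshold for z in z_scores)


if __name__ == "__main__":
    # Made-up sample data for illustration only.
    baseline = [100, 102, 98, 101, 99]
    recent = [z_score(v, baseline) for v in (110, 112, 115)]
    print(is_sustained_anomaly(recent))
```

Requiring every evaluation in the window to exceed the threshold is what separates a sustained deviation from a transient spike, which matches the distinction the alert description draws.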