Migrate Wikidata Grafana Alerts to Alertmanager
Open, Needs Triage, Public

Description

This is a bit of a stub ticket, added to give some structure. As per T371616#10751382, there may well be a number of alerts that used to exist in Grafana and that the Wikidata team was relying on, which have now been paused and will not alert. Any of those we wish to keep should move to the Alertmanager infrastructure.

Looking in the Wikidata "folder" on https://grafana-rw.wikimedia.org/alerting/list, I can see that the following alerts are paused. A decision should be made for each of them: either sunset it or migrate it.

  • API: Write modules execution time p95 is above 1 minute (for 2 minutes)
  • Delay injecting Recent Changes, aggregated across client wikis alert
  • Edits: below 30 per minute (for 3 minutes)
  • Max Lag above 10 for 1 hour
  • Number of rows in wb_changes table alert
  • Termbox Request Error alert
  • Wikidata Reliability Metrics - Median loading time alert
  • Wikidata Reliability Metrics - Median Payload alert
  • Wikidata Reliability Metrics - wbeditentity API: executeTiming alert
  • Wikidata Reliability Metrics - wbgetentities API: executeTiming alert
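
As a starting point for whichever alerts are kept, here is a minimal sketch of turning a paused alert title from the list above into a Prometheus-style alerting rule skeleton. Only the titles and their "for" durations come from this ticket. The function name `rule_skeleton`, the CamelCase naming scheme, the `team` and `severity` label values, and the `TODO` expression are all assumptions for illustration; the real metric queries would still need to be written per alert.

```python
import re

# Map the time units used in the alert titles to Prometheus duration suffixes.
_UNITS = {"minute": "m", "hour": "h"}

# Matches a trailing "for N minutes" / "(for N minutes)" / "for N hour" clause.
_FOR_RE = re.compile(r"\(?for (\d+) (minute|hour)s?\)?\s*$")


def rule_skeleton(title: str) -> dict:
    """Build a rule skeleton (as a plain dict) from a Grafana alert title.

    Hypothetical helper: the expression is left as a placeholder because
    the underlying metric names are not given in this ticket.
    """
    match = _FOR_RE.search(title)
    # Drop the duration clause (if any) before deriving a rule name.
    name_part = title[: match.start()] if match else title
    words = re.findall(r"[A-Za-z0-9]+", name_part)

    rule = {
        # Assumed naming scheme: CamelCase of the remaining title words.
        "alert": "".join(word.capitalize() for word in words),
        # Placeholder only; must be replaced with a real query.
        "expr": "TODO",
        # Assumed label values, not taken from the ticket.
        "labels": {"team": "wikidata", "severity": "warning"},
        "annotations": {"summary": title},
    }
    if match:
        # e.g. "3" + "minute" becomes "3m"
        rule["for"] = match.group(1) + _UNITS[match.group(2)]
    return rule


# Titles copied verbatim from the list of paused alerts.
edits_rule = rule_skeleton("Edits: below 30 per minute (for 3 minutes)")
lag_rule = rule_skeleton("Max Lag above 10 for 1 hour")
termbox_rule = rule_skeleton("Termbox Request Error alert")
```

Titles without an explicit duration, such as "Termbox Request Error alert", simply get no `for` key, so the pending period would have to be decided separately when the alert is migrated.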

Event Timeline

Tarrow renamed this task from Migrate Grafana Alerts to Alertmanager to Migrate Wikidata Grafana Alerts to Alertmanager. Jun 17 2025, 9:37 AM