
ProbeDown - Search Platform still getting these tickets, we need this to stop
Closed, ResolvedPublic

Description

Common information

  • alertname: ProbeDown
  • instance: wdqs1015:443
  • job: probes/custom
  • prometheus: ops
  • severity: task
  • site: eqiad
  • source: prometheus
  • team: search-platform

Firing alerts



Event Timeline

bking renamed this task from ProbeDown to ProbeDown - Search Platform still getting these tickets, we need this to stop. Tue, Dec 3, 4:43 PM
bking changed the task status from Open to In Progress.
bking triaged this task as Low priority.

Search Platform shouldn't be getting these alerts anymore; Data Platform SRE should be the sole responder. I thought I fixed this in T379182, but it appears I need to take another look.

bking claimed this task.

I checked the alert files on the Prometheus hosts using cumin:

sudo cumin A:prometheus 'grep search-platform /srv/prometheus/ops/rules/alerts_query_wikidata_org_ldf.yml'

Based on that output, no remaining LDF alerts route to the Search Platform team. It's still possible that the Prometheus services need a restart or reload before the configuration is actually applied. For now, I'm going to assume that everything is fixed and close this ticket; we can revisit if the alerts come back.
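The check above only covers the LDF alert file. If the alerts do come back, a broader search across every rule file might help narrow down where the team label is still set. The following is a minimal sketch, not a tested procedure: the function name rules_mentioning_team is hypothetical, and it assumes the rule files are plain text under a single directory.

```shell
#!/bin/sh
# Hypothetical helper: list files under a rules directory that still
# mention a given team name anywhere in their contents.
# Usage: rules_mentioning_team <rules_dir> <team>
rules_mentioning_team() {
    rules_dir=$1
    team=$2
    # -r: recurse into the directory
    # -l: print only the names of matching files
    # -F: treat the team name as a fixed string, not a regex
    # grep exits non-zero when nothing matches; "|| true" keeps that
    # case from being treated as an error by the caller.
    grep -rlF -e "$team" "$rules_dir" || true
}
```

On a single Prometheus host this could be called as rules_mentioning_team /srv/prometheus/ops/rules search-platform, assuming the other rule files live alongside the LDF file in that directory. Empty output would mean no file in that directory mentions the team. It says nothing about whether the running Prometheus has reloaded its configuration.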