Migrate eventlogging check_prometheus checks to alertmanager
Closed, Resolved (Public)

Description

We are migrating all of our existing check_prometheus-based checks from Icinga to Alertmanager.
This work is part of T293399.

This ticket covers the migration of all of our (legacy) Eventlogging checks to Alertmanager.

These alerts are defined in https://github.com/wikimedia/puppet/blob/production/modules/profile/manifests/prometheus/alerts.pp#L37-L98

  • eventlogging_EventError_throughput
  • eventlogging_NavigationTiming_throughput
  • eventlogging_throughput
  • eventlogging_processors_kafka_lag

Event Timeline

Change 902364 had a related patch set uploaded (by Cathal Mooney; author: Cathal Mooney):

[operations/alerts@master] Move event logging checks from Icinga to alertmanager

https://gerrit.wikimedia.org/r/902364

Change 902454 had a related patch set uploaded (by Cathal Mooney; author: Cathal Mooney):

[operations/puppet@production] Remove Eventlogging prometheus-based Icinga checks

https://gerrit.wikimedia.org/r/902454

Change 902364 merged by jenkins-bot:

[operations/alerts@master] Move event logging checks from Icinga to alertmanager

https://gerrit.wikimedia.org/r/902364

Mentioned in SAL (#wikimedia-analytics) [2023-03-24T14:43:13Z] <topranks> merged alertmanager rules for eventlogging checks being migrated from Icinga T309007

Change 902454 merged by Cathal Mooney:

[operations/puppet@production] Remove Eventlogging prometheus-based Icinga checks

https://gerrit.wikimedia.org/r/902454

cmooney updated the task description.