Alert for Kafka MirrorMaker lag
Closed, Resolved · Public · 8 Estimated Story Points

Description

We should know when MirrorMaker instances are lagging. Lag is already tracked in Graphite, so we could alert on it with Icinga + Graphite, or we could use Burrow.
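For the Burrow option, a minimal sketch of what such a check might look like is below. It is not the check that was merged for this task. It assumes Burrow's consumer lag endpoint returns JSON with a top-level `error` flag and a `status` object carrying `status` and `totallag` fields; the function name `evaluate_lag`, the `max_total_lag` parameter, and the status-to-exit-code mapping are hypothetical. Treating STOP as non-alerting mirrors the reasoning in change 422467 in the timeline (bursty topics).

```python
import json

# Standard Nagios/Icinga plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = ["OK", "WARNING", "CRITICAL", "UNKNOWN"]

# Assumed mapping from Burrow group status to plugin exit code.
# STOP is deliberately not an alert: a bursty topic can look "stopped".
STATUS_TO_EXIT = {
    "OK": OK,
    "STOP": OK,
    "WARN": WARNING,
    "ERR": CRITICAL,
    "STALL": CRITICAL,
    "REWIND": CRITICAL,
}


def evaluate_lag(response_body, max_total_lag=None):
    """Map a Burrow lag response (JSON text) to (exit_code, message).

    Field names ("error", "status", "totallag") are assumptions about
    the response shape, not taken from the task itself.
    """
    try:
        payload = json.loads(response_body)
    except (ValueError, TypeError):
        return UNKNOWN, "UNKNOWN: response is not valid JSON"
    if not isinstance(payload, dict):
        return UNKNOWN, "UNKNOWN: unexpected response shape"
    if payload.get("error"):
        return UNKNOWN, "UNKNOWN: %s" % payload.get("message")

    status = payload.get("status") or {}
    state = status.get("status", "NOTFOUND")
    total = status.get("totallag", 0)

    # Unrecognised states (including NOTFOUND) map to UNKNOWN.
    code = STATUS_TO_EXIT.get(state, UNKNOWN)
    # Optional hard ceiling on total lag across all partitions.
    if max_total_lag is not None and total > max_total_lag:
        code = max(code, CRITICAL)
    return code, "%s: group status %s, total lag %d" % (LABELS[code], state, total)
```

An NRPE wrapper would fetch the response body over HTTP, call `evaluate_lag`, print the message, and exit with the returned code. Keeping the evaluation separate from the HTTP call makes the thresholds easy to test without a running Burrow instance.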

Event Timeline

Ottomata created this task. Mar 13 2018, 6:14 PM
Restricted Application added a subscriber: Aklapper. Mar 13 2018, 6:14 PM
Ottomata triaged this task as Medium priority. Mar 13 2018, 6:14 PM
Ottomata set the point value for this task to 5.
Ottomata moved this task from Incoming to Kafka Work on the Analytics board. Mar 15 2018, 4:35 PM
Ottomata added a project: Analytics-Kanban.
fdans moved this task from Next Up to Paused on the Analytics-Kanban board. Mar 22 2018, 4:46 PM
Ottomata moved this task from Paused to In Progress on the Analytics-Kanban board. Mar 27 2018, 2:51 PM

Change 422163 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/puppet@production] Install nrpe check for Kafka consumer lag by checking burrow

https://gerrit.wikimedia.org/r/422163

Change 422163 merged by Ottomata:
[operations/puppet@production] Install nrpe check for Kafka consumer lag by checking burrow

https://gerrit.wikimedia.org/r/422163

Change 422192 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/puppet@production] Add mirror_name and host as labels for mirror maker prometheus

https://gerrit.wikimedia.org/r/422192

Change 422192 merged by Ottomata:
[operations/puppet@production] Add mirror_name and host as labels for mirror maker prometheus

https://gerrit.wikimedia.org/r/422192

Change 422201 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/puppet@production] Can't set labels on metric without name set

https://gerrit.wikimedia.org/r/422201

Change 422201 merged by Ottomata:
[operations/puppet@production] Can't set labels on metric without name set

https://gerrit.wikimedia.org/r/422201

Change 422230 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/puppet@production] Add check_prometheus alerts for Kafka MirrorMaker instances.

https://gerrit.wikimedia.org/r/422230

Change 422230 merged by Ottomata:
[operations/puppet@production] Add check_prometheus alerts for Kafka MirrorMaker instances.

https://gerrit.wikimedia.org/r/422230

Change 422251 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/puppet@production] Fix path to check_kafka_consumer_lag nrpe check

https://gerrit.wikimedia.org/r/422251

Change 422251 merged by Ottomata:
[operations/puppet@production] Fix path to check_kafka_consumer_lag nrpe check

https://gerrit.wikimedia.org/r/422251

Change 422258 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/puppet@production] Fix prometheus_url for mirror maker alert

https://gerrit.wikimedia.org/r/422258

Change 422258 merged by Ottomata:
[operations/puppet@production] Fix prometheus_url for mirror maker alert

https://gerrit.wikimedia.org/r/422258

Change 422335 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/puppet@production] Use scalar in dropped messages prometheus check for mirror maker

https://gerrit.wikimedia.org/r/422335

Change 422335 merged by Ottomata:
[operations/puppet@production] Use scalar in dropped messages prometheus check for mirror maker

https://gerrit.wikimedia.org/r/422335

Ottomata claimed this task. Mar 27 2018, 8:18 PM
Ottomata changed the point value for this task from 5 to 8.
Ottomata moved this task from In Progress to Done on the Analytics-Kanban board. Mar 28 2018, 1:34 PM

Change 422424 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/puppet@production] Alert on lag in last 30 minutes, alert mirror maker lag for analytics

https://gerrit.wikimedia.org/r/422424

Change 422424 merged by Ottomata:
[operations/puppet@production] Alert on lag in last 30 minutes, alert mirror maker lag for analytics

https://gerrit.wikimedia.org/r/422424

Change 422430 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/puppet@production] Be more lenient about MirrorMaker numDroppedMessages alert

https://gerrit.wikimedia.org/r/422430

Change 422430 merged by Ottomata:
[operations/puppet@production] Be more lenient about MirrorMaker numDroppedMessages alert

https://gerrit.wikimedia.org/r/422430

Change 422467 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/puppet@production] check_kafka_consumer_log - STOP != alert, just bursty topics

https://gerrit.wikimedia.org/r/422467

Change 422467 merged by Ottomata:
[operations/puppet@production] check_kafka_consumer_log - STOP != alert, just bursty topics

https://gerrit.wikimedia.org/r/422467

Change 422939 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/puppet@production] Use promethues based alert rather than burrow lag check alert

https://gerrit.wikimedia.org/r/422939

Change 422939 merged by Ottomata:
[operations/puppet@production] Use promethues based alert rather than burrow lag check alert

https://gerrit.wikimedia.org/r/422939

Change 422945 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/puppet@production] Fix consuer max lag check query

https://gerrit.wikimedia.org/r/422945

Change 422945 abandoned by Ottomata:
Fix consuer max lag check query

https://gerrit.wikimedia.org/r/422945

Nuria moved this task from Done to In Code Review on the Analytics-Kanban board. Mar 29 2018, 4:04 PM
Ottomata moved this task from In Code Review to Done on the Analytics-Kanban board. Apr 5 2018, 3:41 PM
Nuria closed this task as Resolved. Apr 12 2018, 10:08 PM
Aklapper removed a project: Analytics. Jul 4 2020, 7:59 AM