Export etcdmirror 'lag' metric and alert on it
Closed, Resolved (Public)

Description

As per the title: right now we alert on high etcdmirror lag via a regexp-based check on the /lag endpoint. We must move to a metric-based (Prometheus) check instead.

Event Timeline

Change 801642 had a related patch set uploaded (by Filippo Giunchedi; author: Filippo Giunchedi):

[operations/software/etcd-mirror@master] Run isort/black on the codebase

https://gerrit.wikimedia.org/r/801642

Change 801643 had a related patch set uploaded (by Filippo Giunchedi; author: Filippo Giunchedi):

[operations/software/etcd-mirror@master] tox: add formattercheck

https://gerrit.wikimedia.org/r/801643

Change 801644 had a related patch set uploaded (by Filippo Giunchedi; author: Filippo Giunchedi):

[operations/software/etcd-mirror@master] Use etcdmirror namespace for metrics

https://gerrit.wikimedia.org/r/801644

Change 801645 had a related patch set uploaded (by Filippo Giunchedi; author: Filippo Giunchedi):

[operations/software/etcd-mirror@master] Export lag as a Gauge metric

https://gerrit.wikimedia.org/r/801645

Change 801646 had a related patch set uploaded (by Filippo Giunchedi; author: Filippo Giunchedi):

[operations/software/etcd-mirror@master] Port to Python 3.5

https://gerrit.wikimedia.org/r/801646

Change 801642 merged by jenkins-bot:

[operations/software/etcd-mirror@master] Run isort/black on the codebase

https://gerrit.wikimedia.org/r/801642

Change 801643 merged by jenkins-bot:

[operations/software/etcd-mirror@master] tox: add formattercheck

https://gerrit.wikimedia.org/r/801643

Change 801644 merged by jenkins-bot:

[operations/software/etcd-mirror@master] Use etcdmirror namespace for metrics

https://gerrit.wikimedia.org/r/801644

Change 801645 merged by jenkins-bot:

[operations/software/etcd-mirror@master] Export lag as a Gauge metric

https://gerrit.wikimedia.org/r/801645

Change 803871 had a related patch set uploaded (by Filippo Giunchedi; author: Filippo Giunchedi):

[operations/software/etcd-mirror@master] New release 0.0.7-1

https://gerrit.wikimedia.org/r/803871

Change 803871 merged by jenkins-bot:

[operations/software/etcd-mirror@master] New release 0.0.7-1

https://gerrit.wikimedia.org/r/803871

Change 810864 had a related patch set uploaded (by Filippo Giunchedi; author: Filippo Giunchedi):

[operations/software/etcd-mirror@master] rest: fix getLag typo and add test

https://gerrit.wikimedia.org/r/810864

Change 810864 merged by jenkins-bot:

[operations/software/etcd-mirror@master] rest: fix getLag typo and add test

https://gerrit.wikimedia.org/r/810864

Change 810918 had a related patch set uploaded (by Filippo Giunchedi; author: Filippo Giunchedi):

[operations/alerts@master] sre: add etcd-mirror lag page

https://gerrit.wikimedia.org/r/810918

Change 810919 had a related patch set uploaded (by Filippo Giunchedi; author: Filippo Giunchedi):

[operations/puppet@production] etcd: remove paging alert, moved to Prometheus

https://gerrit.wikimedia.org/r/810919

Change 810927 had a related patch set uploaded (by Filippo Giunchedi; author: Filippo Giunchedi):

[operations/puppet@production] wmflib: remove distro conditionals from blackbox http module options

https://gerrit.wikimedia.org/r/810927

Change 810927 merged by Filippo Giunchedi:

[operations/puppet@production] wmflib: remove distro conditionals from blackbox http module options

https://gerrit.wikimedia.org/r/810927

Change 810919 merged by Filippo Giunchedi:

[operations/puppet@production] etcd: remove paging alert, moved to Prometheus

https://gerrit.wikimedia.org/r/810919

Change 810918 merged by Filippo Giunchedi:

[operations/alerts@master] sre: add etcd-mirror lag page

https://gerrit.wikimedia.org/r/810918

fgiunchedi claimed this task.

We now have the etcdmirror_lag metric, and pages are set up to fire from Alertmanager. The Icinga check is gone!
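
For readers unfamiliar with gauge metrics, here is a minimal stdlib-only sketch of what a scrape of etcdmirror_lag could look like in the Prometheus text exposition format. It is not the code from the etcd-mirror changes above, which may well use a client library. The function names, the HELP text, and the definition of lag as a difference between two indexes are assumptions made for illustration. Only the metric name and its gauge type come from this task.

```python
def compute_lag(source_index, replica_index):
    """Hypothetical lag definition: how far the replica trails the source.

    Assumption: lag is the difference between two monotonically increasing
    indexes, clamped at zero so a replica that is briefly ahead reports 0.
    """
    return max(source_index - replica_index, 0)


def render_lag_gauge(lag):
    """Render one gauge sample in the Prometheus text exposition format."""
    return (
        # HELP text is illustrative, not taken from etcd-mirror.
        "# HELP etcdmirror_lag Replication lag (assumed: index difference)\n"
        "# TYPE etcdmirror_lag gauge\n"
        "etcdmirror_lag {}\n".format(lag)
    )


if __name__ == "__main__":
    # Example values are made up.
    print(render_lag_gauge(compute_lag(1050, 1042)), end="")
```

Because the value is exported as a plain number, the alerting side can compare it against a threshold directly instead of matching the /lag response with a regexp.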

Change 801646 abandoned by Alexandros Kosiaris:

[operations/software/etcd-mirror@master] Port to Python 3.5

Reason:

Missed this change, sorry about that. Already done in https://gerrit.wikimedia.org/r/c/operations/software/etcd-mirror/+/812306

https://gerrit.wikimedia.org/r/801646