Polish script that checks eventlogging lag to use it for alarming
Description
Status | Subtype | Assigned | Task
---|---|---|---
Declined | | None | T124306 Polish script that checks eventlogging lag to use it for alarming
Resolved | | Ottomata | T124307 Improve eventlogging replication procedure
Resolved | | None | T125135 Add autoincrement id to EventLogging MySQL tables. {oryx}
Resolved | | Ottomata | T161855 Drop tables with no events in last 90 days.
Event Timeline
I took a look at the script, and it would be really great to push metrics to statsd about the lag observed for each table. After that, alarming with graphite/icinga should be super easy.
The new metrics would be added in https://grafana.wikimedia.org/dashboard/db/eventlogging
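A rough sketch of what pushing per-table lag to statsd could look like, not the actual script. The metric prefix `eventlogging.lag`, the statsd host and port, and the idea of feeding in the newest event timestamp per table are all assumptions for illustration; only the statsd gauge line format (`name:value|g` over UDP) is standard.

```python
import socket
import time


def format_gauge(metric, value):
    """Build a statsd gauge datagram of the form '<name>:<value>|g'."""
    return f"{metric}:{value}|g".encode("ascii")


def lag_seconds(latest_event_ts, now=None):
    """Seconds between the newest event seen in a table and now (never negative)."""
    now = time.time() if now is None else now
    return max(0, int(now - latest_event_ts))


def push_table_lags(latest_by_table, host="127.0.0.1", port=8125,
                    prefix="eventlogging.lag", now=None, sock=None):
    """Send one gauge per table over UDP and return the datagrams sent.

    latest_by_table maps a table name to the Unix timestamp of its newest
    event. host, port and prefix are hypothetical defaults.
    """
    own_socket = sock is None
    if own_socket:
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sent = []
    try:
        for table, ts in sorted(latest_by_table.items()):
            payload = format_gauge(f"{prefix}.{table}", lag_seconds(ts, now))
            sock.sendto(payload, (host, port))
            sent.append(payload)
    finally:
        if own_socket:
            sock.close()
    return sent
```

With gauges like these in graphite, an icinga check would only need a threshold on each `eventlogging.lag.<table>` series.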
This comment was removed by elukey.
We postponed this for too long; we're likely to change how people look at EL data before we get to this improvement.