
Add stats collection for observability of mjolnir daemons
Closed, Resolved, Public

Description

Mjolnir daemons are doing things, but we have little to no observability into what is happening. Add some metrics to make it more obvious.

Bulk Daemon:

  • Time to process each batch from kafka
  • Size of batches returned by kafka?
  • Number of reported updates/missing/noop, same as cirrus
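A minimal sketch of how the three bulk daemon metrics above could be recorded. It uses only the Python standard library as a stand-in for a real metrics client, so it runs without dependencies. `BatchStats`, `time_batch`, and `record_result` are hypothetical names for illustration, not part of MjoLniR, and the shape of the response records is an assumption.

```python
import time
from collections import Counter
from contextlib import contextmanager


class BatchStats:
    """Hypothetical in-process stand-in for a metrics client."""

    def __init__(self):
        self.batch_seconds = []   # processing time of each batch
        self.batch_sizes = []     # number of records in each batch
        self.results = Counter()  # e.g. updated / missing / noop

    @contextmanager
    def time_batch(self, batch):
        # Record the batch size up front, and the elapsed time on exit,
        # even if processing raises.
        self.batch_sizes.append(len(batch))
        start = time.perf_counter()
        try:
            yield
        finally:
            self.batch_seconds.append(time.perf_counter() - start)

    def record_result(self, result, count=1):
        self.results[result] += count


stats = BatchStats()
# Assumed shape: one dict per bulk response item, with a 'result' key.
batch = [{'result': 'updated'}, {'result': 'noop'}, {'result': 'updated'}]
with stats.time_batch(batch):
    for response in batch:
        stats.record_result(response['result'])
```

With a Prometheus client, the timings and sizes would typically map to a summary or histogram and the per-result counts to a labelled counter.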

MSearch Daemon:

  • Current EMA used to decide if consuming is allowed
  • Current state of MetricMonitor is_below_threshold flag
  • Current FlexibleInterval value, which determines how often we collect data from Elasticsearch for the EMA
  • Time to process each batch from kafka
  • Time to process each bulk request?
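For readers unfamiliar with the EMA and threshold flag mentioned above, here is a rough sketch of an exponential moving average with a below-threshold check. The task does not show how MetricMonitor is implemented, so `EmaMonitor`, `alpha`, and `threshold` are hypothetical, and the sampled values are made up for illustration.

```python
class EmaMonitor:
    """Hypothetical EMA tracker; not the actual MetricMonitor."""

    def __init__(self, alpha, threshold):
        self.alpha = alpha          # weight given to the newest sample
        self.threshold = threshold  # consuming is allowed below this value
        self.ema = None             # no data collected yet

    def update(self, sample):
        # The first sample seeds the average; later samples are blended in.
        if self.ema is None:
            self.ema = sample
        else:
            self.ema = self.alpha * sample + (1 - self.alpha) * self.ema
        return self.ema

    @property
    def is_below_threshold(self):
        return self.ema is not None and self.ema < self.threshold


monitor = EmaMonitor(alpha=0.5, threshold=100)
for sample in (200, 40, 40):
    monitor.update(sample)
    # Both values are what a gauge-style metric would export.
    print(monitor.ema, monitor.is_below_threshold)
```

Exporting both the EMA and the flag makes it possible to see not only whether the daemon is consuming, but how close it is to flipping state.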

Event Timeline

EBernhardson updated the task description. Aug 14 2018, 5:39 PM

Change 451816 had a related patch set uploaded (by EBernhardson; owner: EBernhardson):
[search/MjoLniR@master] Introduce metrics collection via prometheus

https://gerrit.wikimedia.org/r/451816

Change 451816 merged by jenkins-bot:
[search/MjoLniR@master] Introduce metrics collection via prometheus

https://gerrit.wikimedia.org/r/451816

debt closed this task as Resolved. Aug 24 2018, 4:01 PM