Report ok / broken metrics from service_checker
Closed, Resolved (Public)

Description

service_checker automates the monitoring of endpoints based on Swagger specs. As part of its normal operation, it therefore collects valuable information about uptime and latency.

Currently, we use this information only for alerting. It would be useful to also record uptime, and possibly latency, in statsd for each endpoint, giving us a finer-grained uptime metric per monitored endpoint.
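
As a rough illustration of what this could look like, here is a minimal sketch that turns one check result into statsd datagrams and sends them over UDP. It is not taken from service_checker itself: the function names, the metric naming scheme (`<prefix>.<endpoint>.ok`, `.broken`, `.latency`) and the default host and port are all assumptions made for the example.

```python
import re
import socket


def statsd_lines(prefix, endpoint, ok, latency_s):
    """Build statsd payloads for one endpoint check (hypothetical naming scheme)."""
    # Collapse anything that is not safe in a statsd metric name into "_".
    name = re.sub(r"[^A-Za-z0-9_-]+", "_", endpoint).strip("_") or "root"
    status = "ok" if ok else "broken"
    return [
        # Counter: one increment per check, split by outcome.
        f"{prefix}.{name}.{status}:1|c",
        # Timer: request latency in milliseconds.
        f"{prefix}.{name}.latency:{latency_s * 1000:.0f}|ms",
    ]


def report(lines, host="localhost", port=8125):
    """Send each payload as its own UDP datagram (host and port are assumed defaults)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        for line in lines:
            sock.sendto(line.encode("ascii"), (host, port))
    finally:
        sock.close()
```

Calling `report(statsd_lines("service_checker.mysvc", "/api/v1/page/{title}", True, 0.25))` would emit one `ok` counter increment and one latency timing for that endpoint. Uptime per endpoint could then be derived on the graphing side as the ratio of `ok` to `ok + broken` counts.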