
Metrics for TimeoutError
Closed, Resolved, Public

Description

See https://github.com/wiki-ai/ores/blob/master/ores/score_processors/score_processor.py#L96. We could catch a TimeoutError here and report metrics on it.

Event Timeline

Does this task entail just implementing the timeout error, or integrating it into grafana as well?

We have a TimeoutError. We'll need a statsd metric and then a grafana graph too.
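
A rough sketch of what that could look like, not the actual ORES code. It uses Python's built-in TimeoutError as a stand-in for the project's own error class, and CountingCollector, format_statsd_counter, process_with_metrics and the metric name "score_timed_out" are all hypothetical names chosen for illustration. The only external fact relied on is the plain-text statsd counter format, name:value|c.

```python
class CountingCollector:
    """Hypothetical in-memory stand-in for a statsd-backed collector."""

    def __init__(self):
        self.counters = {}

    def incr(self, name, count=1):
        # Accumulate increments per metric name.
        self.counters[name] = self.counters.get(name, 0) + count


def format_statsd_counter(name, count=1):
    """Render a counter increment in the plain-text statsd line format."""
    return "{0}:{1}|c".format(name, count)


def process_with_metrics(process, collector, metric="score_timed_out"):
    """Run process(); on TimeoutError, count it on the collector and re-raise."""
    try:
        return process()
    except TimeoutError:
        # Assumption: the built-in TimeoutError stands in for the project's class.
        collector.incr(metric)
        raise
```

The error is re-raised after counting so that callers still see the timeout. A real collector would send the formatted line to statsd, and the Grafana graph would then be built on that counter.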

@Halfak could you explain the purpose of the count parameter found here and elsewhere?

Yes. count represents the number of scores that errored. See an implementation of this interface for an example of how it is used. E.g. https://github.com/wiki-ai/ores/blob/master/ores/metrics_collectors/logger.py#L42 or https://github.com/wiki-ai/ores/blob/master/ores/metrics_collectors/statsd.py#L104
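
To illustrate the intent of count, here is a minimal sketch. ListCollector, its score_errored signature and report_batch_errors are hypothetical and are not copied from the linked files. The idea is that a caller handling a batch could report all errored scores in one call instead of one call per score.

```python
class ListCollector:
    """Hypothetical collector that records what it was asked to report."""

    def __init__(self):
        self.lines = []

    def score_errored(self, model, count=1):
        # count is the number of scores that errored in this call.
        self.lines.append("score_errored: {0} x{1}".format(model, count))


def report_batch_errors(collector, model, results):
    """Report every errored score in a batch with a single collector call."""
    errored = sum(1 for r in results if isinstance(r, Exception))
    if errored:
        collector.score_errored(model, count=errored)
    return errored
```

If every caller reports one score at a time, the default of 1 is all that is ever used.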

I don't see where it is actually used though. In everything I can find, it just uses the default value.

You're right. It looks like the code that uses metrics_collector no longer needs count.