prometheus: slow dashboards due to suboptimal query_range performance
Closed, Resolved · Public

Description

We have ported several dashboards from Graphite to Prometheus, including Aggregate Client Status Code. The Graphite version loads quickly and feels snappy, but the Prometheus one is quite slow.

While inspecting the situation with Chrome Developer Tools, I noticed that certain query_range requests to the Prometheus API have a time to first byte (TTFB) between 7 and 11 seconds.
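
To reproduce the measurement outside the browser, something like the following could be used. It is only a rough sketch: the base URL, time range and step are placeholders (not taken from the dashboard), and the helper names are made up for illustration. It builds a request against Prometheus' `/api/v1/query_range` endpoint and approximates TTFB as the time until the first byte of the response body can be read.

```python
import time
import urllib.parse
import urllib.request


def build_query_range_url(base_url, query, start, end, step):
    """Build a Prometheus query_range URL (hypothetical helper).

    start/end are Unix timestamps in seconds, step is in seconds.
    """
    params = urllib.parse.urlencode(
        {"query": query, "start": start, "end": end, "step": step}
    )
    return f"{base_url.rstrip('/')}/api/v1/query_range?{params}"


def time_to_first_byte(url, timeout=30.0):
    """Return seconds elapsed until the first body byte arrives.

    This approximates the TTFB shown in browser developer tools;
    it also includes connection setup time.
    """
    started = time.monotonic()
    with urllib.request.urlopen(url, timeout=timeout) as response:
        response.read(1)  # block until at least one body byte is available
        return time.monotonic() - started


# Example usage (assumed host, adjust to the real Prometheus instance):
#   now = int(time.time())
#   url = build_query_range_url(
#       "http://prometheus.example:9090",
#       "job_method_status:varnish_requests:rate5m",
#       now - 3600, now, 60,
#   )
#   print(f"TTFB: {time_to_first_byte(url):.2f}s")
```

Running this a few times with the same range and step as the slow dashboard panel should show whether the latency comes from Prometheus itself rather than from the dashboard frontend.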

See for example this query.

My initial thought was to write some aggregation rules to speed things up, but the dashboard is already using job_method_status:varnish_requests:rate5m.

Event Timeline

Restricted Application added a subscriber: Aklapper.
ema triaged this task as Medium priority. · Mar 29 2018, 8:23 AM

@ema given the speedup due to prometheus 2 do you think this still needs to be worked on or could be resolved?

ema claimed this task.

Ah yes, thanks for the ping. Closing.