prometheus: slow dashboards due to suboptimal query_range performance
Closed, Resolved · Public

Description

We have ported several dashboards from graphite to prometheus, including Aggregate Client Status Code. While the graphite version loads quickly and feels snappy, the prometheus one is quite slow.

While inspecting the situation with Chrome developer tools, I noticed that certain query_range requests to the prometheus API have a time to first byte (TTFB) between 7 and 11 seconds.

See for example this query.
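To reproduce this kind of measurement outside the browser, here is a minimal sketch that times a query_range request from a script. It assumes the standard Prometheus HTTP API endpoint `/api/v1/query_range` with its `query`, `start`, `end` and `step` parameters. The helper names and the host in the usage comment are hypothetical, and the timing only approximates what the browser reports as TTFB, since it stops at the first body byte rather than the first header byte.

```python
import time
import urllib.parse
import urllib.request


def build_query_range_url(base_url, query, start, end, step):
    """Build a query_range URL; start/end are Unix timestamps, step is in seconds."""
    params = urllib.parse.urlencode(
        {"query": query, "start": start, "end": end, "step": step}
    )
    return f"{base_url.rstrip('/')}/api/v1/query_range?{params}"


def time_to_first_byte(url, timeout=30.0):
    """Return the seconds elapsed until the first byte of the response body is read."""
    started = time.monotonic()
    with urllib.request.urlopen(url, timeout=timeout) as response:
        response.read(1)  # block until at least one body byte has arrived
        return time.monotonic() - started


# Example usage (hypothetical host; not executed here):
#   now = int(time.time())
#   url = build_query_range_url(
#       "http://prometheus.example.org",
#       "job_method_status:varnish_requests:rate5m",
#       now - 3600, now, 15,
#   )
#   print(f"TTFB: {time_to_first_byte(url):.2f}s")
```

Running this against the same expression over different time ranges and step values may help show whether the latency scales with the number of points requested.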

My initial thought was to write some aggregation rules to speed things up, but the dashboard is already using job_method_status:varnish_requests:rate5m.

Event Timeline

ema created this task. Mar 29 2018, 8:23 AM
Restricted Application added a project: Operations. Mar 29 2018, 8:23 AM
Restricted Application added a subscriber: Aklapper.
ema triaged this task as Normal priority. Mar 29 2018, 8:23 AM
ema moved this task from Triage to Watching on the Traffic board. Mar 29 2018, 8:25 AM
Volans added a subscriber: Volans. Apr 1 2019, 3:20 PM

@ema given the speedup due to prometheus 2, do you think this still needs to be worked on, or could it be resolved?

ema closed this task as Resolved. Apr 1 2019, 4:00 PM
ema claimed this task.

Ah yes, thanks for the ping. Closing.