Maniphest T190992

prometheus: slow dashboards due to suboptimal query_range performance
Closed, Resolved, Public

Assigned To

Authored By

	• ema
	Mar 29 2018, 8:23 AM

Description

We have ported several dashboards from Graphite to Prometheus, including Aggregate Client Status Code. While the Graphite version loads quickly and feels snappy, the Prometheus one is quite slow.

While inspecting the situation using Chrome developer tools, I noticed that certain query_range requests to the Prometheus API have a time to first byte (TTFB) between 7 and 11 seconds.

See for example this query.

My initial thought was to write some aggregation rules to speed things up, but the dashboard is already using job_method_status:varnish_requests:rate5m.
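To reproduce the measurement outside the browser, a minimal sketch along these lines might help. It builds a query_range URL and times how long the first response byte takes to arrive. The base URL, the time window, the step, and the `sum(...)` wrapper around the recording rule are illustrative assumptions, not values taken from the dashboard, and both helper names are hypothetical. Reading the first body byte only approximates the TTFB reported by the browser.

```python
import time
import urllib.parse
import urllib.request


def build_query_range_url(base_url, query, start, end, step):
    """Return a /api/v1/query_range URL for a PromQL expression.

    start and end are Unix timestamps, step is the resolution in seconds.
    """
    params = urllib.parse.urlencode(
        {"query": query, "start": start, "end": end, "step": step}
    )
    return f"{base_url.rstrip('/')}/api/v1/query_range?{params}"


def time_to_first_byte(url, timeout=30):
    """Return the seconds elapsed until the first body byte has been read."""
    started = time.monotonic()
    with urllib.request.urlopen(url, timeout=timeout) as response:
        response.read(1)  # block until at least one byte of the body arrives
        return time.monotonic() - started


# Example usage (hypothetical host, last hour at a 15 second step):
#   end = int(time.time())
#   url = build_query_range_url(
#       "http://prometheus.example.org",
#       "sum(job_method_status:varnish_requests:rate5m)",
#       end - 3600, end, 15,
#   )
#   print(f"TTFB: {time_to_first_byte(url):.2f}s")
```

Running this with different window and step values should show whether the latency scales with the number of points requested or is dominated by a fixed cost per query.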

Related Objects

Status	Assigned	Task
Resolved	fgiunchedi	T220104 TEC6: Metrics monitoring infrastructure (Q4 2018/19 goal)
Resolved	fgiunchedi	T187987 100% of Prometheus traffic served by Prometheus v2
Resolved	• ema	T190992 prometheus: slow dashboards due to suboptimal query_range performance

Event Timeline

• ema created this task. Mar 29 2018, 8:23 AM

Restricted Application added a project: SRE. Mar 29 2018, 8:23 AM

Restricted Application added a subscriber: Aklapper.

• ema triaged this task as Medium priority. Mar 29 2018, 8:23 AM

• ema moved this task from Backlog to Radar/Not for service by Traffic on the Traffic board. Mar 29 2018, 8:25 AM

CDanis mentioned this in T212312: prometheus-based graph significantly slower than statsd equivalent. Jan 9 2019, 3:26 PM

CDanis added a parent task: T187987: 100% of Prometheus traffic served by Prometheus v2.

• Phabricator_maintenance moved this task from Backlog to Acknowledged on the SRE board. Jan 26 2019, 9:53 PM

@ema given the speedup due to prometheus 2 do you think this still needs to be worked on or could be resolved?

In T190992#5074344, @Volans wrote:

@ema given the speedup due to prometheus 2 do you think this still needs to be worked on or could be resolved?

Ah yes, thanks for the ping. Closing.
