
Update ResourceLoader dashboard to query varnishrls data from Prometheus instead
Closed, Resolved, Public

Description

Per T184942#4083351 and T184942#4087564, I think we now have all the data we need in Prometheus.

There is also the https://grafana.wikimedia.org/dashboard/db/resourceloadermodule dashboard, but it only uses MediaWiki/ResourceLoader internal metrics, which aren't affected by the move to Prometheus right now. Only the Varnish metrics, which are used by the main and alert dashboards, are being moved.

Event Timeline

Krinkle updated the task description.
Krinkle raised the priority of this task from Low to Medium. May 4 2018, 3:17 AM

Change 432090 had a related patch set uploaded (by Krinkle; owner: Krinkle):
[operations/puppet@production] prometheus: Add varnishrls aggregation rules

https://gerrit.wikimedia.org/r/432090

Change 432090 merged by Filippo Giunchedi:
[operations/puppet@production] prometheus: Add varnishrls aggregation rules

https://gerrit.wikimedia.org/r/432090
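
For readers unfamiliar with aggregation rules: a rule of this kind typically pre-computes a sum over per-instance series so that dashboards can query one smaller series instead. The patch itself is not reproduced here, so the following is only a rough Python sketch of that idea, in the spirit of PromQL's `sum by (...)`. The label names (`site`, `instance`, `status`) and the sample values are made up for illustration.

```python
from collections import defaultdict


def sum_by(samples, keep):
    """Sum sample values grouped by the labels in `keep`, dropping all others.

    `samples` is a list of (labels_dict, value) pairs. This mimics what a
    `sum by (...)` aggregation does; it is not the rule from the patch.
    """
    totals = defaultdict(float)
    for labels, value in samples:
        # Only the kept labels form the group key; e.g. `instance` is dropped.
        key = tuple((name, labels[name]) for name in keep)
        totals[key] += value
    return dict(totals)


# Hypothetical per-instance samples (labels and values are assumptions).
samples = [
    ({"site": "dc-a", "instance": "cache1", "status": "200"}, 120.0),
    ({"site": "dc-a", "instance": "cache2", "status": "200"}, 80.0),
    ({"site": "dc-b", "instance": "cache3", "status": "200"}, 50.0),
    ({"site": "dc-a", "instance": "cache1", "status": "404"}, 3.0),
]

aggregated = sum_by(samples, keep=["site", "status"])
for key, value in aggregated.items():
    print(dict(key), value)
```

Four per-instance series collapse into three per-site series here; with many cache hosts the reduction is what makes the stacked per-dc graphs cheap to draw.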

@fgiunchedi Thanks. This makes the per-dc stacks much easier without the need to iterate over each data source. Interestingly though, while I still need the aggregated rules for more complex graphs, I did find a way to make the simpler graphs work without the global rules. Namely, Grafana supports a way to mix multiple data sources in a single graph:

[Screenshot: Screen Shot 2018-05-12 at 02.40.03.png]

Global (easy!)
[Screenshot: Screen Shot 2018-05-12 at 02.40.23.png]

Mixed (not impossible!)
[Screenshot: Screen Shot 2018-05-12 at 02.40.14.png]
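
Since the screenshots are not available in text form, here is a minimal sketch of what a "mixed" panel looks like in dashboard JSON, as far as I understand Grafana's model: the panel-level data source is set to the special `-- Mixed --` entry and each query names its own data source. The data source names and the metric in the query below are hypothetical placeholders, not the ones used on the real dashboard.

```python
import json

# Hypothetical per-datacenter data source names and query (assumptions).
DATASOURCES = ["dc-a prometheus", "dc-b prometheus"]
QUERY = "sum(rate(varnishrls_requests_total[5m]))"


def mixed_panel(title, datasources, expr):
    """Build a graph panel dict that sends the same query to several data sources."""
    targets = [
        {
            "refId": chr(ord("A") + i),  # queries are labelled A, B, C, ...
            "datasource": name,          # data source chosen per query
            "expr": expr,
            "legendFormat": name,        # one legend entry per datacenter
        }
        for i, name in enumerate(datasources)
    ]
    return {
        "type": "graph",
        "title": title,
        "datasource": "-- Mixed --",     # panel-level marker for mixed sources
        "stack": True,                   # stack the per-datacenter series
        "targets": targets,
    }


panel = mixed_panel("Requests by datacenter", DATASOURCES, QUERY)
print(json.dumps(panel, indent=2))
```

The trade-off matches the comment above: the mixed approach needs no server-side rules but repeats the query once per data source, while the global rules keep each panel to a single query.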

Change 432712 had a related patch set uploaded (by Krinkle; owner: Krinkle):
[operations/puppet@production] mtail: Add xcachestatus to varnishrls

https://gerrit.wikimedia.org/r/432712

> @fgiunchedi Thanks. This makes the per-dc stacks much easier without the need to iterate over each data source. Interestingly though, while I still need the aggregated rules for more complex graphs, I did find a way to make the simpler graphs work without the global rules. Namely, Grafana supports a way to mix multiple data sources in a single graph:

Nice! Thanks, that's good to know Grafana can do that.

Change 432712 merged by Ema:
[operations/puppet@production] mtail: Add xcachestatus to varnishrls

https://gerrit.wikimedia.org/r/432712

This is now done. All frontend panels on https://grafana-admin.wikimedia.org/dashboard/db/resourceloader and https://grafana-admin.wikimedia.org/dashboard/db/resourceloader-alerts use Prometheus now. I've also added a datacenter site breakdown option to the main RL dashboard.