
Add Druid as a Private Grafana Datasource
Open, Needs Triage, Public

Description

There is a project that implements a Druid datasource for Grafana: https://github.com/grafadruid/druid-grafana

Since SRE is increasingly using Turnilo for alert triage, I think it'd be worthwhile to explore how feasible it would be to integrate Druid into our Grafana deployment as well.

With this we could use Grafana to supplement Turnilo and work around some of its limitations, such as showing a single visualization at a time, manual-ish query building, and awkward copy/paste. We could build dashboards that highlight common scraping patterns, outlying fingerprints, etc., and hopefully reduce the load on responders and speed up triage.

To do this we would also need to work out an approach for private Grafana datasources. Grafana supports this; in theory we can limit datasource access to logged-in users with specific group membership via the grafana-rw instance. This should be validated, and safeguards/alarms should be considered to help protect against future misconfigurations.

For the purposes of this task, let's focus on discussing the idea and any major concerns. With some support in favor, we could spin up subtasks to start working on it.

Event Timeline

RLazarus subscribed.

(Clinic duty here! Apparently a milestone tag, like SRE Observability (FY2025/2026-Q3), is mutually exclusive with the project tag, like SRE Observability, and that means the task shows up on the clinic duty dashboard as "needs triage." I'm adding Observability-Metrics at a guess, because that also takes it off the triage list, but if you'll be using those milestone tags going forward, we may want to adjust the clinic duty dashboard query.)

Considering that Turnilo's development has largely stalled and that Superset has usability issues with on-the-fly data spelunking, this seems like a worthy thing to explore. The plugin allows for quite a lot of flexibility (e.g. SQL queries), so hopefully something can be set up such that Grafana provides both the convenient dashboarding of Superset and the quick spelunking of Turnilo.
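
To make the SQL angle a bit more concrete, here is a rough sketch of the kind of "top values of a dimension over the last N hours" query a triage panel might run. It only builds the JSON body for Druid's SQL HTTP endpoint (`/druid/v2/sql`) and does not contact any server. The datasource name `example_requests` and the dimension `fingerprint` are hypothetical placeholders, not names from our Druid cluster, and how the Grafana plugin itself would be configured to run such a query is not covered here.

```python
import json

# Druid's SQL HTTP endpoint path; the broker/router host is deployment-specific.
DRUID_SQL_PATH = "/druid/v2/sql"


def quote_ident(name: str) -> str:
    """Quote a SQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def build_top_n_payload(datasource: str, dimension: str,
                        hours: int = 1, limit: int = 10) -> dict:
    """Build a Druid SQL request body ranking values of `dimension`.

    Note: COUNT(*) counts stored rows. If the datasource uses rollup,
    summing a count metric would be needed to get event counts instead.
    """
    if hours < 1 or limit < 1:
        raise ValueError("hours and limit must be positive")
    sql = (
        f"SELECT {quote_ident(dimension)} AS dim, COUNT(*) AS row_count "
        f"FROM {quote_ident(datasource)} "
        f"WHERE __time >= CURRENT_TIMESTAMP - INTERVAL '{int(hours)}' HOUR "
        f"GROUP BY 1 ORDER BY 2 DESC LIMIT {int(limit)}"
    )
    return {"query": sql, "resultFormat": "object"}


if __name__ == "__main__":
    # Hypothetical datasource and dimension names, for illustration only.
    payload = build_top_n_payload("example_requests", "fingerprint", hours=1, limit=10)
    print(f"POST {DRUID_SQL_PATH}")
    print(json.dumps(payload, indent=2))
```

The same SQL string could in principle be pasted into a Grafana panel's query editor, which is where the multi-panel dashboards mentioned in the description would come from; whether that works well in practice is part of what this task would need to validate.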