[Data Platform] Install a Prometheus connector for Presto, pointed at thanos-query
Closed, ResolvedPublic

Description

Docs: Prometheus connector for Presto

If we did this, we could make Prometheus timeseries available via the presto cli and wmfdata-python in Jupyter, plus we could make this data source available in Superset for very little extra effort.

We could then make dashboards correlating any of the Data Lake sources and Prometheus metrics, which I think would be quite powerful.

An additional use case that could be very powerful is looking for time intervals when some condition was true in the Prometheus metrics, and then using those intervals to perform detailed queries against webrequest.

Event Timeline

BTullis edited subscribers, added: odimitrijevic, nshahquinn-wmf, JAllemandou and 5 others; removed: Aklapper.

Thanks @CDanis. For reference, there was some initial discussion about this functionality here on Slack.

I agree that this cross-correlation of data sources could be very useful to a number of different stakeholders, not least the SRE team.

The presto catalogs are defined in hieradata, both for the coordinator and worker roles.

So for the test cluster it's here for the coordinator and here for the workers. The prod cluster settings are similarly located.

I can't see any governance or technical reason why we wouldn't want to add this Prometheus catalog to Presto, but I think that we should still seek approval before carrying out any work.

Perhaps @odimitrijevic would be the person to approve any integration. I've added a few other subscribers to invite their opinions and hopefully make sure that we're not barking up the wrong tree.

Also, I believe that @nshahquinn-wmf knows most about wmfdata-python so may be the person best placed to say whether or not Prometheus support would be easy or difficult to add to this library.

This looks super interesting, moving to radar for when we need to help out.

Adding this to our radar as well, to keep an eye when we start querying.

Gehel triaged this task as Medium priority.Oct 11 2023, 8:40 AM
Gehel moved this task from Incoming to Misc on the Data-Platform-SRE board.

From the Presto client point of view, is it correct that Prometheus would be accessed exactly the same way as the Data Lake is now, just with a different catalog argument? If so, Wmfdata wouldn't need any changes to support that, as presto.run already has a catalog argument (it currently defaults to "analytics_hive").

Ahoelzl renamed this task from Install a Prometheus connector for Presto, pointed at thanos-query to [Platform] Install a Prometheus connector for Presto, pointed at thanos-query.Oct 20 2023, 4:56 PM
Ahoelzl renamed this task from [Platform] Install a Prometheus connector for Presto, pointed at thanos-query to [Data Platform] Install a Prometheus connector for Presto, pointed at thanos-query.Oct 20 2023, 5:15 PM

@Ahoelzl why was this moved to "Radar (External Teams)" column? Per @BTullis's post, I think this was awaiting DE approval before DE would work on it...?

@CDanis I believe people currently want this kind of work to be planned more strategically, and prioritized appropriately. I agree this would be very useful!

I'm working on compiling user stories based on the recent data usage at wmf interviews, in which I see you did call this need out! So, it will get a user story which @VirginiaPoundstone and other PMs will be looking at along with many others.

However, I agree Radar isn't quite the right place for this. I'll move it back to backlog and we can re-groom it.

Change #1155278 had a related patch set uploaded (by Btullis; author: Btullis):

[operations/puppet@production] Add a prometheus connector for thanos in the test presto cluster

https://gerrit.wikimedia.org/r/1155278

Change #1155278 merged by Btullis:

[operations/puppet@production] Add a prometheus connector for thanos in the test presto cluster

https://gerrit.wikimedia.org/r/1155278

Thank you for working on this -- for my own education, how (if at all) does this relate to the work in T390328: Enable querying operational (prometheus) metrics via the WMF Data Platform?

Change #1156760 had a related patch set uploaded (by Btullis; author: Btullis):

[operations/puppet@production] Fix the connector name for the prometheus presto catalog

https://gerrit.wikimedia.org/r/1156760

Change #1156760 merged by Btullis:

[operations/puppet@production] Fix the connector name for the prometheus presto catalog

https://gerrit.wikimedia.org/r/1156760

Change #1156786 had a related patch set uploaded (by Btullis; author: Btullis):

[operations/puppet@production] Update the connection parameters for presto->prometheus in test

https://gerrit.wikimedia.org/r/1156786

Change #1156786 merged by Btullis:

[operations/puppet@production] Update the connection parameters for presto->prometheus in test

https://gerrit.wikimedia.org/r/1156786

This is now working in the test presto cluster, so you can start to test this from an-test-client1002, e.g.:

btullis@an-test-client1002:~$ hostname -f
an-test-client1002.eqiad.wmnet
btullis@an-test-client1002:~$ presto
Picked up JAVA_TOOL_OPTIONS: -Dfile.encoding=UTF-8
presto> SELECT value FROM thanos_test.default.ceph_cluster_total_used_bytes WHERE labels['cluster'] = 'cephosd' AND timestamp > (NOW() - INTERVAL '60' second);
       value        
--------------------
 4.5238366683136E13 
(1 row)

Query 20250613_112945_00031_9i42y, FINISHED, 1 node
Splits: 17 total, 17 done (100.00%)
[Latency: client-side: 0:01, server-side: 0:01] [665 rows, 22.5KB] [555 rows/s, 18.7KB/s]

presto>

Each of the tables has the following structure, with the labels being a map(varchar, varchar) type.

presto> describe thanos_test.default.haproxy_backend_active_servers;
  Column   |           Type           | Extra | Comment 
-----------+--------------------------+-------+---------
 labels    | map(varchar, varchar)    |       |         
 timestamp | timestamp with time zone |       |         
 value     | double                   |       |         
(3 rows)

Query 20250613_113153_00034_9i42y, FINISHED, 2 nodes
Splits: 19 total, 19 done (100.00%)
[Latency: client-side: 0:01, server-side: 0:01] [3 rows, 299B] [2 rows/s, 281B/s]

presto>

Change #1156823 had a related patch set uploaded (by Btullis; author: Btullis):

[operations/puppet@production] Presto: Add a prometheus connector pointing to thanos

https://gerrit.wikimedia.org/r/1156823

I have a patch to promote the change to production. I can merge this next week, unless anyone objects.

Thank you for working on this -- for my own education, how (if at all) does this relate to the work in T390328: Enable querying operational (prometheus) metrics via the WMF Data Platform?

Good question. I haven't been too involved in the conversation on T390328 yet, but it seems to me that these two tickets have some interesting commonalities.

Firstly, that ticket says the following in the description:

When teams emit product related metrics via mw.track and statsv, the data they emit is not available along side of all the 800+ datasets in the WMF Data Lake.
If we were able to produce product related operational metrics into Event Platform, they would be automatically ingested into the Data Lake and available for querying with SQL or other WMF Data Lake technologies.

This ticket is effectively developing a bridge between one of the WMF Data Lake technologies (i.e. Presto) and Prometheus.
So, in a way, these metrics will be available for querying alongside the rest of the 800+ datasets in the Data Lake, as long as you are prepared to use Presto to do so.

Our other tools such as Spark won't be able to work on these Prometheus metrics, so I wouldn't say that it's a full bridge between the Data Lake and Prometheus.
I think that all of the options mentioned in that ticket are geared towards having the events end up in Hive tables, backed by HDFS, in addition to their current location in Prometheus.

However, as @mpopov was quoted as saying in that ticket:

...for Product Analytics it would be really helpful if the data was available in data lake and could be accessed/reported with Superset (which we know how to use).
For various reasons, Web team decided to use statsv for instrumenting their small-scale experiments and thus ended up being a Grafana dashboard for the analyst to use (since we have no expertise on the team with that platform)

So this would lead me to believe that perhaps what we are discussing here would be enough to support the Product Analytics team's use-case, which is to be able to build Superset dashboards backed by this data from Prometheus.

As @CDanis originally pointed out when creating this ticket:

If we did this, we could make Prometheus timeseries available via the presto cli and wmfdata-python in Jupyter, plus we could make this data source available in Superset for very little extra effort.

I actually picked up this ticket and finally started working on it because @JAllemandou asked on Slack how to do some promql style queries as part of a data quality incident investigation. But it does seem at least tangentially relevant to the use cases mentioned on T390328.

Thank you! That is quite useful to know and helps clarify the question I had.

I agree, there is indeed overlap with the use cases of T390328.

In the interest of moving this forward I'm not opposed to the change; however, what I'm worried about is big/heavy analytical queries coming from Superset and impacting Thanos / operational metrics. In other words, let's go ahead with the understanding that we might be removing (or limiting, if possible?) the Prometheus / Presto connector if heavy queries do come through, and the connector will be removed once T390328 is in place. How does that sound?

In my mind the proverbial nail in the coffin for the above problem/worry is T390328, where Prometheus data lives in the data lake itself and thus queries do not reach out to "live" systems like thanos-query.

There's overlap for sure, but I think there's some non-overlap as well -- at least with a strict reading of T390328's title and description.

It's not MediaWiki-specific events/instrumentation I was originally concerned with when I wrote in this task description:

An additional use case that could be very powerful is looking for time intervals when some condition was true in the Prometheus metrics, and then using those intervals to perform detailed queries against webrequest.

Many different Prometheus-only, external-to-MediaWiki metrics are incredibly valuable for this use case -- things like the number of concurrent in-flight requests of each haproxy or Envoy, or the HTTP error rate at a particular CDN site, etc.
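
To make the interval idea concrete, here is a rough sketch in plain Python, assuming the metric samples have already been fetched as (timestamp, value) pairs. It collapses the samples where a condition holds into contiguous intervals, then renders those intervals as a SQL predicate that could be added to a second query. The function names, the `max_gap` default and the column name passed to `interval_filter` are all hypothetical; nothing here assumes a particular webrequest schema.

```python
from datetime import datetime, timedelta


def intervals_where(samples, predicate, max_gap=timedelta(minutes=2)):
    """Collapse (timestamp, value) samples into (start, end) intervals.

    A new interval starts whenever predicate(value) becomes true, and ends
    when it becomes false or when two matching samples are further apart
    than max_gap (an assumed tolerance for missing scrapes).
    """
    intervals = []
    start = prev = None
    for ts, value in sorted(samples, key=lambda s: s[0]):
        if predicate(value):
            if start is None:
                start = ts
            elif ts - prev > max_gap:
                intervals.append((start, prev))
                start = ts
            prev = ts
        elif start is not None:
            intervals.append((start, prev))
            start = prev = None
    if start is not None:
        intervals.append((start, prev))
    return intervals


def interval_filter(intervals, column):
    """Render intervals as a SQL predicate on a timestamp column."""
    if not intervals:
        return "FALSE"
    fmt = "%Y-%m-%d %H:%M:%S"
    clauses = [
        f"({column} BETWEEN TIMESTAMP '{start.strftime(fmt)}' "
        f"AND TIMESTAMP '{end.strftime(fmt)}')"
        for start, end in intervals
    ]
    return " OR ".join(clauses)


if __name__ == "__main__":
    t0 = datetime(2025, 6, 20, 10, 0, 0)
    samples = [(t0 + timedelta(minutes=i), v)
               for i, v in enumerate([1, 10, 12, 2, 11])]
    busy = intervals_where(samples, lambda v: v > 5)
    # "ts" is a placeholder column name, not a real webrequest field.
    print(interval_filter(busy, "ts"))
```

The same thing could probably be done in a single Presto query with a join across catalogs, but doing the first step client-side keeps the load on thanos-query small and easy to reason about.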

If we're talking about importing all or most of the Prometheus metrics to the Data Lake, maybe we don't need to support ad-hoc queries in the long run -- but I'm not sure offhand about the expense of that.

In the interest of moving this forward I'm not opposed to the change, however what I'm worried about is big/heavy analytical queries coming from superset and impacting thanos / operational metrics. In other words, let's go ahead with the understanding that:
...we might be removing (or limiting, if possible?) the Prometheus / Presto connector if heavy queries do come through

I'm all in favour of this. It's experimental for us and we will attempt to use Thanos responsibly. I think it's very fair to say that if this connector starts to cause operational issues for Thanos/Prometheus, then observability should feel free to disable/remove it and simply let us know why. We won't be building anything production-grade on it yet, so we'll be aware of the potential for disconnection.
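
As one possible way of using Thanos responsibly from notebooks, a client-side check could refuse to submit queries that have no lower bound on timestamp. This is only a sketch: `assert_time_bounded` is a hypothetical helper, the regular expression is a crude heuristic rather than a SQL parser, and nothing like it exists in the connector itself.

```python
import re

# Heuristic: look for a comparison on the timestamp column that limits the
# range from below (>, >=) or on both sides (BETWEEN).
_TIME_BOUND = re.compile(r"\btimestamp\s*(>=|>|BETWEEN)", re.IGNORECASE)


def assert_time_bounded(sql):
    """Return sql unchanged, or raise if it has no obvious time bound."""
    if not _TIME_BOUND.search(sql):
        raise ValueError(
            "refusing to run a Thanos query without a timestamp bound"
        )
    return sql


if __name__ == "__main__":
    ok = ("SELECT value FROM thanos.default.haproxy_backend_active_servers "
          "WHERE timestamp > (NOW() - INTERVAL '60' second)")
    print(assert_time_bounded(ok))
```

A check like this does not protect Thanos from queries written in SQL Lab, so it complements rather than replaces the agreement that the connector can be disabled if it causes problems.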

...and the connector will be removed once T390328 is in place. How does that sound ?

I'm not so keen on this clause. As @CDanis points out, T390328 is only about routing a relatively small subset of MediaWiki-related events, generated in the browser, into the data lake.
The potential power here is to be able to correlate any operational metrics with any of the data lake metrics.

Many different Prometheus-only, external-to-MediaWiki metrics are incredibly valuable for this use case -- things like the number of concurrent in-flight requests of each haproxy or Envoy, or the HTTP error rate at a particular CDN site, etc.

If we're talking about importing all or most of the Prometheus metrics to the Data Lake, maybe we don't need to support ad-hoc queries in the long run -- but I'm not sure offhand about the expense of that.

No, we're not talking about importing all of the Prometheus metrics into the data lake. At least not right now. I'm not sure how that would work, either, but I haven't given it any serious thought.

So if you're still ok on that basis @fgiunchedi, then I can proceed with the patch for the production presto cluster: Presto: Add a prometheus connector pointing to thanos

Thank you for the discussion and the answers to my questions, proceeding makes sense on my end!

Change #1156823 merged by Btullis:

[operations/puppet@production] Presto: Add a prometheus connector pointing to thanos

https://gerrit.wikimedia.org/r/1156823

This is now working, so we can query prometheus from presto.
Here's an example of getting some 4xx error counts from the haproxy frontends in esams.

btullis@stat1008:~$ presto
Picked up JAVA_TOOL_OPTIONS: -Dfile.encoding=UTF-8
presto> show catalogs;
      Catalog      
-------------------
 analytics_hive    
 analytics_iceberg 
 system            
 thanos            
(4 rows)

Query 20250620_101917_00000_mrpu6, FINISHED, 15 nodes
Splits: 257 total, 257 done (100.00%)
[Latency: client-side: 0:09, server-side: 0:09] [0 rows, 0B] [0 rows/s, 0B/s]

presto> use thanos.default;

presto:default> SELECT labels['instance'] AS instance,timestamp,value FROM haproxy_frontend_http_responses_total WHERE labels['instance'] LIKE 'cp3%' AND labels['code'] = '4xx' AND labels['proxy'] = 'http' AND timestamp > (NOW() - INTERVAL '60' second);
  instance   |          timestamp          | value  
-------------+-----------------------------+--------
 cp3066:9422 | 2025-06-20 10:38:32.188 UTC |   93.0 
 cp3067:9422 | 2025-06-20 10:38:26.728 UTC |   37.0 
 cp3068:9422 | 2025-06-20 10:39:13.796 UTC |  428.0 
 cp3069:9422 | 2025-06-20 10:38:45.963 UTC |   40.0 
 cp3070:9422 | 2025-06-20 10:39:01.739 UTC |  385.0 
 cp3071:9422 | 2025-06-20 10:38:44.223 UTC |  163.0 
 cp3072:9422 | 2025-06-20 10:39:05.116 UTC |   35.0 
 cp3073:9422 | 2025-06-20 10:38:39.254 UTC | 1705.0 
 cp3074:9422 | 2025-06-20 10:39:01.710 UTC |   18.0 
 cp3075:9422 | 2025-06-20 10:38:15.567 UTC |  100.0 
 cp3075:9422 | 2025-06-20 10:39:15.557 UTC |  102.0 
 cp3076:9422 | 2025-06-20 10:38:33.761 UTC |   26.0 
 cp3077:9422 | 2025-06-20 10:38:19.506 UTC |  185.0 
 cp3078:9422 | 2025-06-20 10:38:56.534 UTC |   52.0 
 cp3079:9422 | 2025-06-20 10:39:03.734 UTC |   51.0 
 cp3080:9422 | 2025-06-20 10:38:50.428 UTC |   23.0 
 cp3081:9422 | 2025-06-20 10:38:47.734 UTC |   55.0 
(17 rows)

Query 20250620_103914_00011_mrpu6, FINISHED, 2 nodes
Splits: 17 total, 17 done (100.00%)
[Latency: client-side: 0:06, server-side: 0:06] [283K rows, 7.28MB] [51.1K rows/s, 1.31MB/s]
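
Note that the values above are raw samples. Assuming, per the `_total` naming convention, that this metric is a cumulative counter, a per-second rate has to be derived from consecutive samples of the same instance, roughly as PromQL's rate() would. Here is a small sketch of doing that client-side on the returned rows; `per_second_rates` is a hypothetical helper and its handling of counter resets (skipping the pair) is a simplification.

```python
from collections import defaultdict
from datetime import datetime


def per_second_rates(rows):
    """Compute per-second increase between consecutive samples.

    rows is an iterable of (instance, timestamp, value) tuples.
    Returns {instance: [(timestamp, rate), ...]}; instances with fewer
    than two usable samples are omitted.
    """
    by_instance = defaultdict(list)
    for instance, ts, value in rows:
        by_instance[instance].append((ts, value))

    rates = {}
    for instance, samples in by_instance.items():
        samples.sort(key=lambda s: s[0])
        series = []
        for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
            elapsed = (t1 - t0).total_seconds()
            # Skip duplicate timestamps and apparent counter resets.
            if elapsed <= 0 or v1 < v0:
                continue
            series.append((t1, (v1 - v0) / elapsed))
        if series:
            rates[instance] = series
    return rates


if __name__ == "__main__":
    rows = [
        ("cp3075:9422", datetime(2025, 6, 20, 10, 38, 15, 567000), 100.0),
        ("cp3075:9422", datetime(2025, 6, 20, 10, 39, 15, 557000), 102.0),
        ("cp3066:9422", datetime(2025, 6, 20, 10, 38, 32, 188000), 93.0),
    ]
    print(per_second_rates(rows))
```

With a 60 second window most instances return a single sample, so a longer window is needed before a rate can be computed for every host.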

I have now added the thanos connector to Superset.

Here is the same query as in the previous comment, running in SQL Lab.
https://superset.wikimedia.org/sqllab/?savedQueryId=1071

image.png (791×1 px, 162 KB)

Here is the same query running in a Jupyter notebook on a stat server.

image.png (814×552 px, 113 KB)

Also, I believe that @nshahquinn-wmf knows most about wmfdata-python so may be the person best placed to say whether or not Prometheus support would be easy or difficult to add to this library.

From the Presto client point of view, is it correct that Prometheus would be accessed exactly the same way as the Data Lake is now, just with a different catalog argument? If so, Wmfdata wouldn't need any changes to support that, as presto.run already has a catalog argument (it currently defaults to "analytics_hive").

In answer to this question from @nshahquinn-wmf above, I didn't even need to use the catalog argument to the presto.run function, although that will surely work.
All I had to do was specify the fully qualified table name (catalog.schema.table) in the query, e.g. SELECT * FROM thanos.default.haproxy_frontend_http_responses_total
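
The two approaches can be sketched side by side as follows. This assumes that wmfdata is importable on the host and that presto.run accepts a query string plus the catalog argument mentioned above; the helpers `qualified` and `fetch` are hypothetical wrappers written for this comment, and the LIMIT is only there to keep the example cheap.

```python
def qualified(metric, catalog="thanos", schema="default"):
    """Return the fully qualified catalog.schema.table name for a metric."""
    return f"{catalog}.{schema}.{metric}"


def fetch(sql, catalog=None):
    """Run sql through wmfdata (hypothetical wrapper).

    wmfdata is imported lazily so that building query strings works even
    on hosts where the library is not installed.
    """
    import wmfdata

    if catalog is None:
        # Relies on the table name in the query being fully qualified.
        return wmfdata.presto.run(sql)
    return wmfdata.presto.run(sql, catalog=catalog)


METRIC = "haproxy_frontend_http_responses_total"

# Option 1: fully qualified name, no catalog argument needed.
FULLY_QUALIFIED = f"SELECT * FROM {qualified(METRIC)} LIMIT 10"

# Option 2: schema-relative name, with the catalog passed to the client.
RELATIVE = f"SELECT * FROM default.{METRIC} LIMIT 10"

if __name__ == "__main__":
    print(FULLY_QUALIFIED)
    print(RELATIVE)
    # On a stat server:
    #   df = fetch(FULLY_QUALIFIED)
    #   df = fetch(RELATIVE, catalog="thanos")
```

The fully qualified form has the advantage that a single query can reference tables from more than one catalog.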

Thanks for the explanation, @BTullis! I've added a new section on the Presto Wikitech page with details on this Thanos catalog as well as general information on Presto catalogs.

Change #1164425 had a related patch set uploaded (by Btullis; author: Btullis):

[operations/puppet@production] Fix typo in the thanos_test catalog config for an-test-presto

https://gerrit.wikimedia.org/r/1164425

Change #1164425 merged by Btullis:

[operations/puppet@production] Fix typo in the thanos_test catalog config for an-test-presto

https://gerrit.wikimedia.org/r/1164425

Hey wow, just catching up! This is quite awesome! There is indeed overlap in intention with T390328: Enable querying operational (prometheus) metrics via the WMF Data Platform. In fact, I'd say this is a potential solution to that problem statement. There are certainly pros and cons as outlined above.

I suspect that if this works well, Presto query support will be sufficient for most of the use cases intended to be covered by T390328! If so, I would even resolve T390328 with this as the chosen solution.

I'll update that task now with this as Option 6. :)