Ingest Toolhub custom Prometheus metrics
Closed, Invalid (Public)

Description

Toolhub exposes a number of custom metrics at a /metrics endpoint, produced by https://github.com/korfuri/django-prometheus. These metrics should be scraped from each pod in the Kubernetes deployments so that they can be used in Grafana and other reporting.
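
For illustration, the text a scraper reads from such an endpoint can be parsed with a few lines of Python. This is a minimal sketch, not Toolhub code: the sample payload, its metric names, and the `parse_metrics` helper are all hypothetical, and it handles only simple lines of the Prometheus text exposition format (no escaped characters in label values, no timestamps).

```python
import re

# Hypothetical sample of the Prometheus text exposition format; the metric
# names and values are illustrative, not real Toolhub output.
SAMPLE = """\
# HELP example_requests_total Total requests (illustrative).
# TYPE example_requests_total counter
example_requests_total{method="GET"} 42
example_requests_total{method="POST"} 7
example_up 1
"""

# metric_name, optional {labels}, whitespace, value
LINE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)$'
)
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="([^"]*)"')


def parse_metrics(text):
    """Return a list of (name, labels_dict, float_value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = LINE_RE.match(line)
        if match is None:
            continue  # ignore lines this simple sketch cannot parse
        labels = dict(LABEL_RE.findall(match.group("labels") or ""))
        samples.append((match.group("name"), labels, float(match.group("value"))))
    return samples


if __name__ == "__main__":
    for name, labels, value in parse_metrics(SAMPLE):
        print(name, labels, value)
```

In practice Prometheus does this parsing itself; the sketch only shows the shape of the data the task asks to have collected.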

Event Timeline

bd808 triaged this task as Medium priority. Apr 11 2022, 10:44 PM
bd808 added a project: observability.
bd808 moved this task from Backlog to Groomed/Ready on the Toolhub board.

How can one get started working on this, @bd808? Any pointers, especially on the Kubernetes scraping part?

This is yet another area where Toolhub is an early adopter and there does not yet seem to be strong documentation on how to proceed. I think the answer is going to be something like "make a patch to operations/puppet.git". https://github.com/wikimedia/puppet/blob/production/modules/profile/files/prometheus/rules_k8s.yml might be the right place, but I'm not sure.

I would recommend trying to contact folks like @fgiunchedi from SRE Observability or @akosiaris from serviceops to ask for some advice.

Not sure I see how Toolhub could be an early adopter of metrics scraping. We've been doing this since day 1, so ~2017. That said, the docs indeed haven't been great. There was some material at https://wikitech.wikimedia.org/wiki/Prometheus/statsd_k8s, but I've gone ahead and created https://wikitech.wikimedia.org/wiki/Kubernetes/Metrics for a better overview and for placement in the main portal of our docs.

As for Toolhub, metrics have already been ingested for quite a while. The Helm chart has the prometheus.io/scrape: true annotation, so Prometheus has been scraping workloads/pods since day 1 of deployment. This is visible in https://w.wiki/53ic, where one can see that we have been scraping since Oct 27th.
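
The annotation-driven convention described here can be sketched roughly as follows. This is an illustration only, not the actual Prometheus configuration in use: `scrape_url`, the pod dictionaries, and the fallback port are hypothetical, and the `prometheus.io/port` and `prometheus.io/path` annotations are assumed from the common community convention rather than taken from the Toolhub chart.

```python
# Hypothetical fallback port, used only to keep this sketch self-contained.
ASSUMED_DEFAULT_PORT = "8000"


def scrape_url(pod):
    """Return the URL a scraper would poll for this pod, or None if the pod has not opted in."""
    annotations = pod.get("annotations", {})
    # Kubernetes annotation values are strings, so compare against "true".
    if annotations.get("prometheus.io/scrape") != "true":
        return None
    port = annotations.get("prometheus.io/port", ASSUMED_DEFAULT_PORT)
    # /metrics is the conventional default path when none is given.
    path = annotations.get("prometheus.io/path", "/metrics")
    return f"http://{pod['ip']}:{port}{path}"


if __name__ == "__main__":
    pods = [
        {"ip": "10.0.0.5", "annotations": {"prometheus.io/scrape": "true"}},
        {"ip": "10.0.0.6", "annotations": {}},
    ]
    for pod in pods:
        print(pod["ip"], "->", scrape_url(pod))
```

The point of the convention is that a workload opts in with a single annotation and otherwise relies on defaults, which is why no extra integration work was needed.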

I am gonna close this as invalid, but feel free to reopen.

Not sure I see how Toolhub could be an early adopter of metrics scraping.

My assumption was that we were an early adopter of running in k8s as anything other than a Node.js service, and as such would need to do something to trigger integration.

As for Toolhub, metrics have already been ingested for quite a while. The Helm chart has the prometheus.io/scrape: true annotation, so Prometheus has been scraping workloads/pods since day 1 of deployment.

I'm pretty sure that that annotation is only there because it was in the scaffolding. And magically I mounted the exporter at the correct /metrics endpoint. Convention over configuration is awesome. Thanks for making this part easy. :)