
Don't scrape every containerPort for metrics
Open, Medium, Public

Description

Our current scrape config for k8s pods (job="k8s-pods") creates one scrape target for every containerPort of every container if the prometheus.io/port annotation is not set. This leaves quite a few targets down by design, which we should avoid.

We already have a convention of adding the suffix -metrics to the names of container ports that should be scraped by Prometheus, so we could limit the targets created to just those ports (as long as prometheus.io/port is not set).
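A minimal sketch of what such a filter could look like, assuming the job uses pod-role service discovery and that the suffix is applied to the containerPort name. The rule below is illustrative and not taken from the production config:

```yaml
- job_name: k8s-pods
  kubernetes_sd_configs:
    - role: pod
  relabel_configs:
    # The two source labels are joined as "<port annotation>;<port name>".
    # Keep the target if the annotation is non-empty (first alternative),
    # or if the annotation is empty and the port name ends in -metrics
    # (second alternative). Everything else is dropped.
    - action: keep
      source_labels:
        - __meta_kubernetes_pod_annotation_prometheus_io_port
        - __meta_kubernetes_pod_container_port_name
      separator: ';'
      regex: '(.+;.*|;.*-metrics)'
```

With a rule like this, a pod that sets neither the annotation nor a -metrics port name would produce no targets at all, so it would change behaviour for existing pods that rely on the implicit scrape-all.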

This has one issue, though: if we want to scrape multiple ports in one pod (i.e. a single prometheus.io/port is not an option) and the metrics are exposed via the generic application port, we would have to suffix that port's name with -metrics, which is not very intuitive. Multiple containerPort entries for the same port are unfortunately not possible, but maybe there is a good way around this that I have not thought of yet.
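To illustrate the naming problem, here is a hypothetical pod in which the application container serves its metrics on its regular port. All names, images, and port numbers are made up for the example:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: example-app              # hypothetical
spec:
  containers:
    - name: app
      image: example/app:latest  # hypothetical
      ports:
        # The regular application port also serves /metrics, so it is
        # the port that has to carry the -metrics suffix.
        - name: http-metrics
          containerPort: 8080
    - name: exporter
      image: example/exporter:latest  # hypothetical
      ports:
        # A dedicated metrics port, where the suffix reads naturally.
        - name: exp-metrics
          containerPort: 9102
```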

One idea to work around this issue could be to have multiple k8s-pods scrape jobs, one per port. We would still be hardcoding the maximum number of scrape targets per pod, though:

apiVersion: v1
kind: Pod
metadata:
  annotations:
    # Annotation values must be strings, so the ports are quoted
    prometheus.io/scrape0: "true"
    prometheus.io/port0: "1234"
    prometheus.io/path0: "/foo/metrics"
    prometheus.io/scrape1: "true"
    prometheus.io/port1: "4321"
    ...
- job_name: k8s-pods-0
  relabel_configs:
    - action: keep
      source_labels: ['__meta_kubernetes_pod_annotation_prometheus_io_scrape0']
      regex: 'true'
    # no idea if the following is possible
    - action: replace
      source_labels: ['job']
      regex: '.*'
      replacement: 'k8s-pods'
      target_label: 'job'
    ...
- job_name: k8s-pods-1
  relabel_configs:
    - action: keep
      source_labels: ['__meta_kubernetes_pod_annotation_prometheus_io_scrape1']
      regex: 'true'
    # no idea if the following is possible
    - action: replace
      source_labels: ['job']
      regex: '.*'
      replacement: 'k8s-pods'
      target_label: 'job'
    ...

Event Timeline

JMeybohm triaged this task as Medium priority. Sep 27 2022, 1:32 PM
JMeybohm created this task.

I've just clarified the current behaviours on Wikitech - please update these docs in future if we change the pattern!

Speaking from a position of almost total ignorance:

Do we only care about the pods spawned by the helm charts in the deployment-charts repo? Just trying to figure out which clusters/pods are affected.

Based on @hnowlan's last comment/wiki page update it sounds like the implicit scrape-all behavior is deprecated in favor of the -metrics suffix, and we should ask the chart owners to update to the new syntax. Or maybe we could require prometheus.io/path to be set explicitly if the idea is to use the same port for services and prometheus metrics?

> Speaking from a position of almost total ignorance:
>
> Do we only care about the pods spawned by the helm charts in the deployment-charts repo? Just trying to figure out which clusters/pods are affected.

Basically yes. But ultimately this will affect all pods in all clusters - and we should try to keep backwards compatibility as far as possible.

> Based on @hnowlan's last comment/wiki page update it sounds like the implicit scrape-all behavior is deprecated in favor of the -metrics suffix, and we should ask the chart owners to update to the new syntax. Or maybe we could require prometheus.io/path to be set explicitly if the idea is to use the same port for services and prometheus metrics?

I think the addition from Hugh was not completely correct (I have updated the page). AIUI, it currently works as follows:

If prometheus.io/port is set, prometheus scrapes <pod ip>:<prometheus.io/port>/<prometheus.io/path | default /metrics>
If prometheus.io/port is not set, prometheus scrapes <pod ip>:<containerPort>/<prometheus.io/path | default /metrics> for every containerPort that is defined.
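For reference, the two cases above match what relabel rules of the following shape would produce. This is a sketch modelled on the common upstream Kubernetes example for Prometheus, not a copy of our actual config, which is not shown in this task:

```yaml
relabel_configs:
  # If prometheus.io/port is set, rewrite the target address to
  # <pod ip>:<annotation port>. If it is unset, the regex does not
  # match and the discovered <pod ip>:<containerPort> is kept.
  - action: replace
    source_labels:
      - __address__
      - __meta_kubernetes_pod_annotation_prometheus_io_port
    regex: '([^:]+)(?::\d+)?;(\d+)'
    replacement: '$1:$2'
    target_label: __address__
  # If prometheus.io/path is set, use it as the metrics path.
  # Otherwise the default /metrics stays in place.
  - action: replace
    source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
    regex: '(.+)'
    target_label: __metrics_path__
```

Because pod discovery emits one target per declared containerPort, leaving the address untouched in the unset case is what yields a scrape target for every port.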

bking removed bking as the assignee of this task. Oct 4 2022, 1:16 PM

Unassigning for now, will circle back later this week or next week to discuss further.