Collect metrics from envoy where it is enabled on k8s
Closed, Resolved (Public)

Description

Envoy exposes rich statistics, including latency telemetry for all upstream services, which we want to collect.

It exposes them via the /stats/prometheus endpoint on its admin interface.
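For reference, a minimal Python sketch of what reading that endpoint's output could look like. It parses Prometheus text-format samples and derives a mean upstream request time per cluster. The sample payload and the cluster name `local_service` are made up for illustration, and the helpers `parse_samples` and `mean_upstream_latency` are hypothetical names, not part of any patch here. The parser is deliberately simplified (no timestamps, no unusual escaping), so treat it as a sketch rather than a replacement for a real Prometheus client.

```python
import re
from typing import Dict, Iterator, Tuple

# Illustrative payload only: NOT captured from a real Envoy instance.
# Envoy reports upstream_rq_time in milliseconds.
SAMPLE = """\
# TYPE envoy_cluster_upstream_rq_time histogram
envoy_cluster_upstream_rq_time_bucket{envoy_cluster_name="local_service",le="5"} 12
envoy_cluster_upstream_rq_time_bucket{envoy_cluster_name="local_service",le="+Inf"} 15
envoy_cluster_upstream_rq_time_sum{envoy_cluster_name="local_service"} 48.5
envoy_cluster_upstream_rq_time_count{envoy_cluster_name="local_service"} 15
envoy_server_uptime 3600
"""

# Simplified grammar: metric name, optional {labels}, a single value.
_SAMPLE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)$'
)
_LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_samples(text: str) -> Iterator[Tuple[str, Dict[str, str], float]]:
    """Yield (metric name, labels, value) for each sample line."""
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = _SAMPLE_RE.match(line)
        if match is None:
            continue  # ignore lines this simplified parser cannot handle
        labels = dict(_LABEL_RE.findall(match.group("labels") or ""))
        yield match.group("name"), labels, float(match.group("value"))


def mean_upstream_latency(text: str) -> Dict[str, float]:
    """Return mean upstream request time per cluster (sum / count)."""
    sums: Dict[str, float] = {}
    counts: Dict[str, float] = {}
    for name, labels, value in parse_samples(text):
        cluster = labels.get("envoy_cluster_name")
        if cluster is None:
            continue
        if name == "envoy_cluster_upstream_rq_time_sum":
            sums[cluster] = value
        elif name == "envoy_cluster_upstream_rq_time_count":
            counts[cluster] = value
    # Skip clusters with no requests to avoid dividing by zero.
    return {c: sums[c] / counts[c] for c in sums if counts.get(c)}


if __name__ == "__main__":
    for cluster, mean_ms in mean_upstream_latency(SAMPLE).items():
        print(f"{cluster}: {mean_ms:.2f} ms mean upstream request time")
```

In a real setup the text would come from an HTTP GET against the admin interface rather than a string constant, and Prometheus itself would do the collection; the sketch only shows the shape of the data being exposed.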

So we need to:

  • Allow exposing the admin interface from the container
  • Add proper annotations to the deployment so that Prometheus, which discovers targets via the k8s API, scrapes the pods
  • Add a specialized job to Prometheus to actually collect the data

Event Timeline

Change 549825 had a related patch set uploaded (by Giuseppe Lavagetto; owner: Giuseppe Lavagetto):
[operations/docker-images/production-images@master] envoy-tls-local-proxy: require configuration of the admin endpoint

https://gerrit.wikimedia.org/r/549825

Change 549837 had a related patch set uploaded (by Giuseppe Lavagetto; owner: Giuseppe Lavagetto):
[operations/deployment-charts@master] blubberoid: add telemetry collection support for envoy

https://gerrit.wikimedia.org/r/549837

Change 549872 had a related patch set uploaded (by Giuseppe Lavagetto; owner: Giuseppe Lavagetto):
[operations/puppet@production] kubernetes::deployment_server: Add a private/general.yaml file

https://gerrit.wikimedia.org/r/549872

Change 549871 had a related patch set uploaded (by Giuseppe Lavagetto; owner: Giuseppe Lavagetto):
[operations/puppet@production] prometheus: add scraping of k8s envoy sidecars

https://gerrit.wikimedia.org/r/549871

Change 549825 merged by Giuseppe Lavagetto:
[operations/docker-images/production-images@master] envoy-tls-local-proxy: require configuration of the admin endpoint

https://gerrit.wikimedia.org/r/549825

Change 549871 merged by Giuseppe Lavagetto:
[operations/puppet@production] prometheus: add scraping of k8s envoy sidecars

https://gerrit.wikimedia.org/r/549871

Change 549872 merged by Giuseppe Lavagetto:
[operations/puppet@production] kubernetes::deployment_server: Add a private/general.yaml file

https://gerrit.wikimedia.org/r/549872

Change 549837 merged by Giuseppe Lavagetto:
[operations/deployment-charts@master] blubberoid: add telemetry collection support for envoy

https://gerrit.wikimedia.org/r/549837

Joe triaged this task as Medium priority.