Metrics and dashboards for API Gateway
Closed, Resolved (Public)

Description
Provide metrics and dashboards to monitor and assess API Gateway

Done Criteria

  • Sufficient metrics are gathered from Envoy instances via Prometheus
  • Dashboards are built and available to provide insight into metrics from Envoy API Gateway
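
As a rough illustration of the first criterion: Envoy's admin interface can serve its statistics in Prometheus text format at `/stats/prometheus`. The sketch below parses a small sample of that format and sums one metric per cluster, which is roughly what a dashboard panel would aggregate. The sample text, the cluster names, the values, and the helper `sum_by_label` are all made up for illustration and are not taken from the api-gateway chart. Lines with a trailing timestamp are not handled.

```python
import re
from collections import defaultdict

# Hypothetical sample of Prometheus text exposition output.
# Cluster names and values are invented for this sketch.
SAMPLE = """\
# TYPE envoy_cluster_upstream_rq_xx counter
envoy_cluster_upstream_rq_xx{envoy_response_code_class="2",envoy_cluster_name="example_a"} 120
envoy_cluster_upstream_rq_xx{envoy_response_code_class="5",envoy_cluster_name="example_a"} 5
envoy_cluster_upstream_rq_xx{envoy_response_code_class="2",envoy_cluster_name="example_b"} 30
# TYPE envoy_server_uptime gauge
envoy_server_uptime 3600
"""

# metric_name{label="value",...} value   (the label block is optional)
LINE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)$'
)
LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def sum_by_label(text, metric, label):
    """Sum all samples of `metric`, grouped by the value of `label`."""
    totals = defaultdict(float)
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        match = LINE_RE.match(line)
        if match is None or match.group("name") != metric:
            continue
        labels = dict(LABEL_RE.findall(match.group("labels") or ""))
        totals[labels.get(label, "")] += float(match.group("value"))
    return dict(totals)


totals = sum_by_label(SAMPLE, "envoy_cluster_upstream_rq_xx", "envoy_cluster_name")
for cluster, count in sorted(totals.items()):
    print(cluster, count)
```

In practice Prometheus does the scraping and aggregation itself; the point here is only to show the shape of the data that the dashboards are built on.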

Event Timeline

Is this for product-level metrics, like how many hits to the service and by whom? Or administrative metrics, like resources used and uptime?

Administrative, so that we can ascertain service health and performance

Change 623012 had a related patch set uploaded (by Hnowlan; owner: Hnowlan):
[operations/deployment-charts@master] api-gateway: Collect metrics

https://gerrit.wikimedia.org/r/623012

Change 623012 merged by jenkins-bot:
[operations/deployment-charts@master] api-gateway: Collect metrics

https://gerrit.wikimedia.org/r/623012

Change 623399 had a related patch set uploaded (by Hnowlan; owner: Hnowlan):
[operations/deployment-charts@master] api-gateway: expose port for admin interface

https://gerrit.wikimedia.org/r/623399

Change 623399 merged by jenkins-bot:
[operations/deployment-charts@master] api-gateway: expose port for admin interface

https://gerrit.wikimedia.org/r/623399

Change 623568 had a related patch set uploaded (by Hnowlan; owner: Hnowlan):
[operations/deployment-charts@master] api-gateway: Use envoyproxy.io annotation for metrics gathering.

https://gerrit.wikimedia.org/r/623568

Change 623568 merged by jenkins-bot:
[operations/deployment-charts@master] api-gateway: Use envoyproxy.io annotation for metrics gathering.

https://gerrit.wikimedia.org/r/623568

Change 623624 had a related patch set uploaded (by Hnowlan; owner: Hnowlan):
[operations/deployment-charts@master] api-gateway: Add mappings for ratelimit service

https://gerrit.wikimedia.org/r/623624

Change 623624 merged by jenkins-bot:
[operations/deployment-charts@master] api-gateway: Add mappings for ratelimit service

https://gerrit.wikimedia.org/r/623624

Change 624006 had a related patch set uploaded (by Hnowlan; owner: Hnowlan):
[operations/deployment-charts@master] api-gateway: Fix syntax in metrics gathering

https://gerrit.wikimedia.org/r/624006

Change 624006 merged by jenkins-bot:
[operations/deployment-charts@master] api-gateway: Fix syntax in metrics gathering

https://gerrit.wikimedia.org/r/624006

The dashboard is looking okay; this is 90% done. Moving to blocked until T235277 is unblocked, since we can't graph non-anonymous keys until we have actual usage metrics.