Restore missing metrics for staging rdf-streaming-updater
Closed, Resolved, Public

Description

Per a pairing session with @dcausse, we noticed that after upgrading staging rdf-streaming-updater to 0.3.124 at ~14:30 UTC today, Prometheus metrics beginning with flink_jobmanager_job no longer appear in Grafana. This blocks deploying to production.

Creating this ticket to:

  • Investigate and restore missing metrics for staging rdf-streaming-updater

Event Timeline

Change 920781 had a related patch set uploaded (by DCausse; author: DCausse):

[operations/deployment-charts@master] flink-session-cluster: fix prom reporter config

https://gerrit.wikimedia.org/r/920781

Change 920781 merged by jenkins-bot:

[operations/deployment-charts@master] flink-session-cluster: fix prom reporter config

https://gerrit.wikimedia.org/r/920781

Change 922133 had a related patch set uploaded (by Bking; author: Bking):

[operations/deployment-charts@master] flink-session-cluster: Increment chart version

https://gerrit.wikimedia.org/r/922133

Change 922133 merged by jenkins-bot:

[operations/deployment-charts@master] flink-session-cluster: Increment chart version

https://gerrit.wikimedia.org/r/922133

Based on the Grafana dashboard for staging, these changes appear to have restored the metrics.

We will revisit this change tomorrow to confirm, and then roll out to production if nothing seems amiss.
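
For the follow-up confirmation, one option is to ask Prometheus directly which metric names it currently knows, rather than relying on the dashboard alone. The sketch below uses the standard Prometheus HTTP API endpoint `/api/v1/label/__name__/values` and filters the result for the `flink_jobmanager_job` prefix. The base URL is a placeholder, not the real staging endpoint, and the function names are made up for illustration.

```python
import json
import urllib.request

# Placeholder: replace with the Prometheus instance that scrapes staging.
PROMETHEUS_URL = "http://localhost:9090"
# Prefix of the metrics that went missing after the upgrade.
PREFIX = "flink_jobmanager_job"


def matching_metric_names(payload, prefix=PREFIX):
    """Return the sorted metric names in a label-values response that start with prefix."""
    if payload.get("status") != "success":
        raise ValueError(f"unexpected Prometheus response: {payload!r}")
    return sorted(name for name in payload.get("data", []) if name.startswith(prefix))


def fetch_metric_names(base_url=PROMETHEUS_URL, timeout=10):
    """Fetch all known metric names via the Prometheus label-values API."""
    url = f"{base_url}/api/v1/label/__name__/values"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)
```

Calling `matching_metric_names(fetch_metric_names())` returns an empty list if no metric with that prefix is known to Prometheus, which would indicate the problem is still present (or that the wrong Prometheus instance is being queried).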