EventGate wikimedia implementation should emit rdkafka stats
Closed, Resolved · Public · 5 Story Points

Event Timeline

Ottomata created this task. · Mar 14 2019, 2:58 PM
Restricted Application added a subscriber: Aklapper. · Mar 14 2019, 2:58 PM

Change 496477 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/deployment-charts@master] eventgaate-analytics - Enable rdkafka statsd metrics

https://gerrit.wikimedia.org/r/496477

Change 496477 merged by Ottomata:
[operations/deployment-charts@master] eventgaate-analytics - Enable rdkafka statsd metrics

https://gerrit.wikimedia.org/r/496477

Change 496554 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/deployment-charts@master] eventgate-analytics - Add statsd prometheus mappings for node-rdkafka-statsd metrics

https://gerrit.wikimedia.org/r/496554

Milimetric triaged this task as High priority.
Ottomata moved this task from Backlog to In Progress on the EventBus board. · Mar 18 2019, 8:46 PM

Change 496554 merged by Ottomata:
[operations/deployment-charts@master] eventgate-analytics - Add statsd prometheus mappings for node-rdkafka-statsd metrics

https://gerrit.wikimedia.org/r/496554
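
For readers unfamiliar with statsd-to-Prometheus mappings, here is a minimal sketch of the general idea: a glob pattern matches a dotted statsd metric name, and each `*` becomes a capture that can be reused as a Prometheus label. This is not the content of the change above. The metric name, mapping, and label names are hypothetical, and the matching is a simplified imitation of glob-style rules, written in Python purely for illustration.

```python
import re


def compile_glob(match):
    """Turn a glob like 'a.*.b' into a regex; each '*' captures one dot-free component."""
    parts = [re.escape(p) for p in match.split("*")]
    return re.compile("^" + "([^.]+)".join(parts) + "$")


def apply_mapping(metric, mappings):
    """Return (prometheus_name, labels) for the first matching rule, or None."""
    for rule in mappings:
        hit = compile_glob(rule["match"]).match(metric)
        if hit is None:
            continue

        def expand(template):
            # Replace $1, $2, ... with the corresponding captured component.
            return re.sub(r"\$(\d+)", lambda g: hit.group(int(g.group(1))), template)

        labels = {key: expand(value) for key, value in rule.get("labels", {}).items()}
        return expand(rule["name"]), labels
    return None


# Hypothetical rule: names are illustrative, not taken from the actual chart.
MAPPINGS = [
    {
        "match": "myservice.rdkafka.*.topic.*.txmsgs",
        "name": "rdkafka_topic_txmsgs",
        "labels": {"producer_type": "$1", "topic": "$2"},
    }
]
```

With the hypothetical rule above, `apply_mapping("myservice.rdkafka.guaranteed.topic.test_event.txmsgs", MAPPINGS)` yields the name `rdkafka_topic_txmsgs` with labels for the producer type and topic, which is what lets a dashboard filter by variables such as topic.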

Change 497612 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/deployment-charts@master] eventgate-analytics - adjustments to statsd exporter matches

https://gerrit.wikimedia.org/r/497612

Change 497612 merged by Ottomata:
[operations/deployment-charts@master] eventgate-analytics - adjustments to statsd exporter matches

https://gerrit.wikimedia.org/r/497612

Change 497645 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/deployment-charts@master] eventgate-analytics Fix misplaced '_histogram' metric suffix

https://gerrit.wikimedia.org/r/497645

Change 497645 merged by Ottomata:
[operations/deployment-charts@master] eventgate-analytics Fix misplaced '_histogram' metric suffix

https://gerrit.wikimedia.org/r/497645

Change 497763 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/deployment-charts@master] eventgate-analytics - remove confusing '_histogram' suffix from summary quantiles

https://gerrit.wikimedia.org/r/497763

Change 497763 merged by Ottomata:
[operations/deployment-charts@master] eventgate-analytics - remove confusing '_histogram' suffix from summary quantiles

https://gerrit.wikimedia.org/r/497763

@akosiaris, @fgiunchedi when you get a chance I'd appreciate a lookover of this dashboard:

https://grafana.wikimedia.org/d/ePFPOkqiz/eventgate-analytics-otto0?refresh=1m&orgId=1&from=1553097349177&to=1553100949177&var-dc=eqiad%20prometheus%2Fk8s-staging&var-service=eventgate-analytics&var-kafka_producer_type=All&var-kafka_broker=All&var-kafka_topic=All

I went ahead and put the Kafka graphs I added into the existing rows (sorry Filippo, if you really don't like it I can move them back to a Kafka-specific row!). The ones I added just fit nicely in each of the golden signal categories. Now each row has appropriate 'signal' graphs for both HTTP and Kafka.

> @akosiaris, @fgiunchedi when you get a chance I'd appreciate a lookover of this dashboard:
>
> https://grafana.wikimedia.org/d/ePFPOkqiz/eventgate-analytics-otto0?refresh=1m&orgId=1&from=1553097349177&to=1553100949177&var-dc=eqiad%20prometheus%2Fk8s-staging&var-service=eventgate-analytics&var-kafka_producer_type=All&var-kafka_broker=All&var-kafka_topic=All
>
> I went ahead and put the Kafka graphs I added into the existing rows (sorry Filippo if you really don't like I can move back to a Kafka specific one!).

Heh, actually the template structure was mine ;-).

> The ones I added just fit nicely in each of the golden signal categories. Now each row has appropriate 'signal' graphs for both HTTP and Kafka.

I think it's fine. +1

Note that one interesting thing came up recently: the number of graphs that should be open by default in each dashboard. I am not sure what a sensible number is; I see numbers like 5 or 7 on the internetz.

The issue is not just visual clutter, btw; it also causes a lot of requests to the backend Prometheus service.

I am still not sure how to solve this and I am open to suggestions (possibly tracked in a different task), which is why I brought it up. I do have ideas, like closing all rows by default and adding an "Overview" row.

I like that idea! I'll make an overview.
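
The "close all rows by default, keep an Overview row open" idea could be sketched roughly as follows. This assumes a dashboard JSON layout in which rows are panels of type `row` with a `collapsed` flag and a nested `panels` list (as in Grafana's row-as-panel model). The function name and the `keep_open` parameter are hypothetical, and panel grid positions are left untouched, so treat it as an illustration rather than a ready-made tool.

```python
import copy


def collapse_rows(dashboard, keep_open=("Overview",)):
    """Return a copy of the dashboard with every row collapsed except those in keep_open."""
    result = copy.deepcopy(dashboard)
    new_panels = []
    current_row = None  # collapsed row that absorbs the panels following it

    for panel in result.get("panels", []):
        if panel.get("type") == "row":
            if panel.get("title") in keep_open:
                # Expanded row: its panels live at the top level, right after it.
                nested = panel.pop("panels", [])
                panel["collapsed"] = False
                panel["panels"] = []
                new_panels.append(panel)
                new_panels.extend(nested)
                current_row = None
            else:
                # Collapsed row: its panels are nested inside the row itself.
                panel["collapsed"] = True
                panel.setdefault("panels", [])
                new_panels.append(panel)
                current_row = panel
        elif current_row is not None:
            current_row["panels"].append(panel)
        else:
            new_panels.append(panel)

    result["panels"] = new_panels
    return result
```

Because panels inside a collapsed row are not rendered until the row is opened, a dashboard shaped like this should only query the backend for the Overview graphs on initial load.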

Ottomata set the point value for this task to 5.
Ottomata moved this task from Next Up to Done on the Analytics-Kanban board.
Ottomata moved this task from In Progress to Done on the EventBus board. · Mar 25 2019, 6:35 PM
Nuria closed this task as Resolved. · Mar 29 2019, 2:58 PM