
Nodepool should send metrics to statsd
Closed, Resolved, Public

Description

We forgot to set up the statsd environment variable for Nodepool, so it does not send its metrics to Graphite.
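
For illustration only, here is a minimal sketch of how a daemon could pick up its statsd target from the environment and format a counter packet. The variable names `STATSD_HOST` and `STATSD_PORT` and the default port 8125 are assumptions (this task does not name the variable), and the helper names are hypothetical, not Nodepool's own code.

```python
import os
import socket


def statsd_target(env=os.environ):
    """Return (host, port) if statsd is configured, else None.

    STATSD_HOST / STATSD_PORT are assumed names, not taken from this task.
    """
    host = env.get("STATSD_HOST")
    if not host:
        # No host configured: metrics reporting stays disabled.
        return None
    return host, int(env.get("STATSD_PORT", "8125"))


def format_counter(name, value=1):
    """Build a statsd counter packet in the plain-text wire format."""
    return f"{name}:{value}|c".encode("ascii")


def send_counter(name, target, value=1):
    """Send one counter increment over UDP (fire and forget)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(format_counter(name, value), target)
```

With no host set, `statsd_target` returns `None` and nothing is sent, which matches the situation described above: without the environment variable, no metrics reach Graphite.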

Details

Related Gerrit Patches:
operations/puppet : production
nodepool: send metrics to statsd

Event Timeline

hashar raised the priority of this task from to Low.
hashar updated the task description.
hashar added a subscriber: hashar.
Restricted Application added a subscriber: Aklapper. Sep 4 2015, 8:21 AM
hashar moved this task from Backlog to In-progress on the Continuous-Integration-Scaling board.
hashar set Security to None.

Change 235989 had a related patch set uploaded (by Hashar):
nodepool: send metrics to statsd

https://gerrit.wikimedia.org/r/235989

Change 235989 abandoned by Hashar:
nodepool: send metrics to statsd

Reason:
Abandoning to clear up others' Gerrit dashboards. Will restore when Nodepool sends fewer metrics.

https://gerrit.wikimedia.org/r/235989

hashar changed the task status from Open to Stalled. Sep 4 2015, 9:20 AM
hashar removed a project: Patch-For-Review.

Per discussion on https://gerrit.wikimedia.org/r/#/c/235989/ Nodepool sends too many metrics, which will eventually overload our Statsd server.

As Filippo suggested, I looked at Nodepool: on each job success it reports 8 metrics. So the more jobs we run, the faster we will exhaust our Graphite server.

An example for a given job 'npm':

nodepool.job.npm.runtime (timing)
nodepool.job.npm.builds (count)
nodepool.job.npm.<branch>.runtime (timing)
nodepool.job.npm.<branch>.builds (count)
nodepool.job.npm.<branch>.<jenkins_label>.runtime (timing)
nodepool.job.npm.<branch>.<jenkins_label>.builds (count)
nodepool.job.npm.<branch>.<jenkins_label>.<cloud name>.runtime (timing)
nodepool.job.npm.<branch>.<jenkins_label>.<cloud name>.builds (count)
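
The fan-out in the list above can be sketched as follows. This is a minimal illustration of the key pattern, not Nodepool's actual code: the function name and the example branch, label and cloud values are hypothetical.

```python
from itertools import accumulate


def job_metric_keys(job, branch, label, provider):
    """Return the (key, type) pairs one successful job would report."""
    # Build the four nested prefixes: job, job.branch,
    # job.branch.label, job.branch.label.provider.
    prefixes = accumulate(
        [f"nodepool.job.{job}", branch, label, provider],
        lambda prefix, part: f"{prefix}.{part}",
    )
    keys = []
    for prefix in prefixes:
        # Every prefix level gets one timing and one counter.
        keys.append((prefix + ".runtime", "timing"))
        keys.append((prefix + ".builds", "count"))
    return keys


# Hypothetical values for the placeholders in the list above.
keys = job_metric_keys("npm", "master", "ci-jessie", "example-cloud")
for name, kind in keys:
    print(f"{name} ({kind})")
```

Four prefix levels with two metrics each give the 8 metrics per job success. Since the job name, branch, label and cloud are all part of the key, every new combination creates new series in Graphite.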

The upstream OpenStack metrics can be further browsed at http://graphite.openstack.org/

Nodepool has many more metrics that are undocumented. I guess I will document them upstream and propose a patch to disable per-job reporting.

hashar removed hashar as the assignee of this task. Oct 7 2015, 8:12 PM

Not currently working on it

hashar closed this task as Resolved. Aug 23 2016, 9:04 AM
hashar assigned this task to chasemp.
hashar added a subscriber: thcipriani.

Nodepool now reports statistics to statsd. This was done via https://gerrit.wikimedia.org/r/#/c/305529/

We will want to patch Nodepool to stop sending per-job metrics: T111504. That can be done later on, though.

Thank you Chase.

And I have created a basic dashboard in Grafana at https://grafana-admin.wikimedia.org/dashboard/db/nodepool