Prometheus would provide a number of benefits for us over Graphite. Most significantly, it offers a more straightforward way to share dashboards across clusters: in the Prometheus data model, the cluster is an attribute (label) of each metric, which can be easily templated in Grafana.
The easiest way to set this up would seem to be jmx_exporter, a JVM agent that spins up its own in-process HTTP server to export metrics. I have (lightly) tested this in deployment-prep, and it seems to work well. An added benefit of this approach is that we could eliminate cassandra-metrics-collector (one fewer application to maintain, and one fewer moving part on each host).
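As a rough illustration of what the agent's HTTP server exposes, the Python sketch below parses a payload in the Prometheus text exposition format. The metric names, labels, and values in `EXAMPLE` are made up for the example and are not the names jmx_exporter emits for Cassandra; the parser is deliberately simplified (it ignores timestamps and does not unescape label values).

```python
import re

# One sample line: metric name, optional {labels}, then a numeric value.
SAMPLE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text):
    """Parse text-format samples into (name, labels, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        match = SAMPLE_RE.match(line)
        if match is None:
            continue
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


# Hypothetical payload; real output would be fetched from the agent's port.
EXAMPLE = """\
# HELP example_read_latency_seconds Illustrative metric, not a real name.
# TYPE example_read_latency_seconds gauge
example_read_latency_seconds{keyspace="ks1",table="t1"} 0.25
example_live_sstables 12
"""

if __name__ == "__main__":
    for name, labels, value in parse_metrics(EXAMPLE):
        print(name, labels, value)
```

Against a live host, the same function could be fed the body of an HTTP GET to whatever port the agent is configured to listen on.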
- Fork https://github.com/prometheus/jmx_exporter to the Wikimedia account, and tag a release
- Upload a build to Archiva
- Get a deployment repository set up
- Puppetize the loading of the agent and the accompanying Ferm rules
- Push to Staging and deployment-prep for further evaluation
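
To make the "loading of the agent" step concrete: jmx_exporter is attached with a `-javaagent` JVM option of the form `-javaagent:<jar>=[host:]<port>:<config>`. The Python sketch below only assembles that option string; the jar path, config path, and port are placeholders chosen for illustration, not decisions made in this task, and where the option ends up (for example, appended to `JVM_OPTS`) is likewise an assumption.

```python
def javaagent_option(jar_path, port, config_path, host=None):
    """Build the JVM option that loads the jmx_exporter agent.

    Format: -javaagent:<jar>=[host:]<port>:<config>
    """
    if not 0 < port < 65536:
        raise ValueError(f"invalid TCP port: {port}")
    listen = f"{host}:{port}" if host else str(port)
    return f"-javaagent:{jar_path}={listen}:{config_path}"


if __name__ == "__main__":
    # All three values below are hypothetical placeholders.
    print(javaagent_option(
        "/usr/share/java/jmx_prometheus_javaagent.jar",
        7800,
        "/etc/cassandra/prometheus_jmx.yaml",
    ))
```

The port chosen here is the one the Ferm rules would need to open to the Prometheus server.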