
Provision prometheus instance for cassandra/services metrics collection
Closed, Resolved · Public

Description

The JMX exporter for Cassandra is now enabled in the dev cluster. We should provision a new Prometheus instance (named services) in codfw and eqiad to collect these metrics.

Event Timeline

Restricted Application removed a project: Patch-For-Review. · Aug 17 2017, 9:48 AM

Change 372357 had a related patch set uploaded (by Filippo Giunchedi; owner: Filippo Giunchedi):
[operations/puppet@production] WIP: new prometheus instance 'services'

https://gerrit.wikimedia.org/r/372357

Change 372357 merged by Filippo Giunchedi:
[operations/puppet@production] prometheus: new instance 'services'

https://gerrit.wikimedia.org/r/372357

The Prometheus instance is up and running. What is still missing is the "targets" generation, i.e. the list of Cassandra instances that are currently running jmx_exporter.
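For illustration only, here is a minimal sketch of what "targets" generation could produce, assuming the instance reads targets through Prometheus file-based service discovery (a list of groups, each with `targets` and `labels`). The host names, the port 7800, the `cluster` label value, and the helper name `build_file_sd_targets` are all hypothetical; this is not the Puppet code from the patches in this task.

```python
import json


def build_file_sd_targets(hosts, port, labels):
    """Return one target group in Prometheus file_sd format.

    hosts:  iterable of host names (hypothetical values in this sketch)
    port:   port the jmx_exporter listens on (assumed, not from the task)
    labels: labels attached to every target in the group
    """
    targets = sorted(f"{host}:{port}" for host in hosts)
    return [{"targets": targets, "labels": dict(labels)}]


# Hypothetical hosts; the real list would come from the Puppet catalog.
hosts = ["cassandra-dev1002.example.org", "cassandra-dev1001.example.org"]
groups = build_file_sd_targets(hosts, 7800, {"cluster": "cassandra-dev"})

# file_sd accepts JSON files, so the result can be written out as-is.
print(json.dumps(groups, indent=2))
```

Sorting the targets keeps the generated file stable between runs, so it only changes when the set of hosts changes.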

I can't seem to get the following to work to extract all hosts that have prometheus::jmx_exporter_instance defined:

root@puppetmaster1001:~# puppet apply -e 'notice(query_resources(false, "Define[\"prometheus::jmx_exporter_instance\"]", false))'
Notice: Scope(Class[main]):
fgiunchedi moved this task from Backlog to Doing on the User-fgiunchedi board. · Aug 21 2017, 1:51 PM

Change 372845 had a related patch set uploaded (by Filippo Giunchedi; owner: Filippo Giunchedi):
[operations/puppet@production] role: collect from restbase test_cluster

https://gerrit.wikimedia.org/r/372845

Change 372845 merged by Filippo Giunchedi:
[operations/puppet@production] role: collect jmx_exporter metrics from restbase test_cluster

https://gerrit.wikimedia.org/r/372845

fgiunchedi closed this task as Resolved. · Sep 7 2017, 9:43 AM

This is done: the services Prometheus instance is up and running and collecting Cassandra metrics.
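One way to double-check that targets are actually being scraped is to look at the `up` metric through the Prometheus HTTP query API (`/api/v1/query?query=up`). The sketch below only parses a response body of that shape and lists the targets reporting 0. The sample data, the instance names, and the helper name `down_targets` are hypothetical, and no network call is made.

```python
import json


def down_targets(response_text):
    """Return the instances whose `up` sample is 0 in an instant-query response."""
    body = json.loads(response_text)
    if body.get("status") != "success":
        raise ValueError("query did not succeed")
    return sorted(
        sample["metric"].get("instance", "<unknown>")
        for sample in body["data"]["result"]
        # Each sample value is [timestamp, "<value as string>"].
        if float(sample["value"][1]) == 0.0
    )


# Hypothetical response body, shaped like the output of /api/v1/query?query=up.
sample_response = json.dumps({
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {"metric": {"__name__": "up", "instance": "host-a.example.org:7800"},
             "value": [1504777380.0, "1"]},
            {"metric": {"__name__": "up", "instance": "host-b.example.org:7800"},
             "value": [1504777380.0, "0"]},
        ],
    },
})

print(down_targets(sample_response))
```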