Add Prometheus exporter to Jenkins instances
Open, High, Public

Description

The CI Jenkins instances have the monitoring plugin (https://plugins.jenkins.io/monitoring) installed. It can expose an endpoint for Prometheus to scrape.

We might use the generic JMX exporter, though the Jenkins plugin might expose a little more information that is specific to Jenkins. The plugin is based on JavaMelody.

Documentation: https://github.com/javamelody/javamelody/wiki/UserGuideAdvanced#exposing-metrics-to-prometheus

Jenkins instances I am aware of:

contint1001: CI master
contint2001: CI spare, stopped
releases1001: for releasing
releases1002: spare

Event Timeline

@hashar on what URL are the metrics available? I tried localhost:8080/monitoring on contint1001 but it yields a 404:

contint1001:~$ curl localhost:8080/monitoring 
<html>
<head>
<meta http-equiv="Content-Type" content="text/html;charset=utf-8"/>
<title>Error 404 Not Found</title>
</head>
<body><h2>HTTP ERROR 404</h2>
<p>Problem accessing /monitoring. Reason:
<pre>    Not Found</pre></p><hr><a href="http://eclipse.org/jetty">Powered by Jetty:// 9.4.z-SNAPSHOT</a><hr/>

</body>
</html>

Re: jmx-exporter, we can deploy that in addition to the Jenkins one, so we get standard, uniform JVM metrics as per the parent task.

The Jenkins instance uses the /ci path prefix, and per https://github.com/javamelody/javamelody/wiki/UserGuideAdvanced#exposing-metrics-to-prometheus the endpoint requires ?format=prometheus. The URLs would be:

A copy of all exposed metrics: P6519

We would need the Administer permission though:

contint1001:~$ curl http://localhost:8080/ci/monitoring?format=prometheus
$ curl 'https://integration.wikimedia.org/ci/monitoring?format=prometheus'
<html><head><meta http-equiv='refresh' content='1;url=/ci/login?from=%2Fci%2Fmonitoring%3Fformat%3Dprometheus'/><script>window.location.replace('/ci/login?from=%2Fci%2Fmonitoring%3Fformat%3Dprometheus');</script></head><body style='background-color:white; color:white;'>


Authentication required
<!--
You are authenticated as: anonymous
Groups that you are in:
  
Permission you need to have (but didn't): hudson.model.Hudson.Administer
-->

</body></html>
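
The check above can be scripted. This is a minimal sketch, assuming the /ci prefix and the ?format=prometheus parameter mentioned in this comment. The function names are hypothetical, and the login-page detection is only a heuristic based on the response shown above, not a documented Jenkins contract.

```python
from urllib.parse import urlencode, urlsplit, urlunsplit


def javamelody_url(base_url, prefix="/ci"):
    """Build the JavaMelody Prometheus URL for a Jenkins served under `prefix`."""
    parts = urlsplit(base_url)
    # Normalise the prefix so "/ci" and "/ci/" give the same path.
    path = prefix.rstrip("/") + "/monitoring"
    query = urlencode({"format": "prometheus"})
    return urlunsplit((parts.scheme, parts.netloc, path, query, ""))


def looks_like_login_redirect(body):
    """Heuristic: the body is the Jenkins login redirect, not metrics."""
    return "/login?from=" in body or "Authentication required" in body
```

Calling `javamelody_url("http://localhost:8080")` gives the local URL used in the transcript above; the body returned by an anonymous request can then be passed to `looks_like_login_redirect` before trying to parse it as metrics.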

From the doc https://github.com/javamelody/javamelody/wiki/UserGuideAdvanced#5-security-with-a-collect-server we would want to tweak some settings:

-Djavamelody.plugin-authentication-disabled=true
-Djavamelody.allowed-addr-pattern=127.0.0.1  # IP of collecting server

Maybe we can just use the jmx-exporter, which most probably yields more or less the same metrics and would be good enough to monitor the Java VM.

As of this morning, both Jenkins masters have the Prometheus plugin installed and enabled. The plugin allows them to be used as Prometheus targets (at {jenkins-url}/prometheus) for collecting all sorts of build, node, and Jenkins master related metrics.

However, the plugin seems to have issues when "Fetch the test results of builds" is checked in the plugin configuration. DO NOT ENABLE THIS CONFIGURATION. @thcipriani and I observed high memory usage and request timeouts when this option was selected; we eventually tried killing the request thread and even then it continued to process for over 15 minutes.

We may have to go without individual metrics for tests and test suites for now, but the plugin as it is currently configured provides a good starting point for Prometheus-based metrics collection.
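
One way to see which metric names dominate a scrape of {jenkins-url}/prometheus is to count sample lines per metric name in the text exposition output. This is a rough sketch under stated assumptions: the function name is hypothetical, and it only handles the plain text format (comment lines starting with `#`, then `name{labels} value` or `name value`). It does not establish why the test-results option was slow.

```python
from collections import Counter


def count_samples_by_metric(exposition_text):
    """Count sample lines per metric name in Prometheus text exposition output."""
    counts = Counter()
    for line in exposition_text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        # The metric name ends at the first "{" (labels) or whitespace (value).
        name = line.split("{", 1)[0].split()[0]
        counts[name] += 1
    return counts
```

Feeding it the saved body of a scrape and calling `.most_common(10)` on the result lists the metric names with the most series.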

Thanks for working on this! Happy to help with more Prometheus questions if needed. Re: individual build metrics, as you noted, exporting all builds as individual metrics is not feasible. Unless build metrics get aggregated inside Jenkins into a single metric or a few metrics, it sounds more like a logging use case than a metrics one (i.e. log build times to logstash).
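
To illustrate what "aggregated into a few metrics" could mean, here is a small sketch that collapses per-build durations into cumulative histogram buckets plus a sum and a count, the shape Prometheus histograms use. The bucket bounds and the function name are hypothetical, and this is not how the Jenkins plugin works internally.

```python
from bisect import bisect_left

# Hypothetical upper bounds for build duration buckets, in seconds.
BUCKETS = (60, 300, 900, 3600)


def aggregate_durations(durations, buckets=BUCKETS):
    """Collapse per-build durations into cumulative buckets, a sum and a count."""
    cumulative = [0] * (len(buckets) + 1)  # the last slot is the +Inf bucket
    for duration in durations:
        # First bucket whose upper bound is >= duration ("le" semantics).
        first = bisect_left(buckets, duration)
        for i in range(first, len(cumulative)):
            cumulative[i] += 1
    bounds = [*buckets, float("inf")]
    return {
        "buckets": dict(zip(bounds, cumulative)),
        "sum": sum(durations),
        "count": len(durations),
    }
```

However many builds run, the output stays at a fixed number of series per job, which is what keeps it a metrics use case rather than a logging one.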

dduvall changed the task status from Open to Stalled. Oct 2 2019, 5:20 PM

Work on this has stalled. I've uninstalled the Prometheus plugin from Jenkins for now.

Marking this as "declined" to remove the task from view. We can always revive or reference this task should we pick the work back up.

hashar added a project: observability.

Reopening, we need at least the JVM metrics to be exported so we can monitor its behavior. Part of T177197.

Aklapper added a subscriber: dduvall.

Removing task assignee due to inactivity, as this open task has been assigned to the same person for more than two years (see the emails sent to the task assignee on Oct27 and Nov23). Please assign this task to yourself again if you still realistically [plan to] work on this task - it would be welcome.
(See https://www.mediawiki.org/wiki/Bug_management/Assignee_cleanup for tips how to best manage your individual work in Phabricator.)