
Create initial scaffolding for Prometheus configuration automation
Closed, Resolved (Public)

Description

My long-term end goal for metricsinfra is to allow any Cloud VPS project administrator to set arbitrary Prometheus scrape targets and alerting rules for their project in a self-service fashion. Defining Prometheus and Alertmanager configuration in metricsinfra hiera doesn't provide all the features needed for that (namely the self-service part), so an alternative solution is required.

My solution for this is prometheus-configurator, which in the long term will talk to some API/database and enable self-service configuration. This task is to get it set up on metricsinfra with both Prometheus and Alertmanager, using a static Puppet-created config file instead of a database. By itself this temporarily increases setup complexity without much gain, but it makes possible other wanted functionality that cannot be achieved with Puppet-generated config.
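To illustrate the idea, here is a minimal sketch of generating both configs from one static input. It is not the actual prometheus-configurator code: the input layout (`projects`, `scrape_targets`, `alert_email`), the host names, and the function names are all assumptions made up for this example. Only the output keys (`scrape_configs`, `static_configs`, `route`, `receivers`, `email_configs`) follow the documented Prometheus and Alertmanager config formats.

```python
import json

# Hypothetical static input, standing in for the Puppet-created config file.
# The real file format used by prometheus-configurator may look different.
STATIC_CONFIG = {
    "projects": {
        "tools": {
            "scrape_targets": ["tools-host-1.example:9100"],
            "alert_email": "tools-admins@example.org",
        },
        "deployment-prep": {
            "scrape_targets": ["beta-host-1.example:9100"],
            "alert_email": "beta-admins@example.org",
        },
    }
}


def build_prometheus_config(static):
    """Build a Prometheus config dict with one scrape job per project."""
    scrape_configs = []
    for name, project in sorted(static["projects"].items()):
        scrape_configs.append({
            "job_name": f"{name}_node",
            "static_configs": [{
                "targets": list(project["scrape_targets"]),
                # Label every target with its project so alerts can be routed.
                "labels": {"project": name},
            }],
        })
    return {"global": {"scrape_interval": "60s"}, "scrape_configs": scrape_configs}


def build_alertmanager_config(static):
    """Build an Alertmanager config dict routing alerts by project label."""
    receivers = [{"name": "default"}]  # catch-all receiver with no notifiers
    routes = []
    for name, project in sorted(static["projects"].items()):
        receiver = f"{name}_email"
        receivers.append({
            "name": receiver,
            "email_configs": [{"to": project["alert_email"]}],
        })
        routes.append({"receiver": receiver, "matchers": [f'project="{name}"']})
    return {"route": {"receiver": "default", "routes": routes}, "receivers": receivers}


if __name__ == "__main__":
    # JSON is used here only to print the result; a real tool would
    # write YAML files for Prometheus and Alertmanager to read.
    print(json.dumps(build_prometheus_config(STATIC_CONFIG), indent=2))
    print(json.dumps(build_alertmanager_config(STATIC_CONFIG), indent=2))
```

Keeping the input as a plain data structure means the static file could later be swapped for an API or database lookup without changing the two builder functions.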

This unfortunately rules out sharing most Puppet code with production (see T266050#6565343), but I think the possibilities opened up by doing it this way are worth the additional complexity.

Event Timeline

taavi triaged this task as High priority. Jul 7 2021, 5:53 PM
taavi created this task.
taavi moved this task from Unsorted to Working on on the User-Majavah board.

Change 704560 had a related patch set uploaded (by Majavah; author: Majavah):

[cloud/metricsinfra/prometheus-configurator@master] add support for alertmanager config generation

https://gerrit.wikimedia.org/r/704560

Change 704560 merged by jenkins-bot:

[cloud/metricsinfra/prometheus-configurator@master] add support for alertmanager config generation

https://gerrit.wikimedia.org/r/704560

prometheus-configurator now manages the config for both Prometheus and Alertmanager. The only thing pending here is getting the Puppet patches finished and reviewed.

Change 710068 had a related patch set uploaded (by Majavah; author: Majavah):

[operations/puppet@production] metricsinfra: Add config management server

https://gerrit.wikimedia.org/r/710068

Change 710068 merged by Bstorm:

[operations/puppet@production] metricsinfra: Add config management server

https://gerrit.wikimedia.org/r/710068

Change 763664 had a related patch set uploaded (by Majavah; author: Majavah):

[operations/puppet@production] metricsinfra: Use prometheus-configurator

https://gerrit.wikimedia.org/r/763664

Change 763664 merged by David Caro:

[operations/puppet@production] metricsinfra: Use prometheus-configurator

https://gerrit.wikimedia.org/r/763664