
Enable self-service Prometheus configuration management for project administrators
Open, Medium, Public

Description

Project administrators should be able to configure Prometheus scrape targets and alert rules for their project without making changes to operations/puppet. In the long term, I can see two ways to achieve this:

  • Enable management via Hiera/Puppet
    • Pro: Nice to deal with in a project that is otherwise managed with Puppet
    • Con (?): Difficult to use - is Hiera easy enough for the target audience?
    • Con: Proper authentication is difficult to set up
    • Con: Unclear how to deal with services that are not bound to a single VM, such as Kubernetes pods
  • Create a web UI/Horizon interface
    • Pro: Ease of use
    • Con: Harder to get something like "Scrape all Toolforge Redis hosts on port X" automated
    • Con: Requires manual clicking for large projects managed with Puppet
    • Con: Either has to deal with developer account authentication in the cloud realm or needs a prod-cloud connection

Bonus points if the solution can automatically make sure the required security group rules are present.

My short-term plan is to create a tool that you can customize with per-project config files ("Scrape all Toolforge Redis hosts on port X", "Alerting rule Y is there") and that generates the full configuration for Prometheus and Alertmanager. It's rather bare-bones, but it's better than the current static configuration and gives us a good foundation for further development, for example adding a database and an API to modify rules.
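As a rough illustration of the short-term idea, here is a minimal sketch of how a per-project config entry could be expanded into Prometheus target groups in the JSON format that file-based service discovery (`file_sd_configs`) reads. Everything specific here is an assumption made for the example and not part of the actual tool: the config field names, the host names, the `build_target_groups` helper, the `service` label, and the port (9121 stands in for "port X").

```python
import fnmatch
import json

# Hypothetical per-project config; the field names are illustrative only.
PROJECT_CONFIG = {
    "project": "tools",
    "scrape_jobs": [
        # "Scrape all Toolforge Redis hosts on port X" (9121 is a placeholder).
        {"name": "redis", "host_pattern": "tools-redis-*", "port": 9121},
    ],
}

# Hypothetical instance inventory; a real tool would presumably look this up
# from the cloud API instead of hard-coding it.
INSTANCES = ["tools-redis-2", "tools-web-1", "tools-redis-1"]


def build_target_groups(config, instances):
    """Expand each scrape job into one file_sd-style target group."""
    groups = []
    for job in config["scrape_jobs"]:
        # Select the instances whose name matches the job's glob pattern.
        hosts = sorted(fnmatch.filter(instances, job["host_pattern"]))
        groups.append(
            {
                "targets": [f"{host}:{job['port']}" for host in hosts],
                "labels": {"project": config["project"], "service": job["name"]},
            }
        )
    return groups


if __name__ == "__main__":
    # The printed JSON could be written to a file referenced by file_sd_configs.
    print(json.dumps(build_target_groups(PROJECT_CONFIG, INSTANCES), indent=2))
```

Generating target files rather than the main configuration would let Prometheus pick up changes without a restart, but that is only one possible design; alert rules and Alertmanager configuration would need their own generators.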

Related Objects

Status    Assigned
Resolved  fgiunchedi
Resolved  colewhite
Resolved  MoritzMuehlenhoff
Stalled   None
Open      None
Open      taavi
Open      taavi
Open      None
Resolved  taavi
Resolved  taavi
Open      None
Open      None
Resolved  taavi
Open      None
Open      None
Open      None
Open      None
Open      None