
Systemd enhancements for mariadb and prometheus-mysql-exporter
Closed, ResolvedPublic

Description

This task captures some improvements to the systemd handling of the mariadb and prometheus-mysql-exporter ("PME") services, with the aim of simplifying their management. In particular, it addresses two problems: currently you have to remember to restart PME after restarting mariadb, and on multi-instance hosts you have to handle each instance separately.

  • Couple PME and mariadb services on single-instance hosts
  • Couple PME and mariadb services on multi-instance hosts
  • systemd targets so that systemctl restart mariadb.target does the right thing on single and multi-instance hosts (optional)

Event Timeline

Change 714358 had a related patch set uploaded (by MVernon; author: MVernon):

[operations/puppet@production] prometheus: couple mysqld exporter service to mariadb service

https://gerrit.wikimedia.org/r/714358

I know it is not totally related to this task, but maybe this can be also looked at as part of this? T257056: Add alert for prometheus-mysql-exporter failing to scrape mysql

LSobanski triaged this task as Medium priority. Mon, Aug 30, 6:55 AM
LSobanski moved this task from Triage to In progress on the DBA board.

Change 715926 had a related patch set uploaded (by MVernon; author: MVernon):

[operations/software@master] dbtools: make mariadb service Wants prometheus-mysqld-exporter

https://gerrit.wikimedia.org/r/715926

Change 714358 merged by MVernon:

[operations/puppet@production] prometheus: couple mysqld exporter service to mariadb service

https://gerrit.wikimedia.org/r/714358

Change 715926 merged by MVernon:

[operations/software@master] dbtools: make mariadb service Wants prometheus-mysqld-exporter

https://gerrit.wikimedia.org/r/715926

Change 716306 had a related patch set uploaded (by MVernon; author: MVernon):

[operations/puppet@production] prometheus: couple mysqld export service to mariadb (multi-instance)

https://gerrit.wikimedia.org/r/716306

Change 716306 merged by MVernon:

[operations/puppet@production] prometheus: couple mysqld export service to mariadb (multi-instance)

https://gerrit.wikimedia.org/r/716306

I know it is not totally related to this task, but maybe this can be also looked at as part of this? T257056: Add alert for prometheus-mysql-exporter failing to scrape mysql

I will have at least a look at that, but I think that I'm not going to make resolving it part of this task, IYSWIM?

I think we concluded that the mariadb.target idea isn't all that useful (I think @Kormat said that folk mostly don't stop and start more than one instance at once).

So I'm inclined to view this task as done now? We have the coupling of prometheus-mysqld-exporter and mariadb done by systemd on both single and multi-instance hosts.
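
On multi-instance hosts the same coupling can be written once against template units. A minimal sketch, assuming the instances are template units named mariadb@<section>.service and prometheus-mysqld-exporter@<section>.service; the unit names, the drop-in paths, the example instance name s1, and the After=/PartOf= lines are all assumptions for illustration, not taken from the merged patches:

```ini
# /etc/systemd/system/mariadb@.service.d/pme.conf  (hypothetical drop-in path)
[Unit]
# %i expands to the instance name, so starting mariadb@s1
# would also pull in prometheus-mysqld-exporter@s1.
Wants=prometheus-mysqld-exporter@%i.service

# /etc/systemd/system/prometheus-mysqld-exporter@.service.d/mariadb.conf  (hypothetical drop-in path)
[Unit]
# Assumption: order each exporter after its own mariadb instance and
# propagate stop/restart of that instance to it.
After=mariadb@%i.service
PartOf=mariadb@%i.service
```

Because each fragment refers only to its own instance via %i, restarting one instance would leave the other instances and their exporters untouched.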

I know it is not totally related to this task, but maybe this can be also looked at as part of this? T257056: Add alert for prometheus-mysql-exporter failing to scrape mysql

I will have at least a look at that, but I think that I'm not going to make resolving it part of this task, IYSWIM?

👍

I know it is not totally related to this task, but maybe this can be also looked at as part of this? T257056: Add alert for prometheus-mysql-exporter failing to scrape mysql

I will have at least a look at that, but I think that I'm not going to make resolving it part of this task, IYSWIM?

Yes, it can be tracked on the other task itself. Not a blocker for closing this one as soon as you think it is appropriate to do so.

Thanks for your work!

(Re-opening, as until the new mariadb packages are built and deployed we're in a slightly worse situation than before the work started)

TL;DR: Until we have new packages built and deployed, prometheus-mysqld-exporter will need to be manually started on hosts after a reboot.

In the original state the exporter would be running, but would fail to connect to mariadb. This has the minor benefit that systemctl will tab-complete to the service correctly.
In the intermediate state we're currently in, systemd won't start PME on its own initiative, and the new Wants=PME in mariadb's .service isn't present yet.
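
For reference, the intended end state can be sketched as two unit fragments. This is a minimal sketch, not the contents of the merged patches: the Wants= line is the one mentioned above, while the drop-in paths and the After=/PartOf= lines on the PME side are assumptions about how the coupling might be expressed.

```ini
# /etc/systemd/system/mariadb.service.d/pme.conf  (hypothetical drop-in path)
[Unit]
# Starting mariadb also pulls in the exporter.
Wants=prometheus-mysqld-exporter.service

# /etc/systemd/system/prometheus-mysqld-exporter.service.d/mariadb.conf  (hypothetical drop-in path)
[Unit]
# Assumption: start the exporter only after mariadb, and propagate
# stop/restart of mariadb to it.
After=mariadb.service
PartOf=mariadb.service
```

With fragments like these, and with PME not started by systemd on its own at boot, starting mariadb would start PME, and stopping or restarting mariadb would do the same to PME.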

Change 721284 had a related patch set uploaded (by Marostegui; author: Marostegui):

[operations/software@master] control-mariadb-*: Bump version

https://gerrit.wikimedia.org/r/721284

Change 721284 merged by jenkins-bot:

[operations/software@master] control-mariadb-*: Bump version

https://gerrit.wikimedia.org/r/721284

The new package 10.4.21-2 includes this patch.
It has been installed on db1125 (test host), so feel free to play with it if needed. I have done several stop/start and all works fine.

I've also checked the stop/start/restart behaviour, which is as expected.
I also confirmed that on reboot PME isn't started, but when you start mariadb, PME then gets started for you.

We can test tomorrow on a multi-instance host in codfw if you like (today we are still not touching production, to let the dust settle from yesterday's switchover).

Marking this as resolved - as we deploy 10.4.21-2 everywhere, the fix will get rolled out.