Upgrade alert* hosts to Bookworm
Open, In Progress, High, Public

Description

Implementation steps:

  • Set up a Bookworm alerting_host in Pontoon
  • Check that Puppet runs as expected (e.g. no packages are missing)
  • Check that daemons can start and that their configurations are valid
  • Reimage the standby host in production with Bookworm and validate that things run as expected (alertmanager, icinga, etc.). We might need to silence meta-monitoring.
  • Switch over to the standby host, reimage the active host and flip back
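
The production part of the steps above can be sketched as an ordered plan. This is only an illustration of the sequencing: the function name, the step wording and the host names `alert-a` and `alert-b` are hypothetical, and nothing here runs against real hosts.

```python
def upgrade_plan(active, standby, target_os="bookworm"):
    """Return the ordered steps for upgrading an active/standby pair.

    The standby host is reimaged and validated first so that the
    active host keeps serving alerts until the switchover.
    """
    return [
        f"reimage {standby} with {target_os}",
        f"validate services on {standby} (alertmanager, icinga)",
        f"switch over: {standby} becomes active",
        f"reimage {active} with {target_os}",
        f"flip back: {active} becomes active again",
    ]


if __name__ == "__main__":
    # Hypothetical host names, used only for illustration.
    for number, step in enumerate(upgrade_plan("alert-a", "alert-b"), start=1):
        print(f"{number}. {step}")
```

Reimaging the standby first means a broken Bookworm image is found before the active host is touched.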

List of missing packages:

| Package | Installed version | Upstream version | Works on Bookworm? |
| --- | --- | --- | --- |
| alertmanager-webhook-logger | v0.3 | v1.0 | Yes |
| icinga | | | Not available for Bullseye nor Bookworm, backport is doable |
| karma | v0.114 | v0.116 | Yes |
| kthxbye | v0.8 | v0.16 | Yes |
| phalerts | 60942d8 | e2a0b3a (+1 commit) | Yes |
| prometheus-icinga-exporter | v0.20 | v0.20 | Yes |
| python-irc | v8.5.3 | v20.3.0 | ~Yes (Python3 version available) |
| python-phabricator | v0.7.0 | v0.8.1 | Yes |
| python-pyinotify | v0.9.6 | v0.9.6 | ~Yes (Python3 version available) |
| python3-service-checker | v0.2.1 | v0.2.1 | Yes |
| statograph | v0.1.2 | v0.1.2 | Yes |
| vopsbot | v0.3.6 | v0.3.6 | Yes |
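
To see at a glance which packages lag behind upstream, the version columns from the table above can be compared numerically. This is a minimal sketch: the helper names `version_key` and `behind_upstream` are made up for illustration, icinga is left out because it has no version entries, and phalerts is left out because it is tracked by git commit rather than by version number.

```python
import re

# (package, installed version, upstream version), copied from the table above.
PACKAGES = [
    ("alertmanager-webhook-logger", "v0.3", "v1.0"),
    ("karma", "v0.114", "v0.116"),
    ("kthxbye", "v0.8", "v0.16"),
    ("prometheus-icinga-exporter", "v0.20", "v0.20"),
    ("python-irc", "v8.5.3", "v20.3.0"),
    ("python-phabricator", "v0.7.0", "v0.8.1"),
    ("python-pyinotify", "v0.9.6", "v0.9.6"),
    ("python3-service-checker", "v0.2.1", "v0.2.1"),
    ("statograph", "v0.1.2", "v0.1.2"),
    ("vopsbot", "v0.3.6", "v0.3.6"),
]


def version_key(version):
    """Turn 'v0.16' into (0, 16) so components compare as integers."""
    return tuple(int(part) for part in re.findall(r"\d+", version))


def behind_upstream(packages):
    """Return the names of packages whose installed version is older than upstream."""
    return [
        name
        for name, installed, upstream in packages
        if version_key(installed) < version_key(upstream)
    ]


if __name__ == "__main__":
    for name in behind_upstream(PACKAGES):
        print(name)
```

Comparing integer tuples rather than raw strings matters for entries such as kthxbye, where a plain string comparison would rank v0.8 above v0.16.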

Event Timeline

Change 934245 had a related patch set uploaded (by Andrea Denisse; author: Andrea Denisse):

[operations/puppet@production] alert: Add the alert (icinga + alertmanager) hosts Bookworm node definitions

https://gerrit.wikimedia.org/r/934245

Follow-up from IRC: this upgrade will bring us to Alertmanager 0.25, which in turn should support newer versions of karma (https://github.com/prymitive/karma), so we can deploy an upgraded version that includes a patch from @dcaro.

andrea.denisse changed the task status from Open to In Progress. (Jul 13 2023, 4:28 PM)
andrea.denisse triaged this task as High priority.

Change 944289 had a related patch set uploaded (by Andrea Denisse; author: Andrea Denisse):

[operations/puppet@production] pontoon: Apply the 'alerting_host' role to the pontoon-alerting-host-01 host

https://gerrit.wikimedia.org/r/944289

Change 944289 merged by Andrea Denisse:

[operations/puppet@production] pontoon: Apply the 'alerting_host' role to the pontoon-alerting-host-01 host

https://gerrit.wikimedia.org/r/944289

Change 934245 abandoned by Andrea Denisse:

[operations/puppet@production] alert: Add the alert (icinga + alertmanager) hosts Bookworm node definitions

Reason:

https://gerrit.wikimedia.org/r/934245

andrea.denisse updated the task description.
andrea.denisse updated the task description.

Change 972925 had a related patch set uploaded (by Andrea Denisse; author: Andrea Denisse):

[operations/puppet@production] icinga: Remove unnecessary python-phabricator Python2 dependency

https://gerrit.wikimedia.org/r/972925

Change 972929 had a related patch set uploaded (by Andrea Denisse; author: Andrea Denisse):

[operations/puppet@production] ircecho: Migrate IRC Echo from Python 2 to Python 3

https://gerrit.wikimedia.org/r/972929

Change 972925 merged by Andrea Denisse:

[operations/puppet@production] icinga: Remove unnecessary python-phabricator Python2 dependency

https://gerrit.wikimedia.org/r/972925

When you reimage to Bookworm, please reimage the hosts directly into the Puppet 7 environment by passing the -p7 parameter to the reimage cookbook. The alert* hosts are currently on Buster, which can't use Puppet 7; that is why they haven't been migrated to Puppet 7 yet.
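
The rule in the comment above can be written down as a small helper that picks the Puppet flag from the target release. This is a sketch only: `puppet_flags` is a hypothetical name, it encodes nothing beyond the two cases stated in this task (Bookworm takes -p7, Buster cannot use Puppet 7), and it does not build or run the actual reimage cookbook command.

```python
def puppet_flags(target_os):
    """Return the Puppet-related reimage flags for a target Debian release.

    Only the two releases discussed in this task are covered; any other
    release raises an error instead of guessing.
    """
    if target_os == "bookworm":
        # Reimage straight into the Puppet 7 environment.
        return ["-p7"]
    if target_os == "buster":
        # Buster cannot use Puppet 7, so no flag is added.
        return []
    raise ValueError(f"no guidance in this task for release {target_os!r}")


if __name__ == "__main__":
    print(puppet_flags("bookworm"))
```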