
Setup an initial bookworm host pair with Puppetdb 7
Closed, Resolved, Public

Description

We want to move to puppetdb 7 and initially run it on Bookworm. Bookworm isn't frozen yet, so it will still see quite a few changes, but backporting puppetdb 7 to the Clojure stack in Bullseye would also be a significant undertaking (and this way we can also help get puppetdb into shape in Debian).

The Debian installer isn't yet adapted to install Bookworm in our infra, so initially a bullseye node will be dist-upgraded (and minimal puppet changes deployed to support the core base classes).
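As a rough illustration of the first step of such a dist-upgrade, the sketch below rewrites the suite names in APT source lines from bullseye to bookworm. The helper name `retarget_sources` and the sample repository lines are assumptions made for this example, not taken from the task or from the Puppet repository; the actual upgrade on the hosts may well have been done differently.

```python
import re


def retarget_sources(text: str, old: str = "bullseye", new: str = "bookworm") -> str:
    """Return APT sources text with the suite name switched from old to new.

    Hypothetical helper for illustration only. Only lines that start with
    "deb" or "deb-src" are rewritten; comments and blank lines are kept as-is.
    """
    # \b keeps derived suites intact, e.g. bullseye-security -> bookworm-security.
    pattern = re.compile(rf"\b{re.escape(old)}\b")
    rewritten = []
    for line in text.splitlines():
        stripped = line.lstrip()
        if stripped.startswith("deb ") or stripped.startswith("deb-src "):
            line = pattern.sub(new, line)
        rewritten.append(line)
    return "\n".join(rewritten)


if __name__ == "__main__":
    # Sample input (illustrative repository lines, not the production config).
    sample = (
        "# bullseye mirror\n"
        "deb http://deb.debian.org/debian bullseye main\n"
        "deb http://security.debian.org/debian-security bullseye-security main"
    )
    print(retarget_sources(sample))
```

With the sources retargeted, the upgrade itself would typically continue with `apt-get update` followed by `apt-get dist-upgrade`.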

Details

Related Changes in Gerrit:
Repo               Branch      Lines +/-
operations/puppet  production  +1 -1
operations/puppet  production  +1 -0
operations/puppet  production  +1 -1
operations/puppet  production  +8 -4
operations/puppet  production  +3 -2
operations/puppet  production  +1 -6
operations/puppet  production  +6 -0
operations/puppet  production  +0 -1
operations/puppet  production  +5 -1
operations/puppet  production  +2 -4
operations/puppet  production  +3 -1
operations/puppet  production  +1 -1
operations/puppet  production  +1 -1
operations/puppet  production  +2 -2
operations/puppet  production  +27 -16
operations/puppet  production  +38 -19
operations/puppet  production  +3 -8
operations/puppet  production  +18 -19
operations/puppet  production  +23 -16
operations/puppet  production  +4 K -0
operations/puppet  production  +4 -0
operations/puppet  production  +61 -0
operations/puppet  production  +0 -11
operations/puppet  production  +2 -2
operations/puppet  production  +6 -6
operations/puppet  production  +49 -15
operations/puppet  production  +9 -0
operations/puppet  production  +22 -13
operations/puppet  production  +0 -1
operations/puppet  production  +6 -1
operations/puppet  production  +1 -0
operations/puppet  production  +3 -0
operations/puppet  production  +79 -1
operations/puppet  production  +2 -0
operations/puppet  production  +1 -0
operations/puppet  production  +1 -0
operations/puppet  production  +10 -1
operations/puppet  production  +1 -0
operations/puppet  production  +1 -8
operations/puppet  production  +31 -0
operations/puppet  production  +1 -0

Event Timeline


Mentioned in SAL (#wikimedia-operations) [2022-11-29T09:17:13Z] <moritzm> update component/puppetdb7 to puppetdb 7.11.2-3 (fixing Postgres 15 compat) T321783

Mentioned in SAL (#wikimedia-operations) [2022-11-29T10:15:49Z] <moritzm> upgrading puppetdb2003 to bookworm T321783

Mentioned in SAL (#wikimedia-operations) [2022-11-29T11:37:43Z] <moritzm> uploaded ferm 2.5.1-1.1+wmf11u1 to apt.wikimedia.org/bookworm (rebasing our systemd startup fixes to what's in bookworm) T321783

Mentioned in SAL (#wikimedia-operations) [2022-11-30T11:19:53Z] <moritzm> upgrade puppetdb1003 to bookworm T321783

Change 862256 had a related patch set uploaded (by Muehlenhoff; author: Muehlenhoff):

[operations/puppet@production] Add Hiera settings for second bookworm puppetdb pair

https://gerrit.wikimedia.org/r/862256

Change 862260 had a related patch set uploaded (by Muehlenhoff; author: Muehlenhoff):

[operations/puppet@production] postgresql: Add bookworm support

https://gerrit.wikimedia.org/r/862260

Change 862260 merged by Muehlenhoff:

[operations/puppet@production] postgresql: Add bookworm support

https://gerrit.wikimedia.org/r/862260

Change 862256 merged by Muehlenhoff:

[operations/puppet@production] Add Hiera settings for second bookworm puppetdb pair

https://gerrit.wikimedia.org/r/862256

Change 863286 had a related patch set uploaded (by Muehlenhoff; author: Muehlenhoff):

[operations/puppet@production] postgresql::server: Add bookworm support

https://gerrit.wikimedia.org/r/863286

Change 863286 merged by Muehlenhoff:

[operations/puppet@production] postgresql::server: Add bookworm support

https://gerrit.wikimedia.org/r/863286

Change 864664 had a related patch set uploaded (by Muehlenhoff; author: Muehlenhoff):

[operations/puppet@production] uwsgi: Add support for bookworm

https://gerrit.wikimedia.org/r/864664

Change 864664 merged by Muehlenhoff:

[operations/puppet@production] uwsgi: Add support for bookworm

https://gerrit.wikimedia.org/r/864664

Change 868078 had a related patch set uploaded (by Muehlenhoff; author: Muehlenhoff):

[operations/puppet@production] Don't install quickstack on Bookworm, revisit later

https://gerrit.wikimedia.org/r/868078

Change 868078 merged by Muehlenhoff:

[operations/puppet@production] Don't install quickstack on Bookworm, revisit later

https://gerrit.wikimedia.org/r/868078

Mentioned in SAL (#wikimedia-operations) [2022-12-15T15:05:02Z] <moritzm> imported prometheus-jmx-exporter for bookworm-wikimedia T321783

Change 868431 had a related patch set uploaded (by Jbond; author: John Bond):

[operations/puppet@production] nginx: let puppet pick the correct provider

https://gerrit.wikimedia.org/r/868431

Change 868471 had a related patch set uploaded (by Jbond; author: John Bond):

[operations/puppet@production] monitoring: update monitoring files to dynamically discover config

https://gerrit.wikimedia.org/r/868471

Change 868431 merged by Muehlenhoff:

[operations/puppet@production] nginx: let puppet pick the correct provider

https://gerrit.wikimedia.org/r/868431

Change 869199 had a related patch set uploaded (by Muehlenhoff; author: Muehlenhoff):

[operations/puppet@production] httpd: Let Puppet pick the init provider

https://gerrit.wikimedia.org/r/869199

Change 868471 merged by Jbond:

[operations/puppet@production] monitoring: update monitoring files to dynamically discover config

https://gerrit.wikimedia.org/r/868471

Change 869716 had a related patch set uploaded (by Jbond; author: Jbond):

[operations/puppet@production] monitoring: update monitoring files to dynamically discover config

https://gerrit.wikimedia.org/r/869716

Change 874852 had a related patch set uploaded (by Muehlenhoff; author: Muehlenhoff):

[operations/puppet@production] Mask uwsgi on puppetdb hosts

https://gerrit.wikimedia.org/r/874852

Change 874891 had a related patch set uploaded (by Jbond; author: John Bond):

[operations/puppet@production] monitoring: convert prometheus-puppet-agent-stats to pathlib

https://gerrit.wikimedia.org/r/874891

Change 874852 merged by Muehlenhoff:

[operations/puppet@production] Mask uwsgi on puppetdb hosts

https://gerrit.wikimedia.org/r/874852

Change 875272 had a related patch set uploaded (by Muehlenhoff; author: Muehlenhoff):

[operations/puppet@production] Properly name replication records for second puppetdb pair

https://gerrit.wikimedia.org/r/875272

Change 869716 merged by Jbond:

[operations/puppet@production] monitoring: update monitoring files to dynamically discover config

https://gerrit.wikimedia.org/r/869716

Change 875272 merged by Muehlenhoff:

[operations/puppet@production] Properly name replication records for second puppetdb pair

https://gerrit.wikimedia.org/r/875272

Change 875301 had a related patch set uploaded (by Muehlenhoff; author: Muehlenhoff):

[operations/puppet@production] puppetdb/bookworm: One more typo in the config

https://gerrit.wikimedia.org/r/875301

Change 875301 merged by Muehlenhoff:

[operations/puppet@production] puppetdb/bookworm: One more typo in the config

https://gerrit.wikimedia.org/r/875301

Change 878001 had a related patch set uploaded (by Muehlenhoff; author: Muehlenhoff):

[operations/puppet@production] Decom puppetdb-test2001

https://gerrit.wikimedia.org/r/878001

Change 878001 merged by Muehlenhoff:

[operations/puppet@production] Decom puppetdb-test2001

https://gerrit.wikimedia.org/r/878001

cookbooks.sre.hosts.decommission executed by jmm@cumin2002 for hosts: puppetdb-test2001.codfw.wmnet

  • puppetdb-test2001.codfw.wmnet (PASS)
    • Downtimed host on Icinga/Alertmanager
    • Found Ganeti VM
    • VM shutdown
    • Started forced sync of VMs in Ganeti cluster codfw to Netbox
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB
    • VM removed
    • Started forced sync of VMs in Ganeti cluster codfw to Netbox

Change 883184 had a related patch set uploaded (by Jbond; author: John Bond):

[operations/puppet@production] puppetdb: add auth.conf file

https://gerrit.wikimedia.org/r/883184

Change 883184 merged by Jbond:

[operations/puppet@production] puppetdb: add auth.conf file

https://gerrit.wikimedia.org/r/883184

Change 883189 had a related patch set uploaded (by Jbond; author: John Bond):

[operations/puppet@production] puppet: install puppet-module-puppetlabs-augeas-core on puppet >= 6

https://gerrit.wikimedia.org/r/883189

Change 883189 merged by Jbond:

[operations/puppet@production] puppet: install puppet-module-puppetlabs-augeas-core on puppet >= 6

https://gerrit.wikimedia.org/r/883189

Change 883233 had a related patch set uploaded (by Jbond; author: John Bond):

[operations/puppet@production] augeas_core: add augeas core module to the vendor modules

https://gerrit.wikimedia.org/r/883233

Change 883233 merged by Jbond:

[operations/puppet@production] augeas_core: add augeas core module to the vendor modules

https://gerrit.wikimedia.org/r/883233

Mentioned in SAL (#wikimedia-operations) [2023-01-30T14:47:40Z] <moritzm> updating puppetdb 7 hosts to 7.12.1 T321783

Change 874891 merged by Jbond:

[operations/puppet@production] monitoring: convert prometheus-puppet-agent-stats to pathlib

https://gerrit.wikimedia.org/r/874891

Change 886895 had a related patch set uploaded (by Jbond; author: John Bond):

[operations/puppet@production] postgresql::user: add documentation and fix minor lint errors

https://gerrit.wikimedia.org/r/886895

Change 886897 had a related patch set uploaded (by Jbond; author: John Bond):

[operations/puppet@production] postgres::user: add hostname support to postgres user define

https://gerrit.wikimedia.org/r/886897

Change 886895 merged by Jbond:

[operations/puppet@production] postgresql::user: add documentation and fix minor lint errors

https://gerrit.wikimedia.org/r/886895

Change 886900 had a related patch set uploaded (by Jbond; author: John Bond):

[operations/puppet@production] puppetdb: use new allowed_hosts paramater to postgresql:user

https://gerrit.wikimedia.org/r/886900

Change 886912 had a related patch set uploaded (by Jbond; author: John Bond):

[operations/puppet@production] postgrers::user::hba: drop hba_label and use title instead

https://gerrit.wikimedia.org/r/886912

Change 886912 merged by Jbond:

[operations/puppet@production] postgrers::user::hba: drop hba_label and use title instead

https://gerrit.wikimedia.org/r/886912

Change 886897 merged by Jbond:

[operations/puppet@production] postgres::user: add hostname support to postgres user define

https://gerrit.wikimedia.org/r/886897

Change 886900 merged by Jbond:

[operations/puppet@production] puppetdb: use new allowed_hosts paramater to postgresql:user

https://gerrit.wikimedia.org/r/886900

Icinga downtime and Alertmanager silence (ID=b1f3dbef-467c-49de-8608-5ba564efbe81) set by jmm@cumin2002 for 1 day, 0:00:00 on 1 host(s) and their services with reason: master is being reimaged

puppetdb2003.codfw.wmnet

Change 887971 had a related patch set uploaded (by Muehlenhoff; author: Muehlenhoff):

[operations/puppet@production] Reset puppetdb1003/2003 to insetup

https://gerrit.wikimedia.org/r/887971

Change 887971 merged by Muehlenhoff:

[operations/puppet@production] Reset puppetdb1003/2003 to insetup

https://gerrit.wikimedia.org/r/887971

Cookbook cookbooks.sre.hosts.reimage was started by jmm@cumin2002 for host puppetdb1003.eqiad.wmnet with OS bullseye

Cookbook cookbooks.sre.hosts.reimage started by jmm@cumin2002 for host puppetdb1003.eqiad.wmnet with OS bullseye completed:

  • puppetdb1003 (PASS)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present
    • Deleted any existing Puppet certificate
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Host up (new fresh bullseye OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • Removed previous downtime on Alertmanager (old OS)
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202302091140_jmm_991175_puppetdb1003.out
    • Checked BIOS boot parameters are back to normal
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB

Cookbook cookbooks.sre.hosts.reimage was started by jmm@cumin2002 for host puppetdb2003.codfw.wmnet with OS bullseye

Cookbook cookbooks.sre.hosts.reimage started by jmm@cumin2002 for host puppetdb2003.codfw.wmnet with OS bullseye completed:

  • puppetdb2003 (PASS)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present
    • Deleted any existing Puppet certificate
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Host up (new fresh bullseye OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • Removed previous downtime on Alertmanager (old OS)
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202302091212_jmm_998112_puppetdb2003.out
    • Checked BIOS boot parameters are back to normal
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB

Mentioned in SAL (#wikimedia-operations) [2023-02-10T11:05:20Z] <moritzm> upgrade puppetdb[12]003 to bookworm T321783

Change 888218 had a related patch set uploaded (by Muehlenhoff; author: Muehlenhoff):

[operations/puppet@production] Reapply puppetdb role

https://gerrit.wikimedia.org/r/888218

Change 888218 merged by Muehlenhoff:

[operations/puppet@production] Reapply puppetdb role

https://gerrit.wikimedia.org/r/888218

Change 889817 had a related patch set uploaded (by Muehlenhoff; author: Muehlenhoff):

[operations/puppet@production] Enable command_broadcast to the new puppetdb 7 hosts

https://gerrit.wikimedia.org/r/889817

Change 890439 had a related patch set uploaded (by Muehlenhoff; author: Muehlenhoff):

[operations/puppet@production] Switch role::puppetdb to Nginx custom flavour

https://gerrit.wikimedia.org/r/890439

Change 890439 merged by Muehlenhoff:

[operations/puppet@production] Switch role::puppetdb to Nginx custom flavour

https://gerrit.wikimedia.org/r/890439

Change 891487 had a related patch set uploaded (by Muehlenhoff; author: Muehlenhoff):

[operations/puppet@production] Only set profile::nginx::variant to custom for the new bookworm nodes

https://gerrit.wikimedia.org/r/891487

Change 891487 merged by Muehlenhoff:

[operations/puppet@production] Only set profile::nginx::variant to custom for the new bookworm nodes

https://gerrit.wikimedia.org/r/891487

Change 891489 had a related patch set uploaded (by Muehlenhoff; author: Muehlenhoff):

[operations/puppet@production] nginx: Drop require for the nginx package

https://gerrit.wikimedia.org/r/891489

Change 891489 merged by Muehlenhoff:

[operations/puppet@production] nginx: Drop require for the nginx package

https://gerrit.wikimedia.org/r/891489

Change 889817 abandoned by Muehlenhoff:

[operations/puppet@production] Enable command_broadcast to the new puppetdb 7 hosts

Reason:

No longer needed, new plan outlined at https://phabricator.wikimedia.org/T330490

https://gerrit.wikimedia.org/r/889817

Change 869199 abandoned by Muehlenhoff:

[operations/puppet@production] httpd: Let Puppet pick the init provider

Reason:

Already merged in 11feac29ed0910a7b54ad506826dc2568281ff7d

https://gerrit.wikimedia.org/r/869199

Change 912835 had a related patch set uploaded (by Jbond; author: jbond):

[operations/puppet@production] package_builder: add hooks for bookworm

https://gerrit.wikimedia.org/r/912835

Change 912835 merged by Jbond:

[operations/puppet@production] package_builder: add hooks for bookworm

https://gerrit.wikimedia.org/r/912835

Cookbook cookbooks.sre.hosts.reimage was started by jmm@cumin2002 for host puppetdb2003.codfw.wmnet with OS bookworm

Cookbook cookbooks.sre.hosts.reimage started by jmm@cumin2002 for host puppetdb2003.codfw.wmnet with OS bookworm completed:

  • puppetdb2003 (WARN)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present
    • Deleted any existing Puppet certificate
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Checked BIOS boot parameters are back to normal
    • Host up (new fresh bookworm OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • Removed previous downtime on Alertmanager (old OS)
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202305300811_jmm_3834209_puppetdb2003.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is not optimal, downtime not removed
    • Updated Netbox data from PuppetDB

Cookbook cookbooks.sre.hosts.reimage was started by jmm@cumin2002 for host puppetdb1003.eqiad.wmnet with OS bookworm

Cookbook cookbooks.sre.hosts.reimage started by jmm@cumin2002 for host puppetdb1003.eqiad.wmnet with OS bookworm executed with errors:

  • puppetdb1003 (FAIL)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present
    • Deleted any existing Puppet certificate
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Checked BIOS boot parameters are back to normal
    • Host up (new fresh bookworm OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • Removed previous downtime on Alertmanager (old OS)
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202305300958_jmm_3953981_puppetdb1003.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • The reimage failed, see the cookbook logs for the details

Change 935478 had a related patch set uploaded (by Jbond; author: jbond):

[operations/puppet@production] puppetdb2003: move to new role

https://gerrit.wikimedia.org/r/935478

Change 935478 merged by Jbond:

[operations/puppet@production] puppetdb2003: move to new role

https://gerrit.wikimedia.org/r/935478

Change 935480 had a related patch set uploaded (by Jbond; author: jbond):

[operations/puppet@production] pupetserver::git: ensure we build all parent directories

https://gerrit.wikimedia.org/r/935480

Change 935480 merged by Jbond:

[operations/puppet@production] pupetserver::git: ensure we build all parent directories

https://gerrit.wikimedia.org/r/935480

Change 935483 had a related patch set uploaded (by Jbond; author: jbond):

[operations/puppet@production] pupetserver::git: Ensure we build the git repo if using init

https://gerrit.wikimedia.org/r/935483

Change 935483 merged by Jbond:

[operations/puppet@production] pupetserver::git: Ensure we build the git repo if using init

https://gerrit.wikimedia.org/r/935483

Change 935507 had a related patch set uploaded (by Jbond; author: jbond):

[operations/puppet@production] puppetserver: codfw update puppetdb url

https://gerrit.wikimedia.org/r/935507

Change 935507 merged by Jbond:

[operations/puppet@production] puppetserver: codfw update puppetdb url

https://gerrit.wikimedia.org/r/935507

Change 935509 had a related patch set uploaded (by Jbond; author: jbond):

[operations/puppet@production] puppetmaster: add new puppetsrver

https://gerrit.wikimedia.org/r/935509

Change 935509 merged by Jbond:

[operations/puppet@production] puppetmaster: add new puppetsrver

https://gerrit.wikimedia.org/r/935509

Change 935514 had a related patch set uploaded (by Jbond; author: jbond):

[operations/puppet@production] puppetdb: add classification back

https://gerrit.wikimedia.org/r/935514

Change 935514 merged by Jbond:

[operations/puppet@production] puppetdb: add classification back

https://gerrit.wikimedia.org/r/935514

jbond claimed this task.

We now have puppetserver, puppetdb and puppetboard running in both codfw and eqiad.