Upgrade sanitarium hosts to MariaDB 10.6
Closed, Resolved, Public

Description

After finishing the clouddb hosts (T356838), the sanitarium hosts can now be upgraded:

  • db1154
  • db1155
  • db2186
  • db2187

Event Timeline

Marostegui triaged this task as Medium priority.
Marostegui moved this task from Triage to In progress on the DBA board.

Cookbook cookbooks.sre.hosts.reimage was started by marostegui@cumin1002 for host db2186.codfw.wmnet with OS bookworm

Cookbook cookbooks.sre.hosts.reimage was started by marostegui@cumin1002 for host db2187.codfw.wmnet with OS bookworm

Cookbook cookbooks.sre.hosts.reimage started by marostegui@cumin1002 for host db2186.codfw.wmnet with OS bookworm completed:

  • db2186 (WARN)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Add puppet_version metadata to Debian installer
    • Checked BIOS boot parameters are back to normal
    • Host up (new fresh bookworm OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • Removed previous downtime on Alertmanager (old OS)
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202402280720_marostegui_3431477_db2186.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is not optimal, downtime not removed
    • Updated Netbox data from PuppetDB

Cookbook cookbooks.sre.hosts.reimage started by marostegui@cumin1002 for host db2187.codfw.wmnet with OS bookworm completed:

  • db2187 (WARN)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Add puppet_version metadata to Debian installer
    • Checked BIOS boot parameters are back to normal
    • Host up (new fresh bookworm OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • Removed previous downtime on Alertmanager (old OS)
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202402280731_marostegui_3433600_db2187.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is not optimal, downtime not removed
    • Updated Netbox data from PuppetDB
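
Both reports above ended with a WARN status and the step "Icinga status is not optimal, downtime not removed", so each host still needs a manual check before its downtime is lifted. The following is a minimal sketch of how such a report could be scanned for that condition. It assumes the report is available as plain text in the bullet layout shown in this task; `parse_report` and `needs_followup` are hypothetical helper names, not part of the reimage cookbook or Spicerack.

```python
import re

# Assumed layout, taken from the reports in this task:
#   "  • <host> (<STATUS>)" followed by indented "    • <step>" lines.
HEADER = re.compile(r"^\s*•\s+(?P<host>\S+)\s+\((?P<status>[A-Z]+)\)\s*$")
STEP = re.compile(r"^\s*•\s+(?P<step>.+?)\s*$")


def parse_report(text):
    """Return (host, status, steps) for one completion report."""
    host = status = None
    steps = []
    for line in text.splitlines():
        if host is None:
            # Skip everything until the host header line is found.
            match = HEADER.match(line)
            if match:
                host, status = match.group("host"), match.group("status")
            continue
        match = STEP.match(line)
        if match:
            steps.append(match.group("step"))
    return host, status, steps


def needs_followup(status, steps):
    """True if the report is a WARN or the downtime was left in place."""
    return status == "WARN" or any("downtime not removed" in s for s in steps)


SAMPLE = """\
  • db2186 (WARN)
    • Downtimed on Icinga/Alertmanager
    • Icinga status is not optimal, downtime not removed
    • Updated Netbox data from PuppetDB
"""

if __name__ == "__main__":
    host, status, steps = parse_report(SAMPLE)
    if needs_followup(status, steps):
        print(f"{host}: {status}, check Icinga before removing downtime")
```

The sketch only covers the WARN status seen here; any other status labels the cookbook may emit are not handled.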

Change 1012467 had a related patch set uploaded (by Marostegui; author: Marostegui):

[operations/puppet@production] db1154: Migrate to mariadb 10.6

https://gerrit.wikimedia.org/r/1012467

Change 1012467 merged by Marostegui:

[operations/puppet@production] db1154: Migrate to mariadb 10.6

https://gerrit.wikimedia.org/r/1012467

Marostegui updated the task description.

All done