
Migrate dbstore* hosts to 10.6
Closed, Resolved (Public)

Description

10.4 will be EOL in June 2024.
Please migrate dbstore hosts to 10.6 (either by reimaging to Bookworm, which ships 10.6 by default, or by upgrading the packages).

  • dbstore1007
  • dbstore1008
  • dbstore1009

Event Timeline

@BTullis can you upgrade dbstore1007 soon please? That might block our upgrades in s2, s3 and s4 in the near future (not blocked as of today)
If you want me to do it, let me know.

Mentioned in SAL (#wikimedia-analytics) [2024-02-28T11:08:29Z] <btullis> reimaging dbstore1007 to bookworm for T356961

Don't forget to run mysql_upgrade for each section (mysql_upgrade -S $PATH_TO_SOCKET_LOCATION)
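A minimal sketch of running that for every section on a multi-instance host, in case it helps. The socket directory (/run/mysqld) and the mysqld.*.sock naming are assumptions and are not stated in this task, and upgrade_all_sections is a hypothetical helper name. The second argument only exists so the loop can be dry-run with a different command.

```shell
# Hypothetical helper: run mysql_upgrade once per section socket.
# Usage: upgrade_all_sections [SOCKET_DIR] [UPGRADE_CMD]
# ASSUMPTION: sockets live in /run/mysqld and are named mysqld.<section>.sock.
upgrade_all_sections() {
    socket_dir="${1:-/run/mysqld}"
    upgrade_cmd="${2:-mysql_upgrade}"
    found=0

    for sock in "$socket_dir"/mysqld.*.sock; do
        # If the glob matched nothing, the literal pattern is returned; skip it.
        [ -e "$sock" ] || continue
        found=1
        # -S selects the socket of the instance to upgrade.
        "$upgrade_cmd" -S "$sock" || return 1
    done

    if [ "$found" -eq 0 ]; then
        echo "no section sockets found in $socket_dir" >&2
        return 1
    fi
}
```

Calling `upgrade_all_sections /run/mysqld echo` first prints the commands that would be run without touching any instance; dropping the second argument runs mysql_upgrade for real.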

Cookbook cookbooks.sre.hosts.reimage was started by btullis@cumin1002 for host dbstore1007.eqiad.wmnet with OS bookworm

Cookbook cookbooks.sre.hosts.reimage started by btullis@cumin1002 for host dbstore1007.eqiad.wmnet with OS bookworm completed:

  • dbstore1007 (WARN)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Add puppet_version metadata to Debian installer
    • Checked BIOS boot parameters are back to normal
    • Host up (new fresh bookworm OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • Removed previous downtime on Alertmanager (old OS)
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202402281131_btullis_3466383_dbstore1007.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is not optimal, downtime not removed
    • Updated Netbox data from PuppetDB

Cookbook cookbooks.sre.hosts.reimage was started by btullis@cumin1002 for host dbstore1007.eqiad.wmnet with OS bullseye

Cookbook cookbooks.sre.hosts.reimage started by btullis@cumin1002 for host dbstore1007.eqiad.wmnet with OS bullseye completed:

  • dbstore1007 (WARN)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Add puppet_version metadata to Debian installer
    • Checked BIOS boot parameters are back to normal
    • Host up (new fresh bullseye OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • Removed previous downtime on Alertmanager (old OS)
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202402281216_btullis_3474448_dbstore1007.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is not optimal, downtime not removed
    • Updated Netbox data from PuppetDB

Cookbook cookbooks.sre.hosts.reimage was started by btullis@cumin1002 for host dbstore1007.eqiad.wmnet with OS bookworm

Cookbook cookbooks.sre.hosts.reimage started by btullis@cumin1002 for host dbstore1007.eqiad.wmnet with OS bookworm completed:

  • dbstore1007 (WARN)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Add puppet_version metadata to Debian installer
    • Checked BIOS boot parameters are back to normal
    • Host up (new fresh bookworm OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • Removed previous downtime on Alertmanager (old OS)
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202402281318_btullis_3483335_dbstore1007.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is not optimal, downtime not removed
    • Updated Netbox data from PuppetDB

Don't forget to run mysql_upgrade for each section (mysql_upgrade -S $PATH_TO_SOCKET_LOCATION)

I have done this as it wasn't completed

All done now. Upgraded to bookworm. I had to reimage twice, because I didn't shut down each mariadb section cleanly before the first attempt at a reimage.

I had thought that the IPMI reboot would have stopped the systemd units cleanly during shutdown, but that didn't happen and it refused to start with a 'crashed' version of 10.4.
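For next time, a rough sketch of stopping every section before starting the reimage cookbook. The mariadb@<section>.service unit naming and the section list are assumptions and are not confirmed anywhere in this task; stop_sections_cleanly is a hypothetical helper, and the SYSTEMCTL variable is only there so the logic can be exercised without systemd.

```shell
# Hypothetical helper: stop each MariaDB section and confirm it is down.
# Usage: stop_sections_cleanly SECTION...
# ASSUMPTION: each section runs as a systemd unit named mariadb@<section>.service.
stop_sections_cleanly() {
    systemctl_cmd="${SYSTEMCTL:-systemctl}"

    for section in "$@"; do
        unit="mariadb@${section}.service"

        # A normal stop lets the server shut down instead of being killed by a reboot.
        "$systemctl_cmd" stop "$unit" || return 1

        # is-active exits 0 only while the unit is still running.
        if "$systemctl_cmd" is-active --quiet "$unit"; then
            echo "$unit is still active; do not reimage yet" >&2
            return 1
        fi
        echo "$unit stopped"
    done
}
```

Something like `stop_sections_cleanly s2 s3 s4` would then be run on the host before the reimage, with the section names adjusted to whatever the host actually serves.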

Don't forget to run mysql_upgrade for each section (mysql_upgrade -S $PATH_TO_SOCKET_LOCATION)

I have done this as it wasn't completed

Ah, many thanks. I hadn't spotted that comment.

Thanks for getting this done