Migrate s7 section to Debian Trixie
Closed, Resolved (Public)

Description

  • dbstore1008 T422778
  • db2222
  • db2221
  • db2220
  • db2218
  • db2208 (hardware issues) T425516
  • db2200 T424541
  • db2198 T424541
  • db2182
  • db2168
  • db2159
  • db2150
  • db1253
  • db1236
  • db1231
  • db1227
  • db1202
  • db1194
  • db1191
  • db1181
  • db1174
  • db1171 T424541
  • db1170
  • db1158
  • db1155
  • clouddb1018 T415165
  • clouddb1014 T415165
  • an-redacteddb1001 T422778

Event Timeline

Cookbook cookbooks.sre.hosts.reimage was started by marostegui@cumin1003 for host db2208.codfw.wmnet with OS trixie

Cookbook cookbooks.sre.hosts.reimage started by marostegui@cumin1003 for host db2208.codfw.wmnet with OS trixie completed:

  • db2208 (WARN)
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Checked BIOS boot parameters are back to normal
    • Host up (new fresh trixie OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202605070509_marostegui_918883_db2208.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is not optimal, downtime not removed
    • Updated Netbox data from PuppetDB

Change #1284330 had a related patch set uploaded (by Marostegui; author: Marostegui):

[operations/puppet@production] db2208: Enable notifications

https://gerrit.wikimedia.org/r/1284330

Change #1284330 merged by Marostegui:

[operations/puppet@production] db2208: Enable notifications

https://gerrit.wikimedia.org/r/1284330

Change #1284563 had a related patch set uploaded (by Marostegui; author: Marostegui):

[operations/puppet@production] db1202,db2182: Disable notifications

https://gerrit.wikimedia.org/r/1284563

Change #1284563 merged by Marostegui:

[operations/puppet@production] db1202,db2182: Disable notifications

https://gerrit.wikimedia.org/r/1284563

Cookbook cookbooks.sre.hosts.reimage was started by marostegui@cumin1003 for host db1202.eqiad.wmnet with OS trixie

Cookbook cookbooks.sre.hosts.reimage was started by marostegui@cumin1003 for host db2182.codfw.wmnet with OS trixie

Cookbook cookbooks.sre.hosts.reimage started by marostegui@cumin1003 for host db1202.eqiad.wmnet with OS trixie completed:

  • db1202 (WARN)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Checked BIOS boot parameters are back to normal
    • Host up (new fresh trixie OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • Removed previous downtime on Alertmanager (old OS)
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202605070914_marostegui_974609_db1202.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is not optimal, downtime not removed
    • Updated Netbox data from PuppetDB

Cookbook cookbooks.sre.hosts.reimage started by marostegui@cumin1003 for host db2182.codfw.wmnet with OS trixie completed:

  • db2182 (WARN)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Checked BIOS boot parameters are back to normal
    • Host up (new fresh trixie OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • Removed previous downtime on Alertmanager (old OS)
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202605070918_marostegui_974708_db2182.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is not optimal, downtime not removed
    • Updated Netbox data from PuppetDB

Change #1284589 had a related patch set uploaded (by Marostegui; author: Marostegui):

[operations/puppet@production] db1227,db2168: Disable notifications

https://gerrit.wikimedia.org/r/1284589

Change #1284589 merged by Marostegui:

[operations/puppet@production] db1227,db2168: Disable notifications

https://gerrit.wikimedia.org/r/1284589

Cookbook cookbooks.sre.hosts.reimage was started by marostegui@cumin1003 for host db1227.eqiad.wmnet with OS trixie

Cookbook cookbooks.sre.hosts.reimage was started by marostegui@cumin1003 for host db2168.codfw.wmnet with OS trixie

Cookbook cookbooks.sre.hosts.reimage started by marostegui@cumin1003 for host db1227.eqiad.wmnet with OS trixie completed:

  • db1227 (WARN)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Checked BIOS boot parameters are back to normal
    • Host up (new fresh trixie OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • Removed previous downtime on Alertmanager (old OS)
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202605071048_marostegui_1048837_db1227.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is not optimal, downtime not removed
    • Updated Netbox data from PuppetDB

Cookbook cookbooks.sre.hosts.reimage started by marostegui@cumin1003 for host db2168.codfw.wmnet with OS trixie completed:

  • db2168 (WARN)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Checked BIOS boot parameters are back to normal
    • Host up (new fresh trixie OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • Removed previous downtime on Alertmanager (old OS)
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202605071055_marostegui_1051628_db2168.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is not optimal, downtime not removed
    • Updated Netbox data from PuppetDB

Change #1285006 had a related patch set uploaded (by Marostegui; author: Marostegui):

[operations/puppet@production] db2159: Disable notifications

https://gerrit.wikimedia.org/r/1285006

Change #1285006 merged by Marostegui:

[operations/puppet@production] db2159: Disable notifications

https://gerrit.wikimedia.org/r/1285006

Cookbook cookbooks.sre.hosts.reimage was started by marostegui@cumin1003 for host db2159.codfw.wmnet with OS trixie

Cookbook cookbooks.sre.hosts.reimage started by marostegui@cumin1003 for host db2159.codfw.wmnet with OS trixie completed:

  • db2159 (WARN)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Checked BIOS boot parameters are back to normal
    • Host up (new fresh trixie OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • Removed previous downtime on Alertmanager (old OS)
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202605080551_marostegui_1282798_db2159.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is not optimal, downtime not removed
    • Updated Netbox data from PuppetDB

Change #1286168 had a related patch set uploaded (by Marostegui; author: Marostegui):

[operations/puppet@production] db1231,db2150: Disable notifications

https://gerrit.wikimedia.org/r/1286168

Change #1286168 merged by Marostegui:

[operations/puppet@production] db1231,db2150: Disable notifications

https://gerrit.wikimedia.org/r/1286168

Cookbook cookbooks.sre.hosts.reimage was started by marostegui@cumin1003 for host db2150.codfw.wmnet with OS trixie

Cookbook cookbooks.sre.hosts.reimage was started by marostegui@cumin1003 for host db1231.eqiad.wmnet with OS trixie

Cookbook cookbooks.sre.hosts.reimage started by marostegui@cumin1003 for host db1231.eqiad.wmnet with OS trixie completed:

  • db1231 (WARN)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Checked BIOS boot parameters are back to normal
    • Host up (new fresh trixie OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • Removed previous downtime on Alertmanager (old OS)
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202605120704_marostegui_2951101_db1231.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is not optimal, downtime not removed
    • Updated Netbox data from PuppetDB

Cookbook cookbooks.sre.hosts.reimage started by marostegui@cumin1003 for host db2150.codfw.wmnet with OS trixie completed:

  • db2150 (WARN)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Checked BIOS boot parameters are back to normal
    • Host up (new fresh trixie OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • Removed previous downtime on Alertmanager (old OS)
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202605120708_marostegui_2951065_db2150.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is not optimal, downtime not removed
    • Updated Netbox data from PuppetDB

Change #1286735 had a related patch set uploaded (by Marostegui; author: Marostegui):

[operations/puppet@production] db1253,db2218: Disable notifications

https://gerrit.wikimedia.org/r/1286735

Change #1286735 merged by Marostegui:

[operations/puppet@production] db1253,db2218: Disable notifications

https://gerrit.wikimedia.org/r/1286735

Cookbook cookbooks.sre.hosts.reimage was started by marostegui@cumin1003 for host db2218.codfw.wmnet with OS trixie

Cookbook cookbooks.sre.hosts.reimage was started by marostegui@cumin1003 for host db1253.eqiad.wmnet with OS trixie

Cookbook cookbooks.sre.hosts.reimage started by marostegui@cumin1003 for host db1253.eqiad.wmnet with OS trixie completed:

  • db1253 (WARN)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Checked BIOS boot parameters are back to normal
    • Host up (new fresh trixie OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • Removed previous downtime on Alertmanager (old OS)
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202605130559_marostegui_3391454_db1253.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is not optimal, downtime not removed
    • Updated Netbox data from PuppetDB

Cookbook cookbooks.sre.hosts.reimage started by marostegui@cumin1003 for host db2218.codfw.wmnet with OS trixie completed:

  • db2218 (WARN)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Checked BIOS boot parameters are back to normal
    • Host up (new fresh trixie OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • Removed previous downtime on Alertmanager (old OS)
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202605130603_marostegui_3391410_db2218.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is not optimal, downtime not removed
    • Updated Netbox data from PuppetDB

Change #1286845 had a related patch set uploaded (by Marostegui; author: Marostegui):

[operations/puppet@production] db2220: Disable notifications

https://gerrit.wikimedia.org/r/1286845

Change #1286845 merged by Marostegui:

[operations/puppet@production] db2220: Disable notifications

https://gerrit.wikimedia.org/r/1286845

Cookbook cookbooks.sre.hosts.reimage was started by marostegui@cumin1003 for host db2220.codfw.wmnet with OS trixie

Cookbook cookbooks.sre.hosts.reimage started by marostegui@cumin1003 for host db2220.codfw.wmnet with OS trixie completed:

  • db2220 (WARN)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Checked BIOS boot parameters are back to normal
    • Host up (new fresh trixie OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • Removed previous downtime on Alertmanager (old OS)
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202605131035_marostegui_3432323_db2220.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is not optimal, downtime not removed
    • Updated Netbox data from PuppetDB

Completed depooling of db1236 by fceratto@cumin1003: Upgrading db1236.eqiad.wmnet

Cookbook cookbooks.sre.hosts.reimage was started by fceratto@cumin1003 for host db1236.eqiad.wmnet with OS trixie

Cookbook cookbooks.sre.hosts.reimage started by fceratto@cumin1003 for host db1236.eqiad.wmnet with OS trixie completed:

  • db1236 (WARN)
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Checked BIOS boot parameters are back to normal
    • Host up (new fresh trixie OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202605131125_fceratto_3444723_db1236.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Skipping waiting for Icinga optimal status and not removing the downtime, --no-check-icinga was set
    • Updated Netbox data from PuppetDB

Starting pool of db1236 by fceratto@cumin1003: Migration of db1236.eqiad.wmnet completed

Completed pooling of db1236 by fceratto@cumin1003: Migration of db1236.eqiad.wmnet completed

Migration of db1236.eqiad.wmnet completed

Change #1287080 had a related patch set uploaded (by Marostegui; author: Marostegui):

[operations/puppet@production] db1158: Disable notifications

https://gerrit.wikimedia.org/r/1287080

Cookbook cookbooks.sre.hosts.reimage was started by marostegui@cumin1003 for host db1158.eqiad.wmnet with OS trixie

Change #1287080 merged by Marostegui:

[operations/puppet@production] db1158: Disable notifications

https://gerrit.wikimedia.org/r/1287080

Cookbook cookbooks.sre.hosts.reimage started by marostegui@cumin1003 for host db1158.eqiad.wmnet with OS trixie completed:

  • db1158 (WARN)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Checked BIOS boot parameters are back to normal
    • Host up (new fresh trixie OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • Removed previous downtime on Alertmanager (old OS)
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202605140529_marostegui_3741973_db1158.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is not optimal, downtime not removed
    • Updated Netbox data from PuppetDB

Change #1296249 had a related patch set uploaded (by Marostegui; author: Marostegui):

[operations/puppet@production] db1181: Disable notifications

https://gerrit.wikimedia.org/r/1296249

Change #1296249 merged by Marostegui:

[operations/puppet@production] db1181: Disable notifications

https://gerrit.wikimedia.org/r/1296249

Completed depooling of db1181 by marostegui@cumin1003: Upgrading db1181.eqiad.wmnet

Cookbook cookbooks.sre.hosts.reimage was started by marostegui@cumin1003 for host db1181.eqiad.wmnet with OS trixie

Cookbook cookbooks.sre.hosts.reimage started by marostegui@cumin1003 for host db1181.eqiad.wmnet with OS trixie completed:

  • db1181 (WARN)
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Checked BIOS boot parameters are back to normal
    • Host up (new fresh trixie OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202606020629_marostegui_3870080_db1181.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Skipping waiting for Icinga optimal status and not removing the downtime, --no-check-icinga was set
    • Updated Netbox data from PuppetDB

Starting pool of db1181 by marostegui@cumin1003: Migration of db1181.eqiad.wmnet completed

Marostegui updated the task description.

All done

Completed pooling of db1181 by marostegui@cumin1003: Migration of db1181.eqiad.wmnet completed

Migration of db1181.eqiad.wmnet completed

Cookbook cookbooks.sre.hosts.reimage was started by fceratto@cumin1003 for host db1215.eqiad.wmnet with OS trixie

Cookbook cookbooks.sre.hosts.reimage started by fceratto@cumin1003 for host db1215.eqiad.wmnet with OS trixie completed:

  • db1215 (WARN)
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Checked BIOS boot parameters are back to normal
    • Host up (new fresh trixie OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202606100744_fceratto_2619557_db1215.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Skipping waiting for Icinga optimal status and not removing the downtime, --no-check-icinga was set
    • Updated Netbox data from PuppetDB

Cookbook cookbooks.sre.hosts.reimage started by fceratto@cumin1003 for host db1215.eqiad.wmnet with OS trixie executed with errors:

  • db1215 (FAIL)
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Checked BIOS boot parameters are back to normal
    • Host up (new fresh trixie OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202606100744_fceratto_2619557_db1215.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Skipping waiting for Icinga optimal status and not removing the downtime, --no-check-icinga was set
    • Updated Netbox data from PuppetDB
    • The reimage failed, see the cookbook logs for the details. You can also try typing "sudo install-console db1215.eqiad.wmnet" to get a root shell, but depending on the failure this may not work.

Migration of db1215.eqiad.wmnet completed
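
The first-Puppet-run log paths recorded in the cookbook reports above share a recognizable file name shape, for example 202605070509_marostegui_918883_db2208.out. The following is a minimal sketch for splitting such a name into its parts when cross-referencing reports with log files. It rests on assumptions not confirmed by this task: that the leading 12 digits are a YYYYMMDDHHMM start time, and that the numeric third field is some per-run identifier (its exact meaning is not stated here). The names ReimageLog, run_id, and parse_reimage_log are hypothetical and not part of spicerack.

```python
import re
from datetime import datetime
from typing import NamedTuple


class ReimageLog(NamedTuple):
    """Fields recovered from a reimage log file name (hypothetical helper type)."""
    started: datetime  # assumed to be the run start time, YYYYMMDDHHMM
    user: str          # user who launched the cookbook
    run_id: int        # numeric field; its meaning is an assumption
    host: str          # short host name, e.g. "db2208"


# Matches names such as 202605070509_marostegui_918883_db2208.out
_LOG_NAME = re.compile(
    r"(?P<ts>\d{12})_(?P<user>[^_/]+)_(?P<run_id>\d+)_(?P<host>[^_./]+)\.out$"
)


def parse_reimage_log(path: str) -> ReimageLog:
    """Split a reimage log path into its file name components."""
    match = _LOG_NAME.search(path)
    if match is None:
        raise ValueError(f"unrecognized reimage log name: {path!r}")
    return ReimageLog(
        started=datetime.strptime(match["ts"], "%Y%m%d%H%M"),
        user=match["user"],
        run_id=int(match["run_id"]),
        host=match["host"],
    )


if __name__ == "__main__":
    example = (
        "/var/log/spicerack/sre/hosts/reimage/"
        "202605070509_marostegui_918883_db2208.out"
    )
    print(parse_reimage_log(example))
```

Anchoring the pattern at the end of the path keeps the helper independent of the directory layout, so it should work on a bare file name as well as on the full path.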