
Migrate restbase servers to Bullseye
Closed, Resolved, Public

Description

The restbase servers listed below are still on Buster.

Bullseye no longer includes Python 2, so this needs T313814 resolved first.


  • eqiad
    • restbase1016.eqiad.wmnet (decommissioned)
    • restbase1019.eqiad.wmnet
    • restbase1020.eqiad.wmnet
    • restbase1021.eqiad.wmnet
    • restbase1028.eqiad.wmnet
    • restbase1031.eqiad.wmnet
    • restbase1017.eqiad.wmnet (decommissioned)
    • restbase1022.eqiad.wmnet
    • restbase1023.eqiad.wmnet
    • restbase1024.eqiad.wmnet
    • restbase1029.eqiad.wmnet
    • restbase1032.eqiad.wmnet
    • restbase1018.eqiad.wmnet (decommissioned)
    • restbase1025.eqiad.wmnet
    • restbase1026.eqiad.wmnet
    • restbase1027.eqiad.wmnet
    • restbase1030.eqiad.wmnet
    • restbase1033.eqiad.wmnet
  • codfw
    • restbase2013.codfw.wmnet
    • restbase2014.codfw.wmnet
    • restbase2019.codfw.wmnet
    • restbase2021.codfw.wmnet
    • restbase2024.codfw.wmnet
    • restbase2015.codfw.wmnet
    • restbase2016.codfw.wmnet
    • restbase2020.codfw.wmnet
    • restbase2022.codfw.wmnet
    • restbase2025.codfw.wmnet
    • restbase2012.codfw.wmnet (decommissioned)
    • restbase2017.codfw.wmnet
    • restbase2018.codfw.wmnet
    • restbase2023.codfw.wmnet
    • restbase2026.codfw.wmnet
    • restbase2027.codfw.wmnet

Event Timeline

There are a very large number of changes, so older changes are hidden.

Cookbook cookbooks.sre.hosts.reimage started by eevans@cumin1001 for host restbase1019.eqiad.wmnet with OS bullseye completed:

  • restbase1019 (WARN)
    • Removed from Puppet and PuppetDB if present
    • Deleted any existing Puppet certificate
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Checked BIOS boot parameters are back to normal
    • Host up (new fresh bullseye OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202309271658_eevans_668600_restbase1019.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is not optimal, downtime not removed
    • Updated Netbox data from PuppetDB
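
Each cookbook report in this timeline has the same shape: a summary bullet with the short hostname and a PASS, WARN, or FAIL status, followed by the steps that ran. As a rough sketch only, assuming the timeline has been saved as plain text, the last reported status per host could be tallied as below. The names `STATUS_RE`, `latest_status`, and `sample` are hypothetical and are not part of the reimage cookbook or spicerack.

```python
import re

# Matches summary bullets such as "• restbase1019 (WARN)".
# Assumption: the status is always one of PASS, WARN, or FAIL, as seen in this timeline.
STATUS_RE = re.compile(r"•\s+(restbase\d+)\s+\((PASS|WARN|FAIL)\)")


def latest_status(timeline_text):
    """Return {host: status}, keeping the last status reported for each host."""
    statuses = {}
    for match in STATUS_RE.finditer(timeline_text):
        host, status = match.groups()
        statuses[host] = status  # a later report overwrites an earlier one
    return statuses


# Small excerpt in the same format as the reports in this task.
sample = """
  • restbase1029 (FAIL)
    • The reimage failed, see the cookbook logs for the details
  • restbase1029 (PASS)
    • Icinga status is optimal
  • restbase1019 (WARN)
    • Icinga status is not optimal, downtime not removed
"""

if __name__ == "__main__":
    for host, status in latest_status(sample).items():
        print(host, status)
```

Keeping only the last status means a host that failed several times before a successful run, as restbase1029 did, is reported as PASS. Hosts left at WARN still need a manual look at Icinga, since their downtime was not removed.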

Cookbook cookbooks.sre.hosts.reimage was started by eevans@cumin1001 for host restbase2026.codfw.wmnet with OS bullseye

Cookbook cookbooks.sre.hosts.reimage started by eevans@cumin1001 for host restbase2026.codfw.wmnet with OS bullseye completed:

  • restbase2026 (PASS)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present
    • Deleted any existing Puppet certificate
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Checked BIOS boot parameters are back to normal
    • Host up (new fresh bullseye OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • Removed previous downtime on Alertmanager (old OS)
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202309271827_eevans_712875_restbase2026.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB
Eevans updated the task description.
Eevans updated the task description.

Cookbook cookbooks.sre.hosts.reimage was started by eevans@cumin1001 for host restbase2027.codfw.wmnet with OS bullseye

Cookbook cookbooks.sre.hosts.reimage started by eevans@cumin1001 for host restbase2027.codfw.wmnet with OS bullseye completed:

  • restbase2027 (WARN)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present
    • Deleted any existing Puppet certificate
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Checked BIOS boot parameters are back to normal
    • Host up (new fresh bullseye OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • Removed previous downtime on Alertmanager (old OS)
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202309271924_eevans_742451_restbase2027.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is not optimal, downtime not removed
    • Updated Netbox data from PuppetDB
Eevans updated the task description.

Change 962048 had a related patch set uploaded (by Eevans; author: Eevans):

[operations/puppet@production] install_server: utilize reuse recipe for restbase2027

https://gerrit.wikimedia.org/r/962048

Cookbook cookbooks.sre.hosts.reimage was started by eevans@cumin1001 for host restbase1020.eqiad.wmnet with OS bullseye

Cookbook cookbooks.sre.hosts.reimage started by eevans@cumin1001 for host restbase1020.eqiad.wmnet with OS bullseye completed:

  • restbase1020 (PASS)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present
    • Deleted any existing Puppet certificate
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Checked BIOS boot parameters are back to normal
    • Host up (new fresh bullseye OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • Removed previous downtime on Alertmanager (old OS)
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202310021354_eevans_4095129_restbase1020.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB

Cookbook cookbooks.sre.hosts.reimage was started by eevans@cumin1001 for host restbase1021.eqiad.wmnet with OS bullseye

Cookbook cookbooks.sre.hosts.reimage started by eevans@cumin1001 for host restbase1021.eqiad.wmnet with OS bullseye completed:

  • restbase1021 (WARN)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present
    • Deleted any existing Puppet certificate
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Checked BIOS boot parameters are back to normal
    • Host up (new fresh bullseye OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • Removed previous downtime on Alertmanager (old OS)
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202310021437_eevans_4116560_restbase1021.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is not optimal, downtime not removed
    • Updated Netbox data from PuppetDB

Cookbook cookbooks.sre.hosts.reimage was started by eevans@cumin1001 for host restbase1028.eqiad.wmnet with OS bullseye

Cookbook cookbooks.sre.hosts.reimage started by eevans@cumin1001 for host restbase1028.eqiad.wmnet with OS bullseye completed:

  • restbase1028 (PASS)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present
    • Deleted any existing Puppet certificate
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Checked BIOS boot parameters are back to normal
    • Host up (new fresh bullseye OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • Removed previous downtime on Alertmanager (old OS)
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202310021543_eevans_4152248_restbase1028.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB

Cookbook cookbooks.sre.hosts.reimage was started by eevans@cumin1001 for host restbase1031.eqiad.wmnet with OS bullseye

Cookbook cookbooks.sre.hosts.reimage started by eevans@cumin1001 for host restbase1031.eqiad.wmnet with OS bullseye completed:

  • restbase1031 (PASS)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present
    • Deleted any existing Puppet certificate
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Checked BIOS boot parameters are back to normal
    • Host up (new fresh bullseye OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • Removed previous downtime on Alertmanager (old OS)
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202310021629_eevans_4175765_restbase1031.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB

Cookbook cookbooks.sre.hosts.reimage was started by eevans@cumin1001 for host restbase1022.eqiad.wmnet with OS bullseye

Cookbook cookbooks.sre.hosts.reimage started by eevans@cumin1001 for host restbase1022.eqiad.wmnet with OS bullseye completed:

  • restbase1022 (WARN)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present
    • Deleted any existing Puppet certificate
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Checked BIOS boot parameters are back to normal
    • Host up (new fresh bullseye OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • Removed previous downtime on Alertmanager (old OS)
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202310021712_eevans_5464_restbase1022.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is not optimal, downtime not removed
    • Updated Netbox data from PuppetDB

Cookbook cookbooks.sre.hosts.reimage was started by eevans@cumin1001 for host restbase1023.eqiad.wmnet with OS bullseye

Cookbook cookbooks.sre.hosts.reimage started by eevans@cumin1001 for host restbase1023.eqiad.wmnet with OS bullseye completed:

  • restbase1023 (WARN)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present
    • Deleted any existing Puppet certificate
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Checked BIOS boot parameters are back to normal
    • Host up (new fresh bullseye OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • Removed previous downtime on Alertmanager (old OS)
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202310021816_eevans_37676_restbase1023.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is not optimal, downtime not removed
    • Updated Netbox data from PuppetDB

Cookbook cookbooks.sre.hosts.reimage was started by eevans@cumin1001 for host restbase1024.eqiad.wmnet with OS bullseye

Cookbook cookbooks.sre.hosts.reimage started by eevans@cumin1001 for host restbase1024.eqiad.wmnet with OS bullseye completed:

  • restbase1024 (PASS)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present
    • Deleted any existing Puppet certificate
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Checked BIOS boot parameters are back to normal
    • Host up (new fresh bullseye OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • Removed previous downtime on Alertmanager (old OS)
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202310021919_eevans_69932_restbase1024.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB

Cookbook cookbooks.sre.hosts.reimage was started by eevans@cumin1001 for host restbase1029.eqiad.wmnet with OS bullseye

Cookbook cookbooks.sre.hosts.reimage started by eevans@cumin1001 for host restbase1029.eqiad.wmnet with OS bullseye executed with errors:

  • restbase1029 (FAIL)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present
    • Deleted any existing Puppet certificate
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • The reimage failed, see the cookbook logs for the details

Cookbook cookbooks.sre.hosts.reimage was started by eevans@cumin1001 for host restbase1029.eqiad.wmnet with OS bullseye

Cookbook cookbooks.sre.hosts.reimage started by eevans@cumin1001 for host restbase1029.eqiad.wmnet with OS bullseye executed with errors:

  • restbase1029 (FAIL)
    • Removed from Puppet and PuppetDB if present
    • Deleted any existing Puppet certificate
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Checked BIOS boot parameters are back to normal
    • The reimage failed, see the cookbook logs for the details

Change 962693 had a related patch set uploaded (by Eevans; author: Eevans):

[operations/puppet@production] install_server: add restbase1029 (as 3-ssd reuse)

https://gerrit.wikimedia.org/r/962693

Change 962693 merged by Eevans:

[operations/puppet@production] install_server: add restbase1029 (as 3-ssd reuse)

https://gerrit.wikimedia.org/r/962693

Cookbook cookbooks.sre.hosts.reimage was started by eevans@cumin1001 for host restbase1029.eqiad.wmnet with OS bullseye

Cookbook cookbooks.sre.hosts.reimage started by eevans@cumin1001 for host restbase1029.eqiad.wmnet with OS bullseye executed with errors:

  • restbase1029 (FAIL)
    • Removed from Puppet and PuppetDB if present
    • Deleted any existing Puppet certificate
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Checked BIOS boot parameters are back to normal
    • The reimage failed, see the cookbook logs for the details

Cookbook cookbooks.sre.hosts.reimage was started by eevans@cumin1001 for host restbase1029.eqiad.wmnet with OS bullseye

Cookbook cookbooks.sre.hosts.reimage started by eevans@cumin1001 for host restbase1029.eqiad.wmnet with OS bullseye executed with errors:

  • restbase1029 (FAIL)
    • Removed from Puppet and PuppetDB if present
    • Deleted any existing Puppet certificate
    • Removed from Debmonitor if present
    • The reimage failed, see the cookbook logs for the details

Cookbook cookbooks.sre.hosts.reimage was started by eevans@cumin1001 for host restbase1029.eqiad.wmnet with OS bullseye

Cookbook cookbooks.sre.hosts.reimage started by eevans@cumin1001 for host restbase1029.eqiad.wmnet with OS bullseye executed with errors:

  • restbase1029 (FAIL)
    • Removed from Puppet and PuppetDB if present
    • Deleted any existing Puppet certificate
    • Removed from Debmonitor if present
    • The reimage failed, see the cookbook logs for the details

Cookbook cookbooks.sre.hosts.reimage was started by eevans@cumin1001 for host restbase1029.eqiad.wmnet with OS bullseye

Cookbook cookbooks.sre.hosts.reimage started by eevans@cumin1001 for host restbase1029.eqiad.wmnet with OS bullseye executed with errors:

  • restbase1029 (FAIL)
    • Removed from Puppet and PuppetDB if present
    • Deleted any existing Puppet certificate
    • Removed from Debmonitor if present
    • The reimage failed, see the cookbook logs for the details

Cookbook cookbooks.sre.hosts.reimage was started by eevans@cumin1001 for host restbase1029.eqiad.wmnet with OS bullseye

Cookbook cookbooks.sre.hosts.reimage started by eevans@cumin1001 for host restbase1029.eqiad.wmnet with OS bullseye completed:

  • restbase1029 (PASS)
    • Removed from Puppet and PuppetDB if present
    • Deleted any existing Puppet certificate
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Checked BIOS boot parameters are back to normal
    • Host up (new fresh bullseye OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202310022057_eevans_119030_restbase1029.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB

Cookbook cookbooks.sre.hosts.reimage was started by eevans@cumin1001 for host restbase1032.eqiad.wmnet with OS bullseye

Cookbook cookbooks.sre.hosts.reimage started by eevans@cumin1001 for host restbase1032.eqiad.wmnet with OS bullseye executed with errors:

  • restbase1032 (FAIL)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present
    • Deleted any existing Puppet certificate
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Checked BIOS boot parameters are back to normal
    • The reimage failed, see the cookbook logs for the details

Cookbook cookbooks.sre.hosts.reimage was started by eevans@cumin1001 for host restbase1032.eqiad.wmnet with OS bullseye

Cookbook cookbooks.sre.hosts.reimage started by eevans@cumin1001 for host restbase1032.eqiad.wmnet with OS bullseye completed:

  • restbase1032 (PASS)
    • Removed from Puppet and PuppetDB if present
    • Deleted any existing Puppet certificate
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Checked BIOS boot parameters are back to normal
    • Host up (new fresh bullseye OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202310022246_eevans_173583_restbase1032.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB
Eevans updated the task description.

Cookbook cookbooks.sre.hosts.reimage was started by eevans@cumin1001 for host restbase1025.eqiad.wmnet with OS bullseye

Cookbook cookbooks.sre.hosts.reimage started by eevans@cumin1001 for host restbase1025.eqiad.wmnet with OS bullseye completed:

  • restbase1025 (WARN)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present
    • Deleted any existing Puppet certificate
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Checked BIOS boot parameters are back to normal
    • Host up (new fresh bullseye OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • Removed previous downtime on Alertmanager (old OS)
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202310031307_eevans_593083_restbase1025.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is not optimal, downtime not removed
    • Updated Netbox data from PuppetDB

Cookbook cookbooks.sre.hosts.reimage was started by eevans@cumin1001 for host restbase1033.eqiad.wmnet with OS bullseye

Cookbook cookbooks.sre.hosts.reimage started by eevans@cumin1001 for host restbase1033.eqiad.wmnet with OS bullseye completed:

  • restbase1033 (WARN)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present
    • Deleted any existing Puppet certificate
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Checked BIOS boot parameters are back to normal
    • Host up (new fresh bullseye OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • Removed previous downtime on Alertmanager (old OS)
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202310031401_eevans_623576_restbase1033.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is not optimal, downtime not removed
    • Updated Netbox data from PuppetDB

Cookbook cookbooks.sre.hosts.reimage was started by eevans@cumin1001 for host restbase1026.eqiad.wmnet with OS bullseye

Cookbook cookbooks.sre.hosts.reimage started by eevans@cumin1001 for host restbase1026.eqiad.wmnet with OS bullseye completed:

  • restbase1026 (PASS)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present
    • Deleted any existing Puppet certificate
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Checked BIOS boot parameters are back to normal
    • Host up (new fresh bullseye OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • Removed previous downtime on Alertmanager (old OS)
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202310031540_eevans_677743_restbase1026.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB

Cookbook cookbooks.sre.hosts.reimage was started by eevans@cumin1001 for host restbase1027.eqiad.wmnet with OS bullseye

Cookbook cookbooks.sre.hosts.reimage started by eevans@cumin1001 for host restbase1027.eqiad.wmnet with OS bullseye completed:

  • restbase1027 (WARN)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present
    • Deleted any existing Puppet certificate
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Checked BIOS boot parameters are back to normal
    • Host up (new fresh bullseye OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • Removed previous downtime on Alertmanager (old OS)
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202310031652_eevans_714914_restbase1027.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is not optimal, downtime not removed
    • Updated Netbox data from PuppetDB

Change 962048 merged by Eevans:

[operations/puppet@production] install_server: utilize reuse recipe for restbase2027

https://gerrit.wikimedia.org/r/962048

Eevans updated the task description.
