We are currently running the OpenStack 'Victoria' release, which is packaged for both Buster and Bullseye. Future OpenStack upgrades will need our control plane on Bullseye.
Event Timeline
Mentioned in SAL (#wikimedia-cloud) [2022-03-23T14:18:40Z] <wm-bot> Set cloudvirt 'cloudvirt1030.eqiad.wmnet' maintenance. (T281276) - cookbook ran by andrew@buster
Mentioned in SAL (#wikimedia-cloud) [2022-03-23T14:19:23Z] <wm-bot> Draining 'cloudvirt1031.eqiad.wmnet'. (T281276) - cookbook ran by andrew@buster
Mentioned in SAL (#wikimedia-cloud) [2022-03-23T14:20:08Z] <wm-bot> Set cloudvirt 'cloudvirt1031.eqiad.wmnet' maintenance. (T281276) - cookbook ran by andrew@buster
Mentioned in SAL (#wikimedia-cloud) [2022-03-23T14:32:55Z] <wm-bot> Drained 'cloudvirt1030.eqiad.wmnet'. (T281276) - cookbook ran by andrew@buster
Cookbook cookbooks.sre.hosts.reimage was started by andrew@cumin1001 for host cloudvirt1030.eqiad.wmnet with OS bullseye
Mentioned in SAL (#wikimedia-cloud) [2022-03-23T14:34:24Z] <wm-bot> Draining 'cloudvirt1032.eqiad.wmnet'. (T281276) - cookbook ran by andrew@buster
Mentioned in SAL (#wikimedia-cloud) [2022-03-23T14:35:09Z] <wm-bot> Set cloudvirt 'cloudvirt1032.eqiad.wmnet' maintenance. (T281276) - cookbook ran by andrew@buster
Mentioned in SAL (#wikimedia-cloud) [2022-03-23T14:44:48Z] <wm-bot> Drained 'cloudvirt1031.eqiad.wmnet'. (T281276) - cookbook ran by andrew@buster
Cookbook cookbooks.sre.hosts.reimage was started by andrew@cumin1001 for host cloudvirt1031.eqiad.wmnet with OS bullseye
Mentioned in SAL (#wikimedia-cloud) [2022-03-23T14:57:46Z] <wm-bot> Draining 'cloudvirt1032.eqiad.wmnet'. (T281276) - cookbook ran by andrew@buster
Cookbook cookbooks.sre.hosts.reimage started by andrew@cumin1001 for host cloudvirt1030.eqiad.wmnet with OS bullseye completed:
- cloudvirt1030 (PASS)
- Downtimed on Icinga/Alertmanager
- Disabled Puppet
- Removed from Puppet and PuppetDB if present
- Deleted any existing Puppet certificate
- Removed from Debmonitor if present
- Forced PXE for next reboot
- Host rebooted via IPMI
- Host up (Debian installer)
- Host up (new fresh bullseye OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga
- Removed previous downtime on Alertmanager (old OS)
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202203231433_andrew_2889456_cloudvirt1030.out
- Checked BIOS boot parameters are back to normal
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is optimal
- Icinga downtime removed
- Updated Netbox data from PuppetDB
Mentioned in SAL (#wikimedia-cloud) [2022-03-23T15:00:40Z] <wm-bot> Set cloudvirt 'cloudvirt1032.eqiad.wmnet' maintenance. (T281276) - cookbook ran by andrew@buster
Mentioned in SAL (#wikimedia-cloud) [2022-03-23T15:01:48Z] <wm-bot> Drained 'cloudvirt1032.eqiad.wmnet'. (T281276) - cookbook ran by andrew@buster
Cookbook cookbooks.sre.hosts.reimage started by andrew@cumin1001 for host cloudvirt1031.eqiad.wmnet with OS bullseye completed:
- cloudvirt1031 (WARN)
- Downtimed on Icinga/Alertmanager
- Disabled Puppet
- Removed from Puppet and PuppetDB if present
- Deleted any existing Puppet certificate
- Removed from Debmonitor if present
- Forced PXE for next reboot
- Host rebooted via IPMI
- Host up (Debian installer)
- Host up (new fresh bullseye OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga
- Removed previous downtime on Alertmanager (old OS)
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202203231445_andrew_2892140_cloudvirt1031.out
- Checked BIOS boot parameters are back to normal
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is not optimal, downtime not removed
- Updated Netbox data from PuppetDB
Cookbook cookbooks.sre.hosts.reimage was started by andrew@cumin1001 for host cloudvirt1032.eqiad.wmnet with OS bullseye
Cookbook cookbooks.sre.hosts.reimage started by andrew@cumin1001 for host cloudvirt1032.eqiad.wmnet with OS bullseye completed:
- cloudvirt1032 (WARN)
- Downtimed on Icinga/Alertmanager
- Disabled Puppet
- Removed from Puppet and PuppetDB if present
- Deleted any existing Puppet certificate
- Removed from Debmonitor if present
- Forced PXE for next reboot
- Host rebooted via IPMI
- Host up (Debian installer)
- Host up (new fresh bullseye OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga
- Removed previous downtime on Alertmanager (old OS)
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202203231550_andrew_2903853_cloudvirt1032.out
- Checked BIOS boot parameters are back to normal
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is not optimal, downtime not removed
- Updated Netbox data from PuppetDB
Mentioned in SAL (#wikimedia-cloud) [2022-03-23T16:36:29Z] <wm-bot> Draining 'cloudvirt1033.eqiad.wmnet'. (T281276) - cookbook ran by andrew@buster
Mentioned in SAL (#wikimedia-cloud) [2022-03-23T16:36:38Z] <wm-bot> Draining 'cloudvirt1034.eqiad.wmnet'. (T281276) - cookbook ran by andrew@buster
Mentioned in SAL (#wikimedia-cloud) [2022-03-23T16:37:14Z] <wm-bot> Set cloudvirt 'cloudvirt1033.eqiad.wmnet' maintenance. (T281276) - cookbook ran by andrew@buster
Mentioned in SAL (#wikimedia-cloud) [2022-03-23T16:37:22Z] <wm-bot> Set cloudvirt 'cloudvirt1034.eqiad.wmnet' maintenance. (T281276) - cookbook ran by andrew@buster
Cookbook cookbooks.sre.hosts.reimage was started by andrew@cumin1001 for host cloudvirt1047.eqiad.wmnet with OS bullseye
Mentioned in SAL (#wikimedia-cloud) [2022-03-23T16:51:05Z] <wm-bot> Drained 'cloudvirt1033.eqiad.wmnet'. (T281276) - cookbook ran by andrew@buster
Cookbook cookbooks.sre.hosts.reimage was started by andrew@cumin1001 for host cloudvirt1028.eqiad.wmnet with OS bullseye
Cookbook cookbooks.sre.hosts.reimage started by andrew@cumin1001 for host cloudvirt1047.eqiad.wmnet with OS bullseye executed with errors:
- cloudvirt1047 (FAIL)
- Removed from Puppet and PuppetDB if present
- Deleted any existing Puppet certificate
- Removed from Debmonitor if present
- Forced PXE for next reboot
- Host rebooted via IPMI
- The reimage failed, see the cookbook logs for the details
Cookbook cookbooks.sre.hosts.reimage was started by andrew@cumin1001 for host cloudvirt1033.eqiad.wmnet with OS bullseye
Mentioned in SAL (#wikimedia-cloud) [2022-03-23T17:03:12Z] <wm-bot> Drained 'cloudvirt1034.eqiad.wmnet'. (T281276) - cookbook ran by andrew@buster
Mentioned in SAL (#wikimedia-cloud) [2022-03-23T17:03:43Z] <wm-bot> Draining 'cloudvirt1035.eqiad.wmnet'. (T281276) - cookbook ran by andrew@buster
Mentioned in SAL (#wikimedia-cloud) [2022-03-23T17:04:29Z] <wm-bot> Set cloudvirt 'cloudvirt1035.eqiad.wmnet' maintenance. (T281276) - cookbook ran by andrew@buster
Cookbook cookbooks.sre.hosts.reimage was started by andrew@cumin1001 for host cloudvirt1034.eqiad.wmnet with OS bullseye
Cookbook cookbooks.sre.hosts.reimage started by andrew@cumin1001 for host cloudvirt1028.eqiad.wmnet with OS bullseye completed:
- cloudvirt1028 (PASS)
- Downtimed on Icinga/Alertmanager
- Disabled Puppet
- Removed from Puppet and PuppetDB if present
- Deleted any existing Puppet certificate
- Removed from Debmonitor if present
- Forced PXE for next reboot
- Host rebooted via IPMI
- Host up (Debian installer)
- Host up (new fresh bullseye OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga
- Removed previous downtime on Alertmanager (old OS)
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202203231658_andrew_2915140_cloudvirt1028.out
- Checked BIOS boot parameters are back to normal
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is optimal
- Icinga downtime removed
- Updated Netbox data from PuppetDB
Cookbook cookbooks.sre.hosts.reimage started by andrew@cumin1001 for host cloudvirt1033.eqiad.wmnet with OS bullseye completed:
- cloudvirt1033 (WARN)
- Downtimed on Icinga/Alertmanager
- Disabled Puppet
- Removed from Puppet and PuppetDB if present
- Deleted any existing Puppet certificate
- Removed from Debmonitor if present
- Forced PXE for next reboot
- Host rebooted via IPMI
- Host up (Debian installer)
- Host up (new fresh bullseye OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga
- Removed previous downtime on Alertmanager (old OS)
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202203231659_andrew_2915254_cloudvirt1033.out
- Checked BIOS boot parameters are back to normal
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is not optimal, downtime not removed
- Updated Netbox data from PuppetDB
Cookbook cookbooks.sre.hosts.reimage started by andrew@cumin1001 for host cloudvirt1034.eqiad.wmnet with OS bullseye completed:
- cloudvirt1034 (WARN)
- Downtimed on Icinga/Alertmanager
- Disabled Puppet
- Removed from Puppet and PuppetDB if present
- Deleted any existing Puppet certificate
- Removed from Debmonitor if present
- Forced PXE for next reboot
- Host rebooted via IPMI
- Host up (Debian installer)
- Host up (new fresh bullseye OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga
- Removed previous downtime on Alertmanager (old OS)
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202203231707_andrew_2917902_cloudvirt1034.out
- Checked BIOS boot parameters are back to normal
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is not optimal, downtime not removed
- Updated Netbox data from PuppetDB
Mentioned in SAL (#wikimedia-cloud) [2022-03-23T17:54:33Z] <wm-bot> Draining 'cloudvirt1036.eqiad.wmnet'. (T281276) - cookbook ran by andrew@buster
Mentioned in SAL (#wikimedia-cloud) [2022-03-23T17:55:18Z] <wm-bot> Set cloudvirt 'cloudvirt1036.eqiad.wmnet' maintenance. (T281276) - cookbook ran by andrew@buster
Mentioned in SAL (#wikimedia-cloud) [2022-03-23T18:13:17Z] <wm-bot> Drained 'cloudvirt1036.eqiad.wmnet'. (T281276) - cookbook ran by andrew@buster
Mentioned in SAL (#wikimedia-cloud) [2022-03-23T18:18:20Z] <wm-bot> Draining 'cloudvirt1037.eqiad.wmnet'. (T281276) - cookbook ran by andrew@buster
Mentioned in SAL (#wikimedia-cloud) [2022-03-23T18:19:04Z] <wm-bot> Set cloudvirt 'cloudvirt1037.eqiad.wmnet' maintenance. (T281276) - cookbook ran by andrew@buster
Cookbook cookbooks.sre.hosts.reimage was started by andrew@cumin1001 for host cloudvirt1035.eqiad.wmnet with OS bullseye
Cookbook cookbooks.sre.hosts.reimage was started by andrew@cumin1001 for host cloudvirt1036.eqiad.wmnet with OS bullseye
Mentioned in SAL (#wikimedia-cloud) [2022-03-23T18:43:54Z] <wm-bot> Draining 'cloudvirt1038.eqiad.wmnet'. (T281276) - cookbook ran by andrew@buster
Mentioned in SAL (#wikimedia-cloud) [2022-03-23T18:44:39Z] <wm-bot> Set cloudvirt 'cloudvirt1038.eqiad.wmnet' maintenance. (T281276) - cookbook ran by andrew@buster
Cookbook cookbooks.sre.hosts.reimage started by andrew@cumin1001 for host cloudvirt1035.eqiad.wmnet with OS bullseye completed:
- cloudvirt1035 (WARN)
- Downtimed on Icinga/Alertmanager
- Disabled Puppet
- Removed from Puppet and PuppetDB if present
- Deleted any existing Puppet certificate
- Removed from Debmonitor if present
- Forced PXE for next reboot
- Host rebooted via IPMI
- Host up (Debian installer)
- Host up (new fresh bullseye OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga
- Removed previous downtime on Alertmanager (old OS)
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202203231836_andrew_2933834_cloudvirt1035.out
- Checked BIOS boot parameters are back to normal
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is not optimal, downtime not removed
- Updated Netbox data from PuppetDB
Cookbook cookbooks.sre.hosts.reimage started by andrew@cumin1001 for host cloudvirt1036.eqiad.wmnet with OS bullseye completed:
- cloudvirt1036 (WARN)
- Downtimed on Icinga/Alertmanager
- Disabled Puppet
- Removed from Puppet and PuppetDB if present
- Deleted any existing Puppet certificate
- Removed from Debmonitor if present
- Forced PXE for next reboot
- Host rebooted via IPMI
- Host up (Debian installer)
- Host up (new fresh bullseye OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga
- Removed previous downtime on Alertmanager (old OS)
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202203231836_andrew_2933843_cloudvirt1036.out
- Checked BIOS boot parameters are back to normal
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is not optimal, downtime not removed
- Updated Netbox data from PuppetDB
Cookbook cookbooks.sre.hosts.reimage was started by andrew@cumin1001 for host cloudvirt1037.eqiad.wmnet with OS bullseye
Cookbook cookbooks.sre.hosts.reimage was started by andrew@cumin1001 for host cloudvirt1038.eqiad.wmnet with OS bullseye
Mentioned in SAL (#wikimedia-cloud) [2022-03-23T20:14:50Z] <wm-bot> Draining 'cloudvirt1039.eqiad.wmnet'. (T281276) - cookbook ran by andrew@buster
Mentioned in SAL (#wikimedia-cloud) [2022-03-23T20:14:57Z] <wm-bot> Draining 'cloudvirt1040.eqiad.wmnet'. (T281276) - cookbook ran by andrew@buster
Mentioned in SAL (#wikimedia-cloud) [2022-03-23T20:15:34Z] <wm-bot> Set cloudvirt 'cloudvirt1039.eqiad.wmnet' maintenance. (T281276) - cookbook ran by andrew@buster
Mentioned in SAL (#wikimedia-cloud) [2022-03-23T20:15:43Z] <wm-bot> Set cloudvirt 'cloudvirt1040.eqiad.wmnet' maintenance. (T281276) - cookbook ran by andrew@buster
Mentioned in SAL (#wikimedia-cloud) [2022-03-23T20:30:30Z] <wm-bot> Drained 'cloudvirt1039.eqiad.wmnet'. (T281276) - cookbook ran by andrew@buster
Cookbook cookbooks.sre.hosts.reimage was started by andrew@cumin1001 for host cloudvirt1039.eqiad.wmnet with OS bullseye
Cookbook cookbooks.sre.hosts.reimage started by andrew@cumin1001 for host cloudvirt1038.eqiad.wmnet with OS bullseye completed:
- cloudvirt1038 (PASS)
- Downtimed on Icinga/Alertmanager
- Disabled Puppet
- Removed from Puppet and PuppetDB if present
- Deleted any existing Puppet certificate
- Removed from Debmonitor if present
- Forced PXE for next reboot
- Host rebooted via IPMI
- Host up (Debian installer)
- Host up (new fresh bullseye OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga
- Removed previous downtime on Alertmanager (old OS)
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202203232014_andrew_2948775_cloudvirt1038.out
- Checked BIOS boot parameters are back to normal
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is optimal
- Icinga downtime removed
- Updated Netbox data from PuppetDB
Cookbook cookbooks.sre.hosts.reimage started by andrew@cumin1001 for host cloudvirt1037.eqiad.wmnet with OS bullseye completed:
- cloudvirt1037 (PASS)
- Downtimed on Icinga/Alertmanager
- Disabled Puppet
- Removed from Puppet and PuppetDB if present
- Deleted any existing Puppet certificate
- Removed from Debmonitor if present
- Forced PXE for next reboot
- Host rebooted via IPMI
- Host up (Debian installer)
- Host up (new fresh bullseye OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga
- Removed previous downtime on Alertmanager (old OS)
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202203232014_andrew_2948781_cloudvirt1037.out
- Checked BIOS boot parameters are back to normal
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is optimal
- Icinga downtime removed
- Updated Netbox data from PuppetDB
Mentioned in SAL (#wikimedia-cloud) [2022-03-23T20:54:42Z] <wm-bot> Draining 'cloudvirt1041.eqiad.wmnet'. (T281276) - cookbook ran by andrew@buster
Mentioned in SAL (#wikimedia-cloud) [2022-03-23T20:55:29Z] <wm-bot> Set cloudvirt 'cloudvirt1041.eqiad.wmnet' maintenance. (T281276) - cookbook ran by andrew@buster
Mentioned in SAL (#wikimedia-cloud) [2022-03-23T21:04:41Z] <wm-bot> Draining 'cloudvirt1040.eqiad.wmnet'. (T281276) - cookbook ran by andrew@buster
Mentioned in SAL (#wikimedia-cloud) [2022-03-23T21:07:37Z] <wm-bot> Set cloudvirt 'cloudvirt1040.eqiad.wmnet' maintenance. (T281276) - cookbook ran by andrew@buster
Cookbook cookbooks.sre.hosts.reimage started by andrew@cumin1001 for host cloudvirt1039.eqiad.wmnet with OS bullseye completed:
- cloudvirt1039 (PASS)
- Downtimed on Icinga/Alertmanager
- Disabled Puppet
- Removed from Puppet and PuppetDB if present
- Deleted any existing Puppet certificate
- Removed from Debmonitor if present
- Forced PXE for next reboot
- Host rebooted via IPMI
- Host up (Debian installer)
- Host up (new fresh bullseye OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga
- Removed previous downtime on Alertmanager (old OS)
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202203232031_andrew_2950344_cloudvirt1039.out
- Checked BIOS boot parameters are back to normal
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is optimal
- Icinga downtime removed
- Updated Netbox data from PuppetDB
Mentioned in SAL (#wikimedia-cloud) [2022-03-23T21:09:11Z] <wm-bot> Draining 'cloudvirt1040.eqiad.wmnet'. (T281276) - cookbook ran by andrew@buster
Mentioned in SAL (#wikimedia-cloud) [2022-03-23T21:12:05Z] <wm-bot> Set cloudvirt 'cloudvirt1040.eqiad.wmnet' maintenance. (T281276) - cookbook ran by andrew@buster
Mentioned in SAL (#wikimedia-cloud) [2022-03-23T21:12:09Z] <wm-bot> Drained 'cloudvirt1040.eqiad.wmnet'. (T281276) - cookbook ran by andrew@buster
Cookbook cookbooks.sre.hosts.reimage was started by andrew@cumin1001 for host cloudvirt1040.eqiad.wmnet with OS bullseye
Mentioned in SAL (#wikimedia-cloud) [2022-03-23T21:19:08Z] <wm-bot> Draining 'cloudvirt1042.eqiad.wmnet'. (T281276) - cookbook ran by andrew@buster
Mentioned in SAL (#wikimedia-cloud) [2022-03-23T21:19:53Z] <wm-bot> Set cloudvirt 'cloudvirt1042.eqiad.wmnet' maintenance. (T281276) - cookbook ran by andrew@buster
Mentioned in SAL (#wikimedia-cloud) [2022-03-23T21:54:23Z] <wm-bot> Drained 'cloudvirt1042.eqiad.wmnet'. (T281276) - cookbook ran by andrew@buster
Cookbook cookbooks.sre.hosts.reimage started by andrew@cumin1001 for host cloudvirt1040.eqiad.wmnet with OS bullseye completed:
- cloudvirt1040 (WARN)
- Downtimed on Icinga/Alertmanager
- Disabled Puppet
- Removed from Puppet and PuppetDB if present
- Deleted any existing Puppet certificate
- Removed from Debmonitor if present
- Forced PXE for next reboot
- Host rebooted via IPMI
- Host up (Debian installer)
- Host up (new fresh bullseye OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga
- Removed previous downtime on Alertmanager (old OS)
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202203232118_andrew_2962276_cloudvirt1040.out
- Checked BIOS boot parameters are back to normal
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is not optimal, downtime not removed
- Updated Netbox data from PuppetDB
Cookbook cookbooks.sre.hosts.reimage was started by andrew@cumin1001 for host cloudvirt1041.eqiad.wmnet with OS bullseye
Cookbook cookbooks.sre.hosts.reimage was started by andrew@cumin1001 for host cloudvirt1042.eqiad.wmnet with OS bullseye
Mentioned in SAL (#wikimedia-cloud) [2022-03-23T22:06:20Z] <wm-bot> Draining 'cloudvirt1043.eqiad.wmnet'. (T281276) - cookbook ran by andrew@buster
Mentioned in SAL (#wikimedia-cloud) [2022-03-23T22:06:25Z] <wm-bot> Draining 'cloudvirt1044.eqiad.wmnet'. (T281276) - cookbook ran by andrew@buster
Mentioned in SAL (#wikimedia-cloud) [2022-03-23T22:07:11Z] <wm-bot> Set cloudvirt 'cloudvirt1044.eqiad.wmnet' maintenance. (T281276) - cookbook ran by andrew@buster
Mentioned in SAL (#wikimedia-cloud) [2022-03-23T22:08:10Z] <wm-bot> Set cloudvirt 'cloudvirt1043.eqiad.wmnet' maintenance. (T281276) - cookbook ran by andrew@buster
Mentioned in SAL (#wikimedia-cloud) [2022-03-23T22:12:04Z] <wm-bot> Draining 'cloudvirt1045.eqiad.wmnet'. (T281276) - cookbook ran by andrew@buster
Mentioned in SAL (#wikimedia-cloud) [2022-03-23T22:12:50Z] <wm-bot> Set cloudvirt 'cloudvirt1045.eqiad.wmnet' maintenance. (T281276) - cookbook ran by andrew@buster
Cookbook cookbooks.sre.hosts.reimage was started by andrew@cumin1001 for host cloudvirt1043.eqiad.wmnet with OS bullseye
Mentioned in SAL (#wikimedia-cloud) [2022-03-23T22:38:55Z] <wm-bot> Drained 'cloudvirt1044.eqiad.wmnet'. (T281276) - cookbook ran by andrew@buster
Cookbook cookbooks.sre.hosts.reimage started by andrew@cumin1001 for host cloudvirt1041.eqiad.wmnet with OS bullseye completed:
- cloudvirt1041 (WARN)
- Downtimed on Icinga/Alertmanager
- Disabled Puppet
- Removed from Puppet and PuppetDB if present
- Deleted any existing Puppet certificate
- Removed from Debmonitor if present
- Forced PXE for next reboot
- Host rebooted via IPMI
- Host up (Debian installer)
- Host up (new fresh bullseye OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga
- Removed previous downtime on Alertmanager (old OS)
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202203232205_andrew_2970731_cloudvirt1041.out
- Checked BIOS boot parameters are back to normal
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is not optimal, downtime not removed
- Updated Netbox data from PuppetDB
Cookbook cookbooks.sre.hosts.reimage started by andrew@cumin1001 for host cloudvirt1042.eqiad.wmnet with OS bullseye completed:
- cloudvirt1042 (WARN)
- Downtimed on Icinga/Alertmanager
- Disabled Puppet
- Removed from Puppet and PuppetDB if present
- Deleted any existing Puppet certificate
- Removed from Debmonitor if present
- Forced PXE for next reboot
- Host rebooted via IPMI
- Host up (Debian installer)
- Host up (new fresh bullseye OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga
- Removed previous downtime on Alertmanager (old OS)
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202203232205_andrew_2970762_cloudvirt1042.out
- Checked BIOS boot parameters are back to normal
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is not optimal, downtime not removed
- Updated Netbox data from PuppetDB
Mentioned in SAL (#wikimedia-cloud) [2022-03-23T22:53:55Z] <wm-bot> Drained 'cloudvirt1045.eqiad.wmnet'. (T281276) - cookbook ran by andrew@buster
Cookbook cookbooks.sre.hosts.reimage started by andrew@cumin1001 for host cloudvirt1043.eqiad.wmnet with OS bullseye completed:
- cloudvirt1043 (WARN)
- Downtimed on Icinga/Alertmanager
- Disabled Puppet
- Removed from Puppet and PuppetDB if present
- Deleted any existing Puppet certificate
- Removed from Debmonitor if present
- Forced PXE for next reboot
- Host rebooted via IPMI
- Host up (Debian installer)
- Host up (new fresh bullseye OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga
- Removed previous downtime on Alertmanager (old OS)
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202203232235_andrew_2977736_cloudvirt1043.out
- Checked BIOS boot parameters are back to normal
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is not optimal, downtime not removed
- Updated Netbox data from PuppetDB
Cookbook cookbooks.sre.hosts.reimage was started by andrew@cumin1001 for host cloudvirt1045.eqiad.wmnet with OS bullseye
Cookbook cookbooks.sre.hosts.reimage was started by andrew@cumin1001 for host cloudvirt1044.eqiad.wmnet with OS bullseye
Cookbook cookbooks.sre.hosts.reimage was started by andrew@cumin1001 for host cloudvirt1046.eqiad.wmnet with OS bullseye
Cookbook cookbooks.sre.hosts.reimage started by andrew@cumin1001 for host cloudvirt1045.eqiad.wmnet with OS bullseye completed:
- cloudvirt1045 (WARN)
- Downtimed on Icinga/Alertmanager
- Disabled Puppet
- Removed from Puppet and PuppetDB if present
- Deleted any existing Puppet certificate
- Removed from Debmonitor if present
- Forced PXE for next reboot
- Host rebooted via IPMI
- Host up (Debian installer)
- Host up (new fresh bullseye OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga
- Removed previous downtime on Alertmanager (old OS)
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202203232348_andrew_2986471_cloudvirt1045.out
- Checked BIOS boot parameters are back to normal
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is not optimal, downtime not removed
- Updated Netbox data from PuppetDB
Cookbook cookbooks.sre.hosts.reimage started by andrew@cumin1001 for host cloudvirt1044.eqiad.wmnet with OS bullseye completed:
- cloudvirt1044 (WARN)
- Downtimed on Icinga/Alertmanager
- Disabled Puppet
- Removed from Puppet and PuppetDB if present
- Deleted any existing Puppet certificate
- Removed from Debmonitor if present
- Forced PXE for next reboot
- Host rebooted via IPMI
- Host up (Debian installer)
- Host up (new fresh bullseye OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga
- Removed previous downtime on Alertmanager (old OS)
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202203232348_andrew_2986467_cloudvirt1044.out
- Checked BIOS boot parameters are back to normal
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is not optimal, downtime not removed
- Updated Netbox data from PuppetDB
Cookbook cookbooks.sre.hosts.reimage started by andrew@cumin1001 for host cloudvirt1046.eqiad.wmnet with OS bullseye completed:
- cloudvirt1046 (WARN)
- Downtimed on Icinga/Alertmanager
- Disabled Puppet
- Removed from Puppet and PuppetDB if present
- Deleted any existing Puppet certificate
- Removed from Debmonitor if present
- Forced PXE for next reboot
- Host rebooted via IPMI
- Host up (Debian installer)
- Host up (new fresh bullseye OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga
- Removed previous downtime on Alertmanager (old OS)
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202203232351_andrew_2986809_cloudvirt1046.out
- Checked BIOS boot parameters are back to normal
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is not optimal, downtime not removed
- Updated Netbox data from PuppetDB
Cookbook cookbooks.sre.hosts.reimage was started by andrew@cumin1001 for host cloudvirt-wdqs1002.eqiad.wmnet with OS bullseye
Cookbook cookbooks.sre.hosts.reimage started by andrew@cumin1001 for host cloudvirt-wdqs1002.eqiad.wmnet with OS bullseye executed with errors:
- cloudvirt-wdqs1002 (FAIL)
- Downtimed on Icinga/Alertmanager
- Disabled Puppet
- Removed from Puppet and PuppetDB if present
- Deleted any existing Puppet certificate
- Removed from Debmonitor if present
- Forced PXE for next reboot
- Host rebooted via IPMI
- The reimage failed, see the cookbook logs for the details
Quick update: the only hosts remaining to upgrade to bullseye are the toolsdb-hosting hypervisors (cloudvirt1019 and cloudvirt1020) and the cloudvirt-wdqs1xxx hosts.
I'm waiting for a consult with data-persistence about the toolsdb hosts.
The cloudvirt-wdqs services are just waiting for a job that's running on one of the hosted VMs to complete.
Cookbook cookbooks.sre.hosts.reimage was started by andrew@cumin1001 for host cloudvirt1020.eqiad.wmnet with OS bullseye
Cookbook cookbooks.sre.hosts.reimage started by andrew@cumin1001 for host cloudvirt1020.eqiad.wmnet with OS bullseye completed:
- cloudvirt1020 (WARN)
- Downtimed on Icinga/Alertmanager
- Disabled Puppet
- Removed from Puppet and PuppetDB if present
- Deleted any existing Puppet certificate
- Removed from Debmonitor if present
- Forced PXE for next reboot
- Host rebooted via IPMI
- Host up (Debian installer)
- Host up (new fresh bullseye OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga/Alertmanager
- Removed previous downtime on Alertmanager (old OS)
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202204181525_andrew_2870264_cloudvirt1020.out
- Checked BIOS boot parameters are back to normal
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is not optimal, downtime not removed
- Updated Netbox data from PuppetDB
Cookbook cookbooks.sre.hosts.reimage was started by andrew@cumin1001 for host cloudvirt1019.eqiad.wmnet with OS bullseye
Cookbook cookbooks.sre.hosts.reimage started by andrew@cumin1001 for host cloudvirt1019.eqiad.wmnet with OS bullseye completed:
- cloudvirt1019 (WARN)
- Downtimed on Icinga/Alertmanager
- Disabled Puppet
- Removed from Puppet and PuppetDB if present
- Deleted any existing Puppet certificate
- Removed from Debmonitor if present
- Forced PXE for next reboot
- Host rebooted via IPMI
- Host up (Debian installer)
- Host up (new fresh bullseye OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga/Alertmanager
- Removed previous downtime on Alertmanager (old OS)
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202204191534_andrew_3615519_cloudvirt1019.out
- Checked BIOS boot parameters are back to normal
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is not optimal, downtime not removed
- Updated Netbox data from PuppetDB
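The reimage summaries in this timeline all open with a line of the form `- <host> (PASS|WARN|FAIL)`, so a per-host tally could be sketched roughly as below. This is only an illustration and not part of the cookbook: the function name `tally_reimage_results` and the `sample` list are hypothetical, and it assumes the timeline has been saved as plain text lines in the same shape shown above.

```python
import re
from collections import Counter

# Matches summary lines such as "- cloudvirt1030 (PASS)".
# Step lines like "- Disabled Puppet" do not match because they
# lack the trailing "(STATUS)" immediately after a single token.
STATUS_LINE = re.compile(r"^- (?P<host>[\w-]+) \((?P<status>PASS|WARN|FAIL)\)$")


def tally_reimage_results(lines):
    """Return {host: status}, keeping the last status seen for each host."""
    results = {}
    for line in lines:
        match = STATUS_LINE.match(line.strip())
        if match:
            results[match.group("host")] = match.group("status")
    return results


# Hypothetical excerpt, copied from lines in the timeline above.
sample = [
    "- cloudvirt1030 (PASS)",
    "- Downtimed on Icinga/Alertmanager",
    "- cloudvirt1031 (WARN)",
    "- Icinga status is not optimal, downtime not removed",
    "- cloudvirt1047 (FAIL)",
    "- cloudvirt-wdqs1002 (FAIL)",
]

results = tally_reimage_results(sample)
counts = Counter(results.values())
print(results)
print(counts)
```

Keeping only the last status per host means a host that failed once and was later reimaged successfully would show its most recent result.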