Event Timeline
Change #1014057 had a related patch set uploaded (by Cwhite; author: Cwhite):
[operations/puppet@production] beta-logs: add ssd-0[123] host configs
Change #1014057 merged by Cwhite:
[operations/puppet@production] beta-logs: add ssd-0[123] host configs
Change #1014062 had a related patch set uploaded (by Cwhite; author: Cwhite):
[operations/puppet@production] beta-logs: replace logging-logstash-01 with -03
Change #1014062 merged by Cwhite:
[operations/puppet@production] beta-logs: replace logging-logstash-01 with -03
Change #1014063 had a related patch set uploaded (by Cwhite; author: Cwhite):
[operations/puppet@production] logstash: enable openjdk-17 support
Change #1014063 merged by Cwhite:
[operations/puppet@production] logstash: enable openjdk-17 support
Change #1014064 had a related patch set uploaded (by Cwhite; author: Cwhite):
[operations/puppet@production] logstash: introduce java_package option
Change #1014064 merged by Cwhite:
[operations/puppet@production] logstash: introduce java_package option
Change #1014664 had a related patch set uploaded (by Cwhite; author: Cwhite):
[operations/puppet@production] beta-logs: move jobs host duties to logging-logstash-03
Change #1014664 merged by Cwhite:
[operations/puppet@production] beta-logs: move jobs host duties to logging-logstash-03
Cookbook cookbooks.sre.hosts.reimage was started by cwhite@cumin2002 for host logging-hd2001.codfw.wmnet with OS bookworm
Cookbook cookbooks.sre.hosts.reimage started by cwhite@cumin2002 for host logging-hd2001.codfw.wmnet with OS bookworm executed with errors:
- logging-hd2001 (FAIL)
- Downtimed on Icinga/Alertmanager
- Disabled Puppet
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced PXE for next reboot
- Host rebooted via IPMI
- The reimage failed, see the cookbook logs for the details. You can also try typing "install-console" logging-hd2001.codfw.wmnet to get a root shell, but depending on the failure this may not work.
Cookbook cookbooks.sre.hosts.reimage was started by cwhite@cumin2002 for host logging-hd2001.codfw.wmnet with OS bookworm
Cookbook cookbooks.sre.hosts.reimage started by cwhite@cumin2002 for host logging-hd2001.codfw.wmnet with OS bookworm executed with errors:
- logging-hd2001 (FAIL)
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced PXE for next reboot
- Host rebooted via IPMI
- The reimage failed, see the cookbook logs for the details. You can also try typing "install-console" logging-hd2001.codfw.wmnet to get a root shell, but depending on the failure this may not work.
Cookbook cookbooks.sre.hosts.reimage was started by cwhite@cumin2002 for host logging-hd2001.codfw.wmnet with OS bookworm
Cookbook cookbooks.sre.hosts.reimage started by cwhite@cumin2002 for host logging-hd2001.codfw.wmnet with OS bookworm completed:
- logging-hd2001 (PASS)
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced PXE for next reboot
- Host rebooted via IPMI
- Host up (Debian installer)
- Add puppet_version metadata to Debian installer
- Checked BIOS boot parameters are back to normal
- Host up (new fresh bookworm OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga/Alertmanager
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202404022205_cwhite_3900766_logging-hd2001.out
- configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is optimal
- Icinga downtime removed
- Updated Netbox data from PuppetDB
- Cleared switch DHCP cache and MAC table for the host IP and MAC (EVPN Switch)
Cookbook cookbooks.sre.hosts.reimage was started by cwhite@cumin2002 for host logging-hd2003.codfw.wmnet with OS bookworm
Cookbook cookbooks.sre.hosts.reimage started by cwhite@cumin2002 for host logging-hd2003.codfw.wmnet with OS bookworm executed with errors:
- logging-hd2003 (FAIL)
- Downtimed on Icinga/Alertmanager
- Disabled Puppet
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced PXE for next reboot
- Host rebooted via IPMI
- The reimage failed, see the cookbook logs for the details. You can also try typing "install-console" logging-hd2003.codfw.wmnet to get a root shell, but depending on the failure this may not work.
Cookbook cookbooks.sre.hosts.reimage was started by cwhite@cumin2002 for host logging-hd2003.codfw.wmnet with OS bookworm
Cookbook cookbooks.sre.hosts.reimage was started by cwhite@cumin2002 for host logging-hd2002.codfw.wmnet with OS bookworm
Cookbook cookbooks.sre.hosts.reimage started by cwhite@cumin2002 for host logging-hd2003.codfw.wmnet with OS bookworm completed:
- logging-hd2003 (PASS)
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced PXE for next reboot
- Host rebooted via IPMI
- Host up (Debian installer)
- Add puppet_version metadata to Debian installer
- Checked BIOS boot parameters are back to normal
- Host up (new fresh bookworm OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga/Alertmanager
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202404030003_cwhite_4011899_logging-hd2003.out
- configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is optimal
- Icinga downtime removed
- Updated Netbox data from PuppetDB
Cookbook cookbooks.sre.hosts.reimage started by cwhite@cumin2002 for host logging-hd2002.codfw.wmnet with OS bookworm executed with errors:
- logging-hd2002 (FAIL)
- Downtimed on Icinga/Alertmanager
- Disabled Puppet
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced PXE for next reboot
- Host rebooted via IPMI
- The reimage failed, see the cookbook logs for the details. You can also try typing "install-console" logging-hd2002.codfw.wmnet to get a root shell, but depending on the failure this may not work.
Cookbook cookbooks.sre.hosts.reimage was started by cwhite@cumin2002 for host logging-hd2002.codfw.wmnet with OS bookworm
Cookbook cookbooks.sre.hosts.reimage started by cwhite@cumin2002 for host logging-hd2002.codfw.wmnet with OS bookworm completed:
- logging-hd2002 (PASS)
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced PXE for next reboot
- Host rebooted via IPMI
- Host up (Debian installer)
- Add puppet_version metadata to Debian installer
- Checked BIOS boot parameters are back to normal
- Host up (new fresh bookworm OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga/Alertmanager
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202404030119_cwhite_4083457_logging-hd2002.out
- configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is optimal
- Icinga downtime removed
- Updated Netbox data from PuppetDB