Cloud VPS Project Tested: N/A, established tech
Site/Location: magru
Number of systems: 2
Service: ncredir
Networking Requirements: external IP, for public traffic rerouting
Processor Requirements: 2
Memory: 4G
Disks: 20G
Other Requirements:
Description
Event Timeline
Cookbook cookbooks.sre.hosts.reimage was started by brett@cumin2002 for host ncredir7001.magru.wmnet with OS bookworm
Cookbook cookbooks.sre.hosts.reimage started by brett@cumin2002 for host ncredir7001.magru.wmnet with OS bookworm executed with errors:
- ncredir7001 (FAIL)
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced PXE for next reboot
- Host rebooted via gnt-instance
- Host up (Debian installer)
- Add puppet_version metadata to Debian installer
- Set boot media to disk
- Host up (new fresh bookworm OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- The reimage failed, see the cookbook logs for the details. You can also try typing "install-console" ncredir7001.magru.wmnet to get a root shell, but depending on the failure this may not work.
Cookbook cookbooks.sre.hosts.reimage was started by brett@cumin2002 for host ncredir7002.magru.wmnet with OS bookworm
Cookbook cookbooks.sre.hosts.reimage started by brett@cumin2002 for host ncredir7002.magru.wmnet with OS bookworm completed:
- ncredir7002 (PASS)
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced PXE for next reboot
- Host rebooted via gnt-instance
- Host up (Debian installer)
- Add puppet_version metadata to Debian installer
- Set boot media to disk
- Host up (new fresh bookworm OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga/Alertmanager
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202405031646_brett_3134898_ncredir7002.out
- configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is optimal
- Icinga downtime removed
- Updated Netbox data from PuppetDB
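The two reimage reports above can be compared step by step to see where the ncredir7001 run stopped. The following is a minimal Python sketch, not part of the cookbook itself: the step lists are copied from the reports (the passing list is truncated for brevity), and the helper name `first_missing_step` is hypothetical. It only shows where the two logs diverge, not the root cause, which would be in the cookbook logs.

```python
# Steps reported by both the failed (ncredir7001) and passing (ncredir7002) runs.
COMMON_STEPS = [
    "Removed from Puppet and PuppetDB if present and deleted any certificates",
    "Removed from Debmonitor if present",
    "Forced PXE for next reboot",
    "Host rebooted via gnt-instance",
    "Host up (Debian installer)",
    "Add puppet_version metadata to Debian installer",
    "Set boot media to disk",
    "Host up (new fresh bookworm OS)",
    "Generated Puppet certificate",
    "Signed new Puppet certificate",
    "Run Puppet in NOOP mode to populate exported resources in PuppetDB",
]

# The failed run reported nothing further before the failure message.
FAILED_STEPS = list(COMMON_STEPS)

# The passing run continued (list truncated here; see the full report above).
PASSED_STEPS = COMMON_STEPS + [
    "Found Nagios_host resource for this host in PuppetDB",
    "Downtimed the new host on Icinga/Alertmanager",
]


def first_missing_step(failed, passed):
    """Return the first step of `passed` that `failed` did not report, or None."""
    for index, step in enumerate(passed):
        if index >= len(failed) or failed[index] != step:
            return step
    return None


if __name__ == "__main__":
    print(first_missing_step(FAILED_STEPS, PASSED_STEPS))
```

Under these assumptions, the first step absent from the failed report is the one that follows the NOOP Puppet run in the passing report.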