Bot managed by SRE for automated interaction with Phabricator from monitoring tools.
User Details
- User Since: Aug 12 2016, 1:45 PM (407 w, 2 d)
- Roles: Bot
- Availability: Available
- LDAP User: Unknown
- MediaWiki User: Unknown
Today
Cookbook cookbooks.sre.hosts.reimage was started by jiji@cumin1002 for host mc-gp1001.eqiad.wmnet with OS bookworm
Cookbook cookbooks.sre.hosts.reimage started by jiji@cumin1002 for host mc-gp1002.eqiad.wmnet with OS bookworm completed:
- mc-gp1002 (PASS)
- Downtimed on Icinga/Alertmanager
- Disabled Puppet
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced PXE for next reboot
- Host rebooted via IPMI
- Host up (Debian installer)
- Add puppet_version metadata to Debian installer
- Checked BIOS boot parameters are back to normal
- Host up (new fresh bookworm OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga/Alertmanager
- Removed previous downtime on Alertmanager (old OS)
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202406031224_jiji_270944_mc-gp1002.out
- configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is optimal
- Icinga downtime removed
- Updated Netbox data from PuppetDB
Cookbook cookbooks.sre.hosts.reimage was started by jiji@cumin1002 for host mc-gp1002.eqiad.wmnet with OS bookworm
Cookbook cookbooks.sre.hosts.reimage started by jiji@cumin1002 for host mc1037.eqiad.wmnet with OS bookworm completed:
- mc1037 (PASS)
- Downtimed on Icinga/Alertmanager
- Disabled Puppet
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced PXE for next reboot
- Host rebooted via IPMI
- Host up (Debian installer)
- Add puppet_version metadata to Debian installer
- Checked BIOS boot parameters are back to normal
- Host up (new fresh bookworm OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga/Alertmanager
- Removed previous downtime on Alertmanager (old OS)
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202406031107_jiji_262393_mc1037.out
- configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is optimal
- Icinga downtime removed
- Updated Netbox data from PuppetDB
Cookbook cookbooks.sre.hosts.reimage was started by jiji@cumin1002 for host mc1037.eqiad.wmnet with OS bookworm
Cookbook cookbooks.sre.hosts.reimage started by jiji@cumin1002 for host mc1038.eqiad.wmnet with OS bookworm completed:
- mc1038 (PASS)
- Downtimed on Icinga/Alertmanager
- Disabled Puppet
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced PXE for next reboot
- Host rebooted via IPMI
- Host up (Debian installer)
- Add puppet_version metadata to Debian installer
- Checked BIOS boot parameters are back to normal
- Host up (new fresh bookworm OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga/Alertmanager
- Removed previous downtime on Alertmanager (old OS)
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202406031021_jiji_254118_mc1038.out
- configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is optimal
- Icinga downtime removed
- Updated Netbox data from PuppetDB
Cookbook cookbooks.sre.hosts.reimage started by btullis@cumin1002 for host snapshot1013.eqiad.wmnet with OS bullseye completed:
- snapshot1013 (PASS)
- Downtimed on Icinga/Alertmanager
- Disabled Puppet
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced PXE for next reboot
- Host rebooted via IPMI
- Host up (Debian installer)
- Add puppet_version metadata to Debian installer
- Checked BIOS boot parameters are back to normal
- Host up (new fresh bullseye OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga/Alertmanager
- Removed previous downtime on Alertmanager (old OS)
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202406031002_btullis_250840_snapshot1013.out
- configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is optimal
- Icinga downtime removed
- Updated Netbox data from PuppetDB
Cookbook cookbooks.sre.hosts.reimage was started by jiji@cumin1002 for host mc1038.eqiad.wmnet with OS bookworm
Cookbook cookbooks.sre.hosts.reimage started by jiji@cumin2002 for host mc-gp2001.codfw.wmnet with OS bookworm executed with errors:
- mc-gp2001 (FAIL)
- Downtimed on Icinga/Alertmanager
- Disabled Puppet
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced PXE for next reboot
- Host rebooted via IPMI
- Host up (Debian installer)
- Add puppet_version metadata to Debian installer
- Checked BIOS boot parameters are back to normal
- Host up (new fresh bookworm OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga/Alertmanager
- Removed previous downtime on Alertmanager (old OS)
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202406030931_jiji_3618044_mc-gp2001.out
- configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is optimal
- Icinga downtime removed
- Updated Netbox data from PuppetDB
- Cleared switch DHCP cache and MAC table for the host IP and MAC (EVPN Switch)
- Cleared switch DHCP cache and MAC table for the host IP and MAC (EVPN Switch)
- The reimage failed, see the cookbook logs for the details. You can also try typing "install-console" mc-gp2001.codfw.wmnet to get a root shell, but depending on the failure this may not work.
Cookbook cookbooks.sre.hosts.reimage started by jiji@cumin2002 for host mc-gp2001.codfw.wmnet with OS bookworm completed:
- mc-gp2001 (PASS)
- Downtimed on Icinga/Alertmanager
- Disabled Puppet
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced PXE for next reboot
- Host rebooted via IPMI
- Host up (Debian installer)
- Add puppet_version metadata to Debian installer
- Checked BIOS boot parameters are back to normal
- Host up (new fresh bookworm OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga/Alertmanager
- Removed previous downtime on Alertmanager (old OS)
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202406030931_jiji_3618044_mc-gp2001.out
- configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is optimal
- Icinga downtime removed
- Updated Netbox data from PuppetDB
- Cleared switch DHCP cache and MAC table for the host IP and MAC (EVPN Switch)
Cookbook cookbooks.sre.hosts.reimage was started by btullis@cumin1002 for host snapshot1013.eqiad.wmnet with OS bullseye
Cookbook cookbooks.sre.hosts.reimage started by jiji@cumin1002 for host mc1039.eqiad.wmnet with OS bookworm completed:
- mc1039 (WARN)
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced PXE for next reboot
- Host rebooted via IPMI
- Host up (Debian installer)
- Add puppet_version metadata to Debian installer
- Checked BIOS boot parameters are back to normal
- Host up (new fresh bookworm OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga/Alertmanager
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202406030925_jiji_245315_mc1039.out
- Unable to run puppet on config-master2001.codfw.wmnet,config-master1001.eqiad.wmnet to update configmaster.wikimedia.org with the new host SSH public key for wmf-update-known-hosts-production
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is optimal
- Icinga downtime removed
- Updated Netbox data from PuppetDB
Cookbook cookbooks.sre.hosts.reimage was started by jiji@cumin2002 for host mc-gp2001.codfw.wmnet with OS bookworm
Cookbook cookbooks.sre.hosts.reimage was started by jiji@cumin1002 for host mc1039.eqiad.wmnet with OS bookworm
Cookbook cookbooks.sre.hosts.reimage started by jiji@cumin2002 for host mc-gp2002.codfw.wmnet with OS bookworm completed:
- mc-gp2002 (PASS)
- Downtimed on Icinga/Alertmanager
- Disabled Puppet
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced PXE for next reboot
- Host rebooted via IPMI
- Host up (Debian installer)
- Add puppet_version metadata to Debian installer
- Checked BIOS boot parameters are back to normal
- Host up (new fresh bookworm OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga/Alertmanager
- Removed previous downtime on Alertmanager (old OS)
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202406030831_jiji_3559775_mc-gp2002.out
- configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is optimal
- Icinga downtime removed
- Updated Netbox data from PuppetDB
- Cleared switch DHCP cache and MAC table for the host IP and MAC (EVPN Switch)
Cookbook cookbooks.sre.hosts.reimage started by jiji@cumin1002 for host mc-gp1003.eqiad.wmnet with OS bookworm completed:
- mc-gp1003 (PASS)
- Downtimed on Icinga/Alertmanager
- Disabled Puppet
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced PXE for next reboot
- Host rebooted via IPMI
- Host up (Debian installer)
- Add puppet_version metadata to Debian installer
- Checked BIOS boot parameters are back to normal
- Host up (new fresh bookworm OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga/Alertmanager
- Removed previous downtime on Alertmanager (old OS)
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202406030828_jiji_236110_mc-gp1003.out
- configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is optimal
- Icinga downtime removed
- Updated Netbox data from PuppetDB
Cookbook cookbooks.sre.hosts.reimage was started by jiji@cumin1002 for host mc-gp1003.eqiad.wmnet with OS bookworm
Cookbook cookbooks.sre.hosts.reimage was started by jiji@cumin2002 for host mc-gp2002.codfw.wmnet with OS bookworm
Sat, Jun 1
Cookbook cookbooks.sre.hosts.reimage started by akosiaris@cumin1002 for host kafka-main1010.eqiad.wmnet with OS bullseye completed:
- kafka-main1010 (PASS)
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced PXE for next reboot
- Host rebooted via IPMI
- Host up (Debian installer)
- Add puppet_version metadata to Debian installer
- Checked BIOS boot parameters are back to normal
- Host up (new fresh bullseye OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga/Alertmanager
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202406010720_akosiaris_4078484_kafka-main1010.out
- configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is optimal
- Icinga downtime removed
- Updated Netbox data from PuppetDB
- Updated Netbox status planned -> active
- The sre.puppet.sync-netbox-hiera cookbook was run successfully
- Cleared switch DHCP cache and MAC table for the host IP and MAC (EVPN Switch)
Cookbook cookbooks.sre.hosts.reimage was started by akosiaris@cumin1002 for host kafka-main1010.eqiad.wmnet with OS bullseye
Fri, May 31
Cookbook cookbooks.sre.hosts.reimage was started by akosiaris@cumin1002 for host kafka-main1010.eqiad.wmnet with OS bullseye
Cookbook cookbooks.sre.hosts.reimage started by marostegui@cumin1002 for host db1209.eqiad.wmnet with OS bookworm completed:
- db1209 (PASS)
- Downtimed on Icinga/Alertmanager
- Disabled Puppet
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced PXE for next reboot
- Host rebooted via IPMI
- Host up (Debian installer)
- Add puppet_version metadata to Debian installer
- Checked BIOS boot parameters are back to normal
- Host up (new fresh bookworm OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga/Alertmanager
- Removed previous downtime on Alertmanager (old OS)
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202405311209_marostegui_3930866_db1209.out
- configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is optimal
- Icinga downtime removed
- Updated Netbox data from PuppetDB
Cookbook cookbooks.sre.hosts.reimage started by jiji@cumin2002 for host mc2038.codfw.wmnet with OS bookworm completed:
- mc2038 (PASS)
- Downtimed on Icinga/Alertmanager
- Disabled Puppet
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced PXE for next reboot
- Host rebooted via IPMI
- Host up (Debian installer)
- Add puppet_version metadata to Debian installer
- Checked BIOS boot parameters are back to normal
- Host up (new fresh bookworm OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga/Alertmanager
- Removed previous downtime on Alertmanager (old OS)
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202405311203_jiji_3776530_mc2038.out
- configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is optimal
- Icinga downtime removed
- Updated Netbox data from PuppetDB
- Cleared switch DHCP cache and MAC table for the host IP and MAC (EVPN Switch)
Cookbook cookbooks.sre.hosts.reimage was started by marostegui@cumin1002 for host db1209.eqiad.wmnet with OS bookworm
Cookbook cookbooks.sre.hosts.reimage started by jiji@cumin1002 for host mc1039.eqiad.wmnet with OS bookworm executed with errors:
- mc1039 (FAIL)
- Downtimed on Icinga/Alertmanager
- Disabled Puppet
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced PXE for next reboot
- Host rebooted via IPMI
- Host up (Debian installer)
- Add puppet_version metadata to Debian installer
- Checked BIOS boot parameters are back to normal
- New OS is bullseye but bookworm was requested
- The reimage failed, see the cookbook logs for the details. You can also try typing "install-console" mc1039.eqiad.wmnet to get a root shell, but depending on the failure this may not work.
Cookbook cookbooks.sre.hosts.reimage was started by jiji@cumin2002 for host mc2038.codfw.wmnet with OS bookworm
Cookbook cookbooks.sre.hosts.reimage started by jiji@cumin2002 for host mc2039.codfw.wmnet with OS bookworm completed:
- mc2039 (PASS)
- Downtimed on Icinga/Alertmanager
- Disabled Puppet
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced PXE for next reboot
- Host rebooted via IPMI
- Host up (Debian installer)
- Add puppet_version metadata to Debian installer
- Checked BIOS boot parameters are back to normal
- Host up (new fresh bookworm OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga/Alertmanager
- Removed previous downtime on Alertmanager (old OS)
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202405311109_jiji_3721618_mc2039.out
- configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is optimal
- Icinga downtime removed
- Updated Netbox data from PuppetDB
- Cleared switch DHCP cache and MAC table for the host IP and MAC (EVPN Switch)
Cookbook cookbooks.sre.hosts.reimage was started by jiji@cumin2002 for host mc2039.codfw.wmnet with OS bookworm
Cookbook cookbooks.sre.hosts.reimage was started by jiji@cumin1002 for host mc1039.eqiad.wmnet with OS bookworm
Cookbook cookbooks.sre.hosts.reimage started by jiji@cumin2002 for host mc2040.codfw.wmnet with OS bookworm completed:
- mc2040 (PASS)
- Downtimed on Icinga/Alertmanager
- Disabled Puppet
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced PXE for next reboot
- Host rebooted via IPMI
- Host up (Debian installer)
- Add puppet_version metadata to Debian installer
- Checked BIOS boot parameters are back to normal
- Host up (new fresh bookworm OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga/Alertmanager
- Removed previous downtime on Alertmanager (old OS)
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202405311003_jiji_3657011_mc2040.out
- configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is optimal
- Icinga downtime removed
- Updated Netbox data from PuppetDB
- Cleared switch DHCP cache and MAC table for the host IP and MAC (EVPN Switch)
Cookbook cookbooks.sre.hosts.reimage started by jiji@cumin1002 for host mc1040.eqiad.wmnet with OS bookworm completed:
- mc1040 (PASS)
- Downtimed on Icinga/Alertmanager
- Disabled Puppet
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced PXE for next reboot
- Host rebooted via IPMI
- Host up (Debian installer)
- Add puppet_version metadata to Debian installer
- Checked BIOS boot parameters are back to normal
- Host up (new fresh bookworm OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga/Alertmanager
- Removed previous downtime on Alertmanager (old OS)
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202405310958_jiji_3912762_mc1040.out
- configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is optimal
- Icinga downtime removed
- Updated Netbox data from PuppetDB
Cookbook cookbooks.sre.hosts.reimage was started by jiji@cumin2002 for host mc2040.codfw.wmnet with OS bookworm
Cookbook cookbooks.sre.hosts.reimage was started by jiji@cumin1002 for host mc1040.eqiad.wmnet with OS bookworm
Cookbook cookbooks.sre.hosts.reimage started by jiji@cumin2002 for host mc2041.codfw.wmnet with OS bookworm completed:
- mc2041 (PASS)
- Downtimed on Icinga/Alertmanager
- Disabled Puppet
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced PXE for next reboot
- Host rebooted via IPMI
- Host up (Debian installer)
- Add puppet_version metadata to Debian installer
- Checked BIOS boot parameters are back to normal
- Host up (new fresh bookworm OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga/Alertmanager
- Removed previous downtime on Alertmanager (old OS)
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202405310908_jiji_3601887_mc2041.out
- configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is optimal
- Icinga downtime removed
- Updated Netbox data from PuppetDB
- Cleared switch DHCP cache and MAC table for the host IP and MAC (EVPN Switch)
Cookbook cookbooks.sre.hosts.reimage started by jiji@cumin1002 for host mc1041.eqiad.wmnet with OS bookworm completed:
- mc1041 (PASS)
- Downtimed on Icinga/Alertmanager
- Disabled Puppet
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced PXE for next reboot
- Host rebooted via IPMI
- Host up (Debian installer)
- Add puppet_version metadata to Debian installer
- Checked BIOS boot parameters are back to normal
- Host up (new fresh bookworm OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga/Alertmanager
- Removed previous downtime on Alertmanager (old OS)
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202405310903_jiji_3902367_mc1041.out
- configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is optimal
- Icinga downtime removed
- Updated Netbox data from PuppetDB
Cookbook cookbooks.sre.hosts.reimage was started by jiji@cumin2002 for host mc2041.codfw.wmnet with OS bookworm
Cookbook cookbooks.sre.hosts.reimage was started by jiji@cumin1002 for host mc1041.eqiad.wmnet with OS bookworm
Cookbook cookbooks.sre.hosts.reimage started by jiji@cumin2002 for host mc2042.codfw.wmnet with OS bookworm completed:
- mc2042 (PASS)
- Downtimed on Icinga/Alertmanager
- Disabled Puppet
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced PXE for next reboot
- Host rebooted via IPMI
- Host up (Debian installer)
- Add puppet_version metadata to Debian installer
- Checked BIOS boot parameters are back to normal
- Host up (new fresh bookworm OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga/Alertmanager
- Removed previous downtime on Alertmanager (old OS)
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202405310802_jiji_3536165_mc2042.out
- configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is optimal
- Icinga downtime removed
- Updated Netbox data from PuppetDB
- Cleared switch DHCP cache and MAC table for the host IP and MAC (EVPN Switch)
Cookbook cookbooks.sre.hosts.reimage started by jiji@cumin1002 for host mc1042.eqiad.wmnet with OS bookworm completed:
- mc1042 (PASS)
- Downtimed on Icinga/Alertmanager
- Disabled Puppet
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced PXE for next reboot
- Host rebooted via IPMI
- Host up (Debian installer)
- Add puppet_version metadata to Debian installer
- Checked BIOS boot parameters are back to normal
- Host up (new fresh bookworm OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga/Alertmanager
- Removed previous downtime on Alertmanager (old OS)
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202405310756_jiji_3893081_mc1042.out
- configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is optimal
- Icinga downtime removed
- Updated Netbox data from PuppetDB
Cookbook cookbooks.sre.hosts.reimage was started by jiji@cumin2002 for host mc2042.codfw.wmnet with OS bookworm
Cookbook cookbooks.sre.hosts.reimage was started by jiji@cumin1002 for host mc1042.eqiad.wmnet with OS bookworm
Icinga downtime and Alertmanager silence (ID=418e7ac9-ef32-4b52-9f19-645af27090e2) set by mvernon@cumin1002 for 7 days, 0:00:00 on 1 host(s) and their services with reason: in development
moss-fe1002.eqiad.wmnet
Cookbook cookbooks.sre.hosts.reimage started by jiji@cumin2002 for host mc2043.codfw.wmnet with OS bookworm completed:
- mc2043 (PASS)
- Downtimed on Icinga/Alertmanager
- Disabled Puppet
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced PXE for next reboot
- Host rebooted via IPMI
- Host up (Debian installer)
- Add puppet_version metadata to Debian installer
- Checked BIOS boot parameters are back to normal
- Host up (new fresh bookworm OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga/Alertmanager
- Removed previous downtime on Alertmanager (old OS)
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202405310641_jiji_3456016_mc2043.out
- configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is optimal
- Icinga downtime removed
- Updated Netbox data from PuppetDB
- Cleared switch DHCP cache and MAC table for the host IP and MAC (EVPN Switch)
Cookbook cookbooks.sre.hosts.reimage started by jiji@cumin1002 for host mc1043.eqiad.wmnet with OS bookworm completed:
- mc1043 (PASS)
- Downtimed on Icinga/Alertmanager
- Disabled Puppet
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced PXE for next reboot
- Host rebooted via IPMI
- Host up (Debian installer)
- Add puppet_version metadata to Debian installer
- Checked BIOS boot parameters are back to normal
- Host up (new fresh bookworm OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga/Alertmanager
- Removed previous downtime on Alertmanager (old OS)
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202405310636_jiji_3881045_mc1043.out
- configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is optimal
- Icinga downtime removed
- Updated Netbox data from PuppetDB
Cookbook cookbooks.sre.hosts.reimage was started by jiji@cumin1002 for host mc1043.eqiad.wmnet with OS bookworm
Cookbook cookbooks.sre.hosts.reimage was started by jiji@cumin2002 for host mc2043.codfw.wmnet with OS bookworm
Thu, May 30
Cookbook cookbooks.sre.hosts.reimage started by akosiaris@cumin1002 for host kafka-main1009.eqiad.wmnet with OS bullseye completed:
- kafka-main1009 (PASS)
- Downtimed on Icinga/Alertmanager
- Disabled Puppet
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced PXE for next reboot
- Host rebooted via IPMI
- Host up (Debian installer)
- Add puppet_version metadata to Debian installer
- Checked BIOS boot parameters are back to normal
- Host up (new fresh bullseye OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga/Alertmanager
- Removed previous downtime on Alertmanager (old OS)
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202405301558_akosiaris_3731074_kafka-main1009.out
- configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is optimal
- Icinga downtime removed
- Updated Netbox data from PuppetDB
Cookbook cookbooks.sre.hosts.reimage was started by akosiaris@cumin1002 for host kafka-main1009.eqiad.wmnet with OS bullseye
Cookbook cookbooks.sre.hosts.reimage was started by jclark@cumin1002 for host kafka-main1010.eqiad.wmnet with OS bullseye
Cookbook cookbooks.sre.hosts.reimage started by jclark@cumin1002 for host kafka-main1010.eqiad.wmnet with OS bullseye executed with errors:
- kafka-main1010 (FAIL)
- Downtimed on Icinga/Alertmanager
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- The reimage failed, see the cookbook logs for the details. You can also try typing "install-console" kafka-main1010.eqiad.wmnet to get a root shell, but depending on the failure this may not work.
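Each finished run in this feed opens with a "started by ... completed:" or "started by ... executed with errors:" line, followed by a "- host (PASS|WARN|FAIL)" line. A minimal sketch for tallying outcomes from a copy of the feed could look like the following. The regular expressions are derived only from the lines visible here, and the `summarize` function name is hypothetical.

```python
import re
from collections import Counter

# Header of a finished run; "was started by" notices deliberately do not match.
HEADER = re.compile(
    r"^Cookbook (?P<cookbook>\S+) started by (?P<user>[^@\s]+)@(?P<cumin>\S+) "
    r"for host (?P<fqdn>\S+) with OS (?P<os>\S+) "
    r"(?P<ending>completed|executed with errors):$"
)
# First bullet under a header, e.g. "- mc2044 (PASS)".
STATUS = re.compile(r"^- (?P<host>\S+) \((?P<status>PASS|WARN|FAIL)\)$")


def summarize(lines):
    """Collect one dict per finished run, with its PASS/WARN/FAIL status."""
    runs = []
    current = None
    for raw in lines:
        line = raw.strip()
        header = HEADER.match(line)
        if header:
            current = header.groupdict()
            current["status"] = None
            runs.append(current)
            continue
        status = STATUS.match(line)
        # Only the first status bullet after a header belongs to that run.
        if status and current is not None and current["status"] is None:
            current["status"] = status.group("status")
    return runs


if __name__ == "__main__":
    import sys

    runs = summarize(sys.stdin)
    print(Counter(run["status"] for run in runs))
```

Run as a script, it reads the feed text from standard input and prints a count per status. Grouping by `fqdn` instead would show hosts that needed several attempts.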
Cookbook cookbooks.sre.hosts.reimage was started by jclark@cumin1002 for host kafka-main1010.eqiad.wmnet with OS bullseye
Cookbook cookbooks.sre.hosts.reimage started by jiji@cumin2002 for host mc2044.codfw.wmnet with OS bookworm completed:
- mc2044 (PASS)
- Downtimed on Icinga/Alertmanager
- Disabled Puppet
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced PXE for next reboot
- Host rebooted via IPMI
- Host up (Debian installer)
- Add puppet_version metadata to Debian installer
- Checked BIOS boot parameters are back to normal
- Host up (new fresh bookworm OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga/Alertmanager
- Removed previous downtime on Alertmanager (old OS)
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202405301325_jiji_2436033_mc2044.out
- configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is optimal
- Icinga downtime removed
- Updated Netbox data from PuppetDB
- Cleared switch DHCP cache and MAC table for the host IP and MAC (EVPN Switch)
Cookbook cookbooks.sre.hosts.reimage started by jiji@cumin1002 for host mc1044.eqiad.wmnet with OS bookworm completed:
- mc1044 (PASS)
- Downtimed on Icinga/Alertmanager
- Disabled Puppet
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced PXE for next reboot
- Host rebooted via IPMI
- Host up (Debian installer)
- Add puppet_version metadata to Debian installer
- Checked BIOS boot parameters are back to normal
- Host up (new fresh bookworm OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga/Alertmanager
- Removed previous downtime on Alertmanager (old OS)
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202405301320_jiji_3707785_mc1044.out
- configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is optimal
- Icinga downtime removed
- Updated Netbox data from PuppetDB
Cookbook cookbooks.sre.hosts.reimage was started by jiji@cumin1002 for host mc1044.eqiad.wmnet with OS bookworm
Cookbook cookbooks.sre.hosts.reimage was started by jiji@cumin2002 for host mc2044.codfw.wmnet with OS bookworm
Cookbook cookbooks.sre.hosts.reimage started by marostegui@cumin1002 for host db1215.eqiad.wmnet with OS bookworm completed:
- db1215 (PASS)
- Downtimed on Icinga/Alertmanager
- Disabled Puppet
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced PXE for next reboot
- Host rebooted via IPMI
- Host up (Debian installer)
- Add puppet_version metadata to Debian installer
- Checked BIOS boot parameters are back to normal
- Host up (new fresh bookworm OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga/Alertmanager
- Removed previous downtime on Alertmanager (old OS)
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202405300905_marostegui_3669917_db1215.out
- configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is optimal
- Icinga downtime removed
- Updated Netbox data from PuppetDB
Cookbook cookbooks.sre.hosts.reimage was started by marostegui@cumin1002 for host db1215.eqiad.wmnet with OS bookworm
Wed, May 29
Cookbook cookbooks.sre.hosts.reimage started by jclark@cumin1002 for host kafka-main1009.eqiad.wmnet with OS bullseye completed:
- kafka-main1009 (WARN)
- Downtimed on Icinga/Alertmanager
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga/Alertmanager
- Removed previous downtime on Alertmanager (old OS)
- First Puppet run failed and logged in /var/log/spicerack/sre/hosts/reimage/202405292206_jclark_3589235_kafka-main1009.out, asking the operator what to do
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202405292206_jclark_3589235_kafka-main1009.out
- configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is optimal
- Icinga downtime removed
- Updated Netbox data from PuppetDB
Cookbook cookbooks.sre.hosts.reimage started by jclark@cumin1002 for host kafka-main1009.eqiad.wmnet with OS bullseye completed:
- kafka-main1009 (WARN)
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga/Alertmanager
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202405291635_jclark_3540615_kafka-main1009.out
- configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is optimal
- Icinga downtime removed
- Updated Netbox data from PuppetDB
- Updated Netbox status planned -> active
- Failed to run the sre.puppet.sync-netbox-hiera cookbook, run it manually
Cookbook cookbooks.sre.hosts.reimage was started by jclark@cumin1002 for host kafka-main1010.eqiad.wmnet with OS bullseye
Cookbook cookbooks.sre.hosts.reimage was started by jclark@cumin1002 for host kafka-main1009.eqiad.wmnet with OS bullseye
Cookbook cookbooks.sre.hosts.reimage started by andrew@cumin1002 for host cloudvirt1041.eqiad.wmnet with OS bookworm completed:
- cloudvirt1041 (WARN)
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced PXE for next reboot
- Host rebooted via IPMI
- Host up (Debian installer)
- Add puppet_version metadata to Debian installer
- Checked BIOS boot parameters are back to normal
- Host up (new fresh bookworm OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga/Alertmanager
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202405291726_andrew_3548421_cloudvirt1041.out
- configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is not optimal, downtime not removed
- Updated Netbox data from PuppetDB
- Updated Netbox status failed -> active
- The sre.puppet.sync-netbox-hiera cookbook was run successfully
Cookbook cookbooks.sre.hosts.reimage was started by andrew@cumin1002 for host cloudvirt1041.eqiad.wmnet with OS bookworm
Cookbook cookbooks.sre.hosts.reimage started by jiji@cumin2002 for host mc2045.codfw.wmnet with OS bookworm completed:
- mc2045 (PASS)
- Downtimed on Icinga/Alertmanager
- Disabled Puppet
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced PXE for next reboot
- Host rebooted via IPMI
- Host up (Debian installer)
- Add puppet_version metadata to Debian installer
- Checked BIOS boot parameters are back to normal
- Host up (new fresh bookworm OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga/Alertmanager
- Removed previous downtime on Alertmanager (old OS)
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202405291622_jiji_1210815_mc2045.out
- configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is optimal
- Icinga downtime removed
- Updated Netbox data from PuppetDB
- Cleared switch DHCP cache and MAC table for the host IP and MAC (EVPN Switch)
Cookbook cookbooks.sre.hosts.reimage started by jiji@cumin1002 for host mc1045.eqiad.wmnet with OS bookworm completed:
- mc1045 (PASS)
- Downtimed on Icinga/Alertmanager
- Disabled Puppet
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced PXE for next reboot
- Host rebooted via IPMI
- Host up (Debian installer)
- Add puppet_version metadata to Debian installer
- Checked BIOS boot parameters are back to normal
- Host up (new fresh bookworm OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga/Alertmanager
- Removed previous downtime on Alertmanager (old OS)
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202405291617_jiji_3536831_mc1045.out
- configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is optimal
- Icinga downtime removed
- Updated Netbox data from PuppetDB
Cookbook cookbooks.sre.hosts.reimage was started by jclark@cumin1002 for host kafka-main1009.eqiad.wmnet with OS bullseye
Cookbook cookbooks.sre.hosts.reimage was started by jclark@cumin1002 for host kafka-main1010.eqiad.wmnet with OS bullseye
Cookbook cookbooks.sre.hosts.reimage was started by jiji@cumin1002 for host mc1045.eqiad.wmnet with OS bookworm
Cookbook cookbooks.sre.hosts.reimage was started by jiji@cumin2002 for host mc2045.codfw.wmnet with OS bookworm
Cookbook cookbooks.sre.hosts.reimage was started by andrew@cumin1002 for host cloudvirt1041.eqiad.wmnet with OS bookworm
Cookbook cookbooks.sre.hosts.reimage started by jclark@cumin1002 for host kafka-main1009.eqiad.wmnet with OS bullseye executed with errors:
- kafka-main1009 (FAIL)
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced PXE for next reboot
- Host rebooted via IPMI
- Host up (Debian installer)
- Add puppet_version metadata to Debian installer
- Checked BIOS boot parameters are back to normal
- The reimage failed, see the cookbook logs for the details. You can also try typing "install-console" kafka-main1009.eqiad.wmnet to get a root shell, but depending on the failure this may not work.
Cookbook cookbooks.sre.hosts.reimage was started by jclark@cumin1002 for host kafka-main1009.eqiad.wmnet with OS bullseye
Cookbook cookbooks.sre.hosts.reimage started by arnaudb@cumin1002 for host db1163.eqiad.wmnet with OS bookworm completed:
- db1163 (WARN)
- Downtimed on Icinga/Alertmanager
- Disabled Puppet
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced PXE for next reboot
- Host rebooted via IPMI
- Host up (Debian installer)
- Add puppet_version metadata to Debian installer
- Checked BIOS boot parameters are back to normal
- Host up (new fresh bookworm OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga/Alertmanager
- Removed previous downtime on Alertmanager (old OS)
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202405291447_arnaudb_3519736_db1163.out
- configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is not optimal, downtime not removed
- Updated Netbox data from PuppetDB
Cookbook cookbooks.sre.hosts.reimage started by jiji@cumin2002 for host mc2046.codfw.wmnet with OS bookworm completed:
- mc2046 (PASS)
- Downtimed on Icinga/Alertmanager
- Disabled Puppet
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced PXE for next reboot
- Host rebooted via IPMI
- Host up (Debian installer)
- Add puppet_version metadata to Debian installer
- Checked BIOS boot parameters are back to normal
- Host up (new fresh bookworm OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga/Alertmanager
- Removed previous downtime on Alertmanager (old OS)
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202405291449_jiji_1115869_mc2046.out
- configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is optimal
- Icinga downtime removed
- Updated Netbox data from PuppetDB
- Cleared switch DHCP cache and MAC table for the host IP and MAC (EVPN Switch)
Cookbook cookbooks.sre.hosts.reimage started by jiji@cumin1002 for host mc1046.eqiad.wmnet with OS bookworm completed:
- mc1046 (PASS)
- Downtimed on Icinga/Alertmanager
- Disabled Puppet
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced PXE for next reboot
- Host rebooted via IPMI
- Host up (Debian installer)
- Add puppet_version metadata to Debian installer
- Checked BIOS boot parameters are back to normal
- Host up (new fresh bookworm OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga/Alertmanager
- Removed previous downtime on Alertmanager (old OS)
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202405291442_jiji_3517157_mc1046.out
- configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is optimal
- Icinga downtime removed
- Updated Netbox data from PuppetDB
Cookbook cookbooks.sre.hosts.reimage was started by arnaudb@cumin1002 for host db1163.eqiad.wmnet with OS bookworm
Cookbook cookbooks.sre.hosts.reimage was started by jclark@cumin1002 for host kafka-main1010.eqiad.wmnet with OS bullseye
Cookbook cookbooks.sre.hosts.reimage was started by jiji@cumin2002 for host mc2046.codfw.wmnet with OS bookworm
Cookbook cookbooks.sre.hosts.reimage was started by jiji@cumin1002 for host mc1046.eqiad.wmnet with OS bookworm
Cookbook cookbooks.sre.hosts.reimage started by arnaudb@cumin1002 for host db1169.eqiad.wmnet with OS bookworm completed:
- db1169 (WARN)
- Downtimed on Icinga/Alertmanager
- Disabled Puppet
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced PXE for next reboot
- Host rebooted via IPMI
- Host up (Debian installer)
- Add puppet_version metadata to Debian installer
- Checked BIOS boot parameters are back to normal
- Host up (new fresh bookworm OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga/Alertmanager
- Removed previous downtime on Alertmanager (old OS)
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202405291346_arnaudb_3509654_db1169.out
- configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is not optimal, downtime not removed
- Updated Netbox data from PuppetDB
Cookbook cookbooks.sre.hosts.reimage started by jiji@cumin2002 for host mc2047.codfw.wmnet with OS bookworm completed:
- mc2047 (PASS)
- Downtimed on Icinga/Alertmanager
- Disabled Puppet
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced PXE for next reboot
- Host rebooted via IPMI
- Host up (Debian installer)
- Add puppet_version metadata to Debian installer
- Checked BIOS boot parameters are back to normal
- Host up (new fresh bookworm OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga/Alertmanager
- Removed previous downtime on Alertmanager (old OS)
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202405291316_jiji_1024686_mc2047.out
- configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is optimal
- Icinga downtime removed
- Updated Netbox data from PuppetDB
Cookbook cookbooks.sre.hosts.reimage was started by arnaudb@cumin1002 for host db1169.eqiad.wmnet with OS bookworm
Cookbook cookbooks.sre.hosts.reimage started by jiji@cumin1002 for host mc1047.eqiad.wmnet with OS bookworm completed:
- mc1047 (PASS)
- Downtimed on Icinga/Alertmanager
- Disabled Puppet
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced PXE for next reboot
- Host rebooted via IPMI
- Host up (Debian installer)
- Add puppet_version metadata to Debian installer
- Checked BIOS boot parameters are back to normal
- Host up (new fresh bookworm OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga/Alertmanager
- Removed previous downtime on Alertmanager (old OS)
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202405291310_jiji_3499598_mc1047.out
- configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is optimal
- Icinga downtime removed
- Updated Netbox data from PuppetDB
Cookbook cookbooks.sre.hosts.reimage started by arnaudb@cumin1002 for host db1196.eqiad.wmnet with OS bookworm completed:
- db1196 (WARN)
- Downtimed on Icinga/Alertmanager
- Disabled Puppet
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced PXE for next reboot
- Host rebooted via IPMI
- Host up (Debian installer)
- Add puppet_version metadata to Debian installer
- Checked BIOS boot parameters are back to normal
- Host up (new fresh bookworm OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga/Alertmanager
- Removed previous downtime on Alertmanager (old OS)
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202405291304_arnaudb_3499111_db1196.out
- configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is not optimal, downtime not removed
- Updated Netbox data from PuppetDB
- Cleared switch DHCP cache and MAC table for the host IP and MAC (EVPN Switch)
Cookbook cookbooks.sre.hosts.reimage was started by jiji@cumin2002 for host mc2047.codfw.wmnet with OS bookworm
Cookbook cookbooks.sre.hosts.reimage was started by jiji@cumin1002 for host mc1047.eqiad.wmnet with OS bookworm
Cookbook cookbooks.sre.hosts.reimage was started by arnaudb@cumin1002 for host db1196.eqiad.wmnet with OS bookworm
Cookbook cookbooks.sre.hosts.reimage started by jiji@cumin2002 for host mc2048.codfw.wmnet with OS bookworm completed:
- mc2048 (PASS)
- Downtimed on Icinga/Alertmanager
- Disabled Puppet
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced PXE for next reboot
- Host rebooted via IPMI
- Host up (Debian installer)
- Add puppet_version metadata to Debian installer
- Checked BIOS boot parameters are back to normal
- Host up (new fresh bookworm OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga/Alertmanager
- Removed previous downtime on Alertmanager (old OS)
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202405291228_jiji_975311_mc2048.out
- configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is optimal
- Icinga downtime removed
- Updated Netbox data from PuppetDB
Cookbook cookbooks.sre.hosts.reimage started by jiji@cumin1002 for host mc1048.eqiad.wmnet with OS bookworm completed:
- mc1048 (PASS)
- Downtimed on Icinga/Alertmanager
- Disabled Puppet
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced PXE for next reboot
- Host rebooted via IPMI
- Host up (Debian installer)
- Add puppet_version metadata to Debian installer
- Checked BIOS boot parameters are back to normal
- Host up (new fresh bookworm OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga/Alertmanager
- Removed previous downtime on Alertmanager (old OS)
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202405291224_jiji_3492876_mc1048.out
- configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is optimal
- Icinga downtime removed
- Updated Netbox data from PuppetDB
Cookbook cookbooks.sre.hosts.reimage was started by jiji@cumin1002 for host mc1048.eqiad.wmnet with OS bookworm
Cookbook cookbooks.sre.hosts.reimage was started by jiji@cumin2002 for host mc2048.codfw.wmnet with OS bookworm
Cookbook cookbooks.sre.hosts.reimage started by jiji@cumin1002 for host mc1049.eqiad.wmnet with OS bookworm completed:
- mc1049 (PASS)
- Downtimed on Icinga/Alertmanager
- Disabled Puppet
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced PXE for next reboot
- Host rebooted via IPMI
- Host up (Debian installer)
- Add puppet_version metadata to Debian installer
- Checked BIOS boot parameters are back to normal
- Host up (new fresh bookworm OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga/Alertmanager
- Removed previous downtime on Alertmanager (old OS)
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202405291046_jiji_3479252_mc1049.out
- configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is optimal
- Icinga downtime removed
- Updated Netbox data from PuppetDB
Cookbook cookbooks.sre.hosts.reimage was started by jiji@cumin1002 for host mc1049.eqiad.wmnet with OS bookworm
Icinga downtime and Alertmanager silence (ID=2f1b90d9-2cd4-4705-bbf1-70fdacf169cd) set by akosiaris@cumin1002 for 2:00:00 on 1 host(s) and their services with reason: disable puppet and k8s controlplane
kubemaster1002.eqiad.wmnet
Icinga downtime and Alertmanager silence (ID=8fa8366a-d3f2-4a77-8e2b-45de66551026) set by akosiaris@cumin1002 for 2:00:00 on 1 host(s) and their services with reason: disable puppet and k8s controlplane
wikikube-ctrl1002.eqiad.wmnet
Icinga downtime and Alertmanager silence (ID=b65d2df8-871b-4064-b329-026af4d7ec1d) set by akosiaris@cumin1002 for 2:00:00 on 1 host(s) and their services with reason: disable puppet and k8s controlplane
wikikube-ctrl1001.eqiad.wmnet
Tue, May 28
Cookbook cookbooks.sre.hosts.reimage started by jclark@cumin1002 for host kafka-main1009.eqiad.wmnet with OS bullseye executed with errors:
- kafka-main1009 (FAIL)
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced PXE for next reboot
- Host rebooted via IPMI
- Host up (Debian installer)
- Add puppet_version metadata to Debian installer
- Checked BIOS boot parameters are back to normal
- The reimage failed, see the cookbook logs for the details. You can also try typing "install-console" kafka-main1009.eqiad.wmnet to get a root shell, but depending on the failure this may not work.
Cookbook cookbooks.sre.hosts.reimage was started by jclark@cumin1002 for host kafka-main1010.eqiad.wmnet with OS bullseye
Cookbook cookbooks.sre.hosts.reimage was started by jclark@cumin1002 for host kafka-main1009.eqiad.wmnet with OS bullseye
Cookbook cookbooks.sre.hosts.reimage was started by jclark@cumin1002 for host kafka-main1010.eqiad.wmnet with OS bullseye