Bot managed by SRE for automated interaction with Phabricator from monitoring tools.
User Details
- User Since: Aug 12 2016, 1:45 PM (487 w, 4 d)
- Roles: Bot
- Availability: Available
- LDAP User: Unknown
- MediaWiki User: Unknown
Yesterday
Cookbook cookbooks.sre.hosts.reimage started by cmooney@cumin1003 for host es2028.codfw.wmnet with OS trixie executed with errors:
- es2028 (FAIL)
- Host successfully migrated to the new VLAN
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced UEFI HTTP Boot for next reboot
- Host rebooted via Redfish
- The reimage failed, see the cookbook logs for the details. You can also try typing "sudo install-console es2028.codfw.wmnet" to get a root shell, but depending on the failure this may not work.
Cookbook cookbooks.sre.hosts.reimage was started by cmooney@cumin1003 for host es2028.codfw.wmnet with OS trixie
Icinga downtime and Alertmanager silence (ID=6fbc002b-f58a-4a1e-8c1b-328bb8dc684d) set by jelto@cumin1003 for 0:30:00 on 1 host(s) and their services with reason: Phabricator deploy
phab2002.codfw.wmnet
Icinga downtime and Alertmanager silence (ID=767d0770-34be-4080-b717-6fcd10af3517) set by jelto@cumin1003 for 0:30:00 on 1 host(s) and their services with reason: Phabricator deploy
phab1004.eqiad.wmnet
Cookbook cookbooks.sre.hosts.reimage started by marostegui@cumin1003 for host es2028.codfw.wmnet with OS trixie executed with errors:
- es2028 (FAIL)
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced UEFI HTTP Boot for next reboot
- Host rebooted via Redfish
- The reimage failed, see the cookbook logs for the details. You can also try typing "sudo install-console es2028.codfw.wmnet" to get a root shell, but depending on the failure this may not work.
cookbooks.sre.hosts.decommission executed by jmm@cumin2002 for hosts: puppetmaster1003.eqiad.wmnet
- puppetmaster1003.eqiad.wmnet (PASS)
- Downtimed host on Icinga/Alertmanager
- Found physical host
- Downtimed management interface on Alertmanager
- Wiped all swraid, partition-table and filesystem signatures
- Powered off
- [Netbox] Set status to Decommissioning, deleted all non-mgmt IPs, updated switch interfaces (disabled, removed vlans, etc)
- Configured the linked switch interface(s)
- Removed from DebMonitor
- Removed from Puppet master and PuppetDB
Cookbook cookbooks.sre.hosts.reimage was started by marostegui@cumin1003 for host es2028.codfw.wmnet with OS trixie
Cookbook cookbooks.sre.hosts.reimage started by marostegui@cumin1003 for host es2028.codfw.wmnet with OS trixie executed with errors:
- es2028 (FAIL)
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced PXE for next reboot
- Host rebooted via IPMI
- The reimage failed, see the cookbook logs for the details. You can also try typing "sudo install-console es2028.codfw.wmnet" to get a root shell, but depending on the failure this may not work.
Cookbook cookbooks.sre.hosts.reimage was started by marostegui@cumin1003 for host es2028.codfw.wmnet with OS trixie
Cookbook cookbooks.sre.hosts.reimage started by marostegui@cumin1003 for host es2028.codfw.wmnet with OS trixie executed with errors:
- es2028 (FAIL)
- Downtimed on Icinga/Alertmanager
- Disabled Puppet
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced PXE for next reboot
- Host rebooted via IPMI
- The reimage failed, see the cookbook logs for the details. You can also try typing "sudo install-console es2028.codfw.wmnet" to get a root shell, but depending on the failure this may not work.
Cookbook cookbooks.sre.hosts.reimage was started by marostegui@cumin1003 for host es2028.codfw.wmnet with OS trixie
Host gitlab1003.wikimedia.org rebooted by jelto@cumin1003 with reason: maintenance reboot for new subnets
Host gitlab2002.wikimedia.org rebooted by jelto@cumin1003 with reason: maintenance reboot for new subnets
cookbooks.sre.hosts.decommission executed by jmm@cumin2002 for hosts: puppetmaster2002.codfw.wmnet
- puppetmaster2002.codfw.wmnet (PASS)
- Downtimed host on Icinga/Alertmanager
- Found physical host
- Downtimed management interface on Alertmanager
- Wiped all swraid, partition-table and filesystem signatures
- Powered off
- [Netbox] Set status to Decommissioning, deleted all non-mgmt IPs, updated switch interfaces (disabled, removed vlans, etc)
- Configured the linked switch interface(s)
- Removed from DebMonitor
- Removed from Puppet master and PuppetDB
Mon, Dec 15
Host gitlab2002.wikimedia.org rebooted by jelto@cumin1003 with reason: maintenance reboot for new subnets
Fri, Dec 12
Cookbook cookbooks.sre.hosts.reimage started by jhancock@cumin1003 for host dse-k8s-worker2005.codfw.wmnet with OS bookworm completed:
- dse-k8s-worker2005 (PASS)
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced UEFI HTTP Boot for next reboot
- Host rebooted via Redfish
- Host up (Debian installer)
- Add puppet_version metadata (7) to Debian installer
- Host up (new fresh bookworm OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga/Alertmanager
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202512121730_jhancock_4105277_dse-k8s-worker2005.out
- configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is optimal
- Icinga downtime removed
- Updated Netbox data from PuppetDB
- Updated Netbox status planned -> active
- The sre.puppet.sync-netbox-hiera cookbook was run successfully
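The reimage summaries in this feed follow a regular shape: a header line naming the cookbook, user, cumin host, target host and OS, then a status bullet (PASS, WARN or FAIL) and the list of completed steps. A minimal sketch of how such an entry could be parsed is shown below. The regular expressions are inferred only from the entries visible here, and the names `HEADER_RE`, `STATUS_RE` and `parse_summary` are hypothetical helpers, not part of Spicerack or the cookbooks.

```python
import re

# Assumed header shape, inferred from the summary entries in this feed.
HEADER_RE = re.compile(
    r"^Cookbook (?P<cookbook>\S+) started by (?P<user>[^@\s]+)@(?P<cumin>\S+) "
    r"for host (?P<host>\S+) with OS (?P<os>\S+) "
    r"(?P<outcome>completed|executed with errors):$"
)

# Assumed status bullet shape, e.g. "- es2028 (FAIL)".
STATUS_RE = re.compile(r"^- (?P<short>\S+) \((?P<status>PASS|WARN|FAIL)\)$")


def parse_summary(lines):
    """Parse one summary entry; return a dict, or None if the header does not match."""
    if not lines:
        return None
    header = HEADER_RE.match(lines[0].strip())
    if header is None:
        return None
    entry = header.groupdict()
    entry["status"] = None
    entry["steps"] = []
    for raw in lines[1:]:
        line = raw.strip()
        status = STATUS_RE.match(line)
        if status and entry["status"] is None:
            # The first matching bullet carries the overall result for the host.
            entry["status"] = status.group("status")
        elif line.startswith("- "):
            # Every other bullet is a step the cookbook reported.
            entry["steps"].append(line[2:])
    return entry


if __name__ == "__main__":
    sample = [
        "Cookbook cookbooks.sre.hosts.reimage started by jhancock@cumin1003 "
        "for host dse-k8s-worker2005.codfw.wmnet with OS bookworm completed:",
        "- dse-k8s-worker2005 (PASS)",
        "- Host rebooted via Redfish",
        "- Icinga status is optimal",
    ]
    print(parse_summary(sample))
```

Grouping parsed entries by `host` and `status` would make repeated failures, such as the several FAIL runs for es2028, easier to spot than scanning the feed by eye.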
Cookbook cookbooks.sre.hosts.reimage was started by jhancock@cumin1003 for host dse-k8s-worker2005.codfw.wmnet with OS bookworm
Cookbook cookbooks.sre.hosts.reimage started by jclark@cumin1003 for host ganeti-jumbo1002.eqiad.wmnet with OS trixie completed:
- ganeti-jumbo1002 (WARN)
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced UEFI HTTP Boot for next reboot
- Host rebooted via Redfish
- Host up (Debian installer)
- Add puppet_version metadata (7) to Debian installer
- Host up (new fresh trixie OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga/Alertmanager
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202512121404_jclark_4010144_ganeti-jumbo1002.out
- configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is optimal
- Icinga downtime removed
- Updated Netbox data from PuppetDB
- Updated Netbox status planned -> active
- Failed to run the sre.puppet.sync-netbox-hiera cookbook, run it manually
Cookbook cookbooks.sre.hosts.reimage started by jclark@cumin1003 for host ganeti-jumbo1003.eqiad.wmnet with OS trixie completed:
- ganeti-jumbo1003 (PASS)
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced UEFI HTTP Boot for next reboot
- Host rebooted via Redfish
- Host up (Debian installer)
- Add puppet_version metadata (7) to Debian installer
- Host up (new fresh trixie OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga/Alertmanager
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202512121418_jclark_4024268_ganeti-jumbo1003.out
- configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is optimal
- Icinga downtime removed
- Updated Netbox data from PuppetDB
- Updated Netbox status planned -> active
- The sre.puppet.sync-netbox-hiera cookbook was run successfully
Cookbook cookbooks.sre.hosts.reimage was started by jclark@cumin1003 for host ganeti-jumbo1003.eqiad.wmnet with OS trixie
Cookbook cookbooks.sre.hosts.reimage was started by jclark@cumin1003 for host ganeti-jumbo1002.eqiad.wmnet with OS trixie
Cookbook cookbooks.sre.hosts.reimage started by gehel@cumin1003 for host wdqs1029.eqiad.wmnet with OS trixie executed with errors:
- wdqs1029 (FAIL)
- The reimage failed, see the cookbook logs for the details. You can also try typing "sudo install-console wdqs1029.eqiad.wmnet" to get a root shell, but depending on the failure this may not work.
Cookbook cookbooks.sre.hosts.reimage was started by gehel@cumin1003 for host wdqs1029.eqiad.wmnet with OS trixie
Cookbook cookbooks.sre.hosts.reimage started by gehel@cumin1003 for host wdqs1032.eqiad.wmnet with OS trixie executed with errors:
- wdqs1032 (FAIL)
- Downtimed on Icinga/Alertmanager
- Disabled Puppet
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced UEFI HTTP Boot for next reboot
- Host rebooted via Redfish
- Host up (Debian installer)
- Add puppet_version metadata (7) to Debian installer
- Host up (new fresh trixie OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga/Alertmanager
- Removed previous downtime on Alertmanager (old OS)
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202512120914_gehel_3706738_wdqs1032.out
- configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
- The reimage failed, see the cookbook logs for the details. You can also try typing "sudo install-console wdqs1032.eqiad.wmnet" to get a root shell, but depending on the failure this may not work.
Cookbook cookbooks.sre.hosts.reimage was started by gehel@cumin1003 for host wdqs1032.eqiad.wmnet with OS trixie
Thu, Dec 11
Cookbook cookbooks.sre.hosts.reimage started by jhancock@cumin1003 for host dse-k8s-worker2005.codfw.wmnet with OS bookworm executed with errors:
- dse-k8s-worker2005 (FAIL)
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced UEFI HTTP Boot for next reboot
- Host rebooted via Redfish
- The reimage failed, see the cookbook logs for the details. You can also try typing "sudo install-console dse-k8s-worker2005.codfw.wmnet" to get a root shell, but depending on the failure this may not work.
Cookbook cookbooks.sre.hosts.reimage started by jhancock@cumin1003 for host dse-k8s-worker2004.codfw.wmnet with OS bookworm completed:
- dse-k8s-worker2004 (PASS)
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced UEFI HTTP Boot for next reboot
- Host rebooted via Redfish
- Host up (Debian installer)
- Add puppet_version metadata (7) to Debian installer
- Host up (new fresh bookworm OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga/Alertmanager
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202512112219_jhancock_3621874_dse-k8s-worker2004.out
- configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is optimal
- Icinga downtime removed
- Updated Netbox data from PuppetDB
- Updated Netbox status planned -> active
- The sre.puppet.sync-netbox-hiera cookbook was run successfully
Cookbook cookbooks.sre.hosts.reimage was started by jhancock@cumin1003 for host dse-k8s-worker2005.codfw.wmnet with OS bookworm
Cookbook cookbooks.sre.hosts.reimage was started by jhancock@cumin1003 for host dse-k8s-worker2004.codfw.wmnet with OS bookworm
Cookbook cookbooks.sre.hosts.reimage started by gehel@cumin1003 for host wdqs1031.eqiad.wmnet with OS trixie executed with errors:
- wdqs1031 (FAIL)
- Downtimed on Icinga/Alertmanager
- Disabled Puppet
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced UEFI HTTP Boot for next reboot
- Host rebooted via Redfish
- Host up (Debian installer)
- Add puppet_version metadata (7) to Debian installer
- Host up (new fresh trixie OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga/Alertmanager
- Removed previous downtime on Alertmanager (old OS)
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202512112034_gehel_3605396_wdqs1031.out
- configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
- The reimage failed, see the cookbook logs for the details. You can also try typing "sudo install-console wdqs1031.eqiad.wmnet" to get a root shell, but depending on the failure this may not work.
Cookbook cookbooks.sre.hosts.reimage started by jclark@cumin1003 for host ganeti-jumbo1002.eqiad.wmnet with OS trixie executed with errors:
- ganeti-jumbo1002 (FAIL)
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced UEFI HTTP Boot for next reboot
- Host rebooted via Redfish
- The reimage failed, see the cookbook logs for the details. You can also try typing "sudo install-console ganeti-jumbo1002.eqiad.wmnet" to get a root shell, but depending on the failure this may not work.
Cookbook cookbooks.sre.hosts.reimage started by jclark@cumin1003 for host ganeti-jumbo1001.eqiad.wmnet with OS trixie completed:
- ganeti-jumbo1001 (PASS)
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced UEFI HTTP Boot for next reboot
- Host rebooted via Redfish
- Host up (Debian installer)
- Add puppet_version metadata (7) to Debian installer
- Host up (new fresh trixie OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga/Alertmanager
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202512112009_jclark_3604593_ganeti-jumbo1001.out
- configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is optimal
- Icinga downtime removed
- Updated Netbox data from PuppetDB
- Updated Netbox status planned -> active
- The sre.puppet.sync-netbox-hiera cookbook was run successfully
Cookbook cookbooks.sre.hosts.reimage was started by gehel@cumin1003 for host wdqs1031.eqiad.wmnet with OS trixie
Cookbook cookbooks.sre.hosts.reimage was started by jclark@cumin1003 for host ganeti-jumbo1002.eqiad.wmnet with OS trixie
Cookbook cookbooks.sre.hosts.reimage was started by jclark@cumin1003 for host ganeti-jumbo1001.eqiad.wmnet with OS trixie
Cookbook cookbooks.sre.hosts.reimage started by gehel@cumin1003 for host wdqs1030.eqiad.wmnet with OS trixie executed with errors:
- wdqs1030 (FAIL)
- Downtimed on Icinga/Alertmanager
- Disabled Puppet
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced UEFI HTTP Boot for next reboot
- Host rebooted via Redfish
- Host up (Debian installer)
- Add puppet_version metadata (7) to Debian installer
- Host up (new fresh trixie OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga/Alertmanager
- Removed previous downtime on Alertmanager (old OS)
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202512111547_gehel_3532056_wdqs1030.out
- configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
- The reimage failed, see the cookbook logs for the details. You can also try typing "sudo install-console wdqs1030.eqiad.wmnet" to get a root shell, but depending on the failure this may not work.
Cookbook cookbooks.sre.hosts.reimage started by gehel@cumin1003 for host wdqs1029.eqiad.wmnet with OS trixie executed with errors:
- wdqs1029 (FAIL)
- Downtimed on Icinga/Alertmanager
- Disabled Puppet
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced UEFI HTTP Boot for next reboot
- Host rebooted via Redfish
- Host up (Debian installer)
- Add puppet_version metadata (7) to Debian installer
- Host up (new fresh trixie OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga/Alertmanager
- Removed previous downtime on Alertmanager (old OS)
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202512111543_gehel_3531760_wdqs1029.out
- configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
- The reimage failed, see the cookbook logs for the details. You can also try typing "sudo install-console wdqs1029.eqiad.wmnet" to get a root shell, but depending on the failure this may not work.
Cookbook cookbooks.sre.hosts.reimage was started by gehel@cumin1003 for host wdqs1030.eqiad.wmnet with OS trixie
Cookbook cookbooks.sre.hosts.reimage was started by gehel@cumin1003 for host wdqs1029.eqiad.wmnet with OS trixie
Cookbook cookbooks.sre.hosts.reimage started by gehel@cumin1003 for host wdqs1028.eqiad.wmnet with OS trixie completed:
- wdqs1028 (PASS)
- Downtimed on Icinga/Alertmanager
- Disabled Puppet
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced UEFI HTTP Boot for next reboot
- Host rebooted via Redfish
- Host up (Debian installer)
- Add puppet_version metadata (7) to Debian installer
- Host up (new fresh trixie OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga/Alertmanager
- Removed previous downtime on Alertmanager (old OS)
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202512111454_gehel_3523474_wdqs1028.out
- configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is optimal
- Icinga downtime removed
- Updated Netbox data from PuppetDB
Cookbook cookbooks.sre.hosts.reimage was started by gehel@cumin1003 for host wdqs1028.eqiad.wmnet with OS trixie
Cookbook cookbooks.sre.hosts.reimage started by jclark@cumin1003 for host logging-sd1007.eqiad.wmnet with OS bookworm completed:
- logging-sd1007 (PASS)
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced UEFI HTTP Boot for next reboot
- Host rebooted via Redfish
- Host up (Debian installer)
- Add puppet_version metadata (7) to Debian installer
- Host up (new fresh bookworm OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga/Alertmanager
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202512111350_jclark_3507488_logging-sd1007.out
- configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is optimal
- Icinga downtime removed
- Updated Netbox data from PuppetDB
- Updated Netbox status planned -> active
- The sre.puppet.sync-netbox-hiera cookbook was run successfully
Cookbook cookbooks.sre.hosts.reimage started by jclark@cumin1003 for host logging-sd1006.eqiad.wmnet with OS bookworm completed:
- logging-sd1006 (PASS)
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced UEFI HTTP Boot for next reboot
- Host rebooted via Redfish
- Host up (Debian installer)
- Add puppet_version metadata (7) to Debian installer
- Host up (new fresh bookworm OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga/Alertmanager
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202512111347_jclark_3507482_logging-sd1006.out
- configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is optimal
- Icinga downtime removed
- Updated Netbox data from PuppetDB
- Updated Netbox status planned -> active
- The sre.puppet.sync-netbox-hiera cookbook was run successfully
Cookbook cookbooks.sre.hosts.reimage started by jclark@cumin1003 for host logging-sd1005.eqiad.wmnet with OS bookworm completed:
- logging-sd1005 (PASS)
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced UEFI HTTP Boot for next reboot
- Host rebooted via Redfish
- Host up (Debian installer)
- Add puppet_version metadata (7) to Debian installer
- Host up (new fresh bookworm OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga/Alertmanager
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202512111343_jclark_3507951_logging-sd1005.out
- configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is optimal
- Icinga downtime removed
- Updated Netbox data from PuppetDB
- Updated Netbox status planned -> active
- The sre.puppet.sync-netbox-hiera cookbook was run successfully
Cookbook cookbooks.sre.hosts.reimage was started by jclark@cumin1003 for host logging-sd1005.eqiad.wmnet with OS bookworm
Cookbook cookbooks.sre.hosts.reimage started by jclark@cumin1003 for host aqs1025.eqiad.wmnet with OS bullseye completed:
- aqs1025 (PASS)
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced UEFI HTTP Boot for next reboot
- Host rebooted via Redfish
- Host up (Debian installer)
- Add puppet_version metadata (7) to Debian installer
- Host up (new fresh bullseye OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga/Alertmanager
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202512111244_jclark_3502152_aqs1025.out
- configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is optimal
- Icinga downtime removed
- Updated Netbox data from PuppetDB
- Updated Netbox status planned -> active
- The sre.puppet.sync-netbox-hiera cookbook was run successfully
Cookbook cookbooks.sre.hosts.reimage was started by jclark@cumin1003 for host logging-sd1006.eqiad.wmnet with OS bookworm
Cookbook cookbooks.sre.hosts.reimage was started by jclark@cumin1003 for host logging-sd1007.eqiad.wmnet with OS bookworm
Cookbook cookbooks.sre.hosts.reimage was started by jclark@cumin1003 for host aqs1025.eqiad.wmnet with OS bullseye
Wed, Dec 10
Cookbook cookbooks.sre.hosts.reimage started by jhancock@cumin1003 for host ganeti-jumbo2001.codfw.wmnet with OS trixie completed:
- ganeti-jumbo2001 (PASS)
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced UEFI HTTP Boot for next reboot
- Host rebooted via Redfish
- Host up (Debian installer)
- Add puppet_version metadata (7) to Debian installer
- Host up (new fresh trixie OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga/Alertmanager
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202512102202_jhancock_3400817_ganeti-jumbo2001.out
- configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is optimal
- Icinga downtime removed
- Updated Netbox data from PuppetDB
- Updated Netbox status planned -> active
- The sre.puppet.sync-netbox-hiera cookbook was run successfully
Cookbook cookbooks.sre.hosts.reimage started by jhancock@cumin1003 for host ganeti-jumbo2003.codfw.wmnet with OS trixie completed:
- ganeti-jumbo2003 (PASS)
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced UEFI HTTP Boot for next reboot
- Host rebooted via Redfish
- Host up (Debian installer)
- Add puppet_version metadata (7) to Debian installer
- Host up (new fresh trixie OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga/Alertmanager
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202512102158_jhancock_3400851_ganeti-jumbo2003.out
- configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is optimal
- Icinga downtime removed
- Updated Netbox data from PuppetDB
- Updated Netbox status planned -> active
- The sre.puppet.sync-netbox-hiera cookbook was run successfully
Cookbook cookbooks.sre.hosts.reimage started by jhancock@cumin1003 for host ganeti-jumbo2002.codfw.wmnet with OS trixie completed:
- ganeti-jumbo2002 (PASS)
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced UEFI HTTP Boot for next reboot
- Host rebooted via Redfish
- Host up (Debian installer)
- Add puppet_version metadata (7) to Debian installer
- Host up (new fresh trixie OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga/Alertmanager
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202512102154_jhancock_3400827_ganeti-jumbo2002.out
- configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is optimal
- Icinga downtime removed
- Updated Netbox data from PuppetDB
- Updated Netbox status planned -> active
- The sre.puppet.sync-netbox-hiera cookbook was run successfully
Cookbook cookbooks.sre.hosts.reimage was started by jhancock@cumin1003 for host ganeti-jumbo2003.codfw.wmnet with OS trixie
Cookbook cookbooks.sre.hosts.reimage was started by jhancock@cumin1003 for host ganeti-jumbo2002.codfw.wmnet with OS trixie
Cookbook cookbooks.sre.hosts.reimage was started by jhancock@cumin1003 for host ganeti-jumbo2001.codfw.wmnet with OS trixie
Cookbook cookbooks.sre.hosts.reimage started by jclark@cumin1003 for host aqs1026.eqiad.wmnet with OS bullseye completed:
- aqs1026 (PASS)
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced UEFI HTTP Boot for next reboot
- Host rebooted via Redfish
- Host up (Debian installer)
- Add puppet_version metadata (7) to Debian installer
- Host up (new fresh bullseye OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga/Alertmanager
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202512102044_jclark_3385285_aqs1026.out
- configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is optimal
- Icinga downtime removed
- Updated Netbox data from PuppetDB
- Updated Netbox status planned -> active
- The sre.puppet.sync-netbox-hiera cookbook was run successfully
Cookbook cookbooks.sre.hosts.reimage started by jclark@cumin1003 for host aqs1024.eqiad.wmnet with OS bullseye completed:
- aqs1024 (PASS)
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced UEFI HTTP Boot for next reboot
- Host rebooted via Redfish
- Host up (Debian installer)
- Add puppet_version metadata (7) to Debian installer
- Host up (new fresh bullseye OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga/Alertmanager
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202512102037_jclark_3385277_aqs1024.out
- configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is optimal
- Icinga downtime removed
- Updated Netbox data from PuppetDB
- Updated Netbox status planned -> active
- The sre.puppet.sync-netbox-hiera cookbook was run successfully
Cookbook cookbooks.sre.hosts.reimage started by jclark@cumin1003 for host aqs1027.eqiad.wmnet with OS bullseye completed:
- aqs1027 (WARN)
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced UEFI HTTP Boot for next reboot
- Host rebooted via Redfish
- Host up (Debian installer)
- Add puppet_version metadata (7) to Debian installer
- Host up (new fresh bullseye OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga/Alertmanager
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202512102041_jclark_3385290_aqs1027.out
- configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is optimal
- Icinga downtime removed
- Updated Netbox data from PuppetDB
- Updated Netbox status planned -> active
- Failed to run the sre.puppet.sync-netbox-hiera cookbook, run it manually
Cookbook cookbooks.sre.hosts.reimage started by jclark@cumin1003 for host aqs1023.eqiad.wmnet with OS bullseye completed:
- aqs1023 (PASS)
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced UEFI HTTP Boot for next reboot
- Host rebooted via Redfish
- Host up (Debian installer)
- Add puppet_version metadata (7) to Debian installer
- Host up (new fresh bullseye OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga/Alertmanager
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202512102033_jclark_3383037_aqs1023.out
- configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is optimal
- Icinga downtime removed
- Updated Netbox data from PuppetDB
- Updated Netbox status planned -> active
- The sre.puppet.sync-netbox-hiera cookbook was run successfully
Cookbook cookbooks.sre.hosts.reimage was started by jclark@cumin1003 for host aqs1027.eqiad.wmnet with OS bullseye
Cookbook cookbooks.sre.hosts.reimage was started by jclark@cumin1003 for host aqs1026.eqiad.wmnet with OS bullseye
Cookbook cookbooks.sre.hosts.reimage was started by jclark@cumin1003 for host aqs1024.eqiad.wmnet with OS bullseye
Cookbook cookbooks.sre.hosts.reimage was started by jclark@cumin1003 for host aqs1023.eqiad.wmnet with OS bullseye
Cookbook cookbooks.sre.hosts.reimage started by brett@cumin2002 for host lvs1018.eqiad.wmnet with OS bullseye completed:
- lvs1018 (PASS)
- Downtimed on Icinga/Alertmanager
- Disabled Puppet
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced PXE for next reboot
- Host rebooted via IPMI
- Host up (Debian installer)
- Add puppet_version metadata (7) to Debian installer
- Checked BIOS boot parameters are back to normal
- Host up (new fresh bullseye OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga/Alertmanager
- Removed previous downtime on Alertmanager (old OS)
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202512101948_brett_2964783_lvs1018.out
- configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is optimal
- Icinga downtime removed
- Updated Netbox data from PuppetDB
Cookbook cookbooks.sre.hosts.reimage was started by brett@cumin2002 for host lvs1018.eqiad.wmnet with OS bullseye
Cookbook cookbooks.sre.hosts.reimage started by jhancock@cumin1003 for host logging-sd2005.codfw.wmnet with OS bookworm completed:
- logging-sd2005 (WARN)
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced UEFI HTTP Boot for next reboot
- Host rebooted via Redfish
- Host up (Debian installer)
- Add puppet_version metadata (7) to Debian installer
- Host up (new fresh bookworm OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga/Alertmanager
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202512101748_jhancock_3318276_logging-sd2005.out
- configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is optimal
- Icinga downtime removed
- Updated Netbox data from PuppetDB
- Updated Netbox status planned -> active
- Failed to run the sre.puppet.sync-netbox-hiera cookbook, run it manually
Cookbook cookbooks.sre.hosts.reimage started by jhancock@cumin1003 for host logging-sd2007.codfw.wmnet with OS bookworm completed:
- logging-sd2007 (PASS)
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced UEFI HTTP Boot for next reboot
- Host rebooted via Redfish
- Host up (Debian installer)
- Add puppet_version metadata (7) to Debian installer
- Host up (new fresh bookworm OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga/Alertmanager
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202512101717_jhancock_3318763_logging-sd2007.out
- configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is optimal
- Icinga downtime removed
- Updated Netbox data from PuppetDB
- Updated Netbox status planned -> active
- The sre.puppet.sync-netbox-hiera cookbook was run successfully
Cookbook cookbooks.sre.hosts.reimage started by jhancock@cumin1003 for host logging-sd2006.codfw.wmnet with OS bookworm completed:
- logging-sd2006 (PASS)
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced UEFI HTTP Boot for next reboot
- Host rebooted via Redfish
- Host up (Debian installer)
- Add puppet_version metadata (7) to Debian installer
- Host up (new fresh bookworm OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga/Alertmanager
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202512101659_jhancock_3295799_logging-sd2006.out
- configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is optimal
- Icinga downtime removed
- Updated Netbox data from PuppetDB
- Updated Netbox status planned -> active
- The sre.puppet.sync-netbox-hiera cookbook was run successfully
Cookbook cookbooks.sre.hosts.reimage was started by jhancock@cumin1003 for host logging-sd2007.codfw.wmnet with OS bookworm
Cookbook cookbooks.sre.hosts.reimage was started by jhancock@cumin1003 for host logging-sd2005.codfw.wmnet with OS bookworm
Cookbook cookbooks.sre.hosts.reimage was started by jhancock@cumin1003 for host logging-sd2006.codfw.wmnet with OS bookworm
Tue, Dec 9
Cookbook cookbooks.sre.hosts.reimage started by cmooney@cumin1003 for host sretest2009.codfw.wmnet with OS trixie completed:
- sretest2009 (WARN)
- Downtimed on Icinga/Alertmanager
- Disabled Puppet
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced UEFI HTTP Boot for next reboot
- Host rebooted via Redfish
- Host up (Debian installer)
- Add puppet_version metadata (7) to Debian installer
- Host up (new fresh trixie OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga/Alertmanager
- Removed previous downtime on Alertmanager (old OS)
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202512091954_cmooney_2929373_sretest2009.out
- Unable to run puppet on config-master2001.codfw.wmnet,config-master1001.eqiad.wmnet to update configmaster.wikimedia.org with the new host SSH public key for wmf-update-known-hosts-production
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is optimal
- Icinga downtime removed
- Updated Netbox data from PuppetDB
Cookbook cookbooks.sre.hosts.reimage was started by cmooney@cumin1003 for host sretest2009.codfw.wmnet with OS trixie
Icinga downtime and Alertmanager silence (ID=1aee9e7e-d36b-4c56-8cac-746f48098c6f) set by cmooney@cumin1003 for 2:00:00 on 2 host(s) and their services with reason: upgrading sr-linux on Nokia switches codfw
ssw1-e1-codfw.mgmt,ssw1-f1-codfw.mgmt
Icinga downtime and Alertmanager silence (ID=2a98251c-6798-469c-a3de-57fcfb13969f) set by cmooney@cumin1003 for 2:00:00 on 17 host(s) and their services with reason: upgrading sr-linux on Nokia switches codfw
lsw1-e[2,4-5]-codfw,lsw1-e[2,4-5]-codfw IPv6,lsw1-e[2,4-5]-codfw.mgmt,lsw1-f[2,4]-codfw,lsw1-f[2,4]-codfw IPv6,lsw1-f[2,4]-codfw.mgmt,ssw1-e1-codfw,ssw1-f1-codfw
Completed pool of db1229 gradually with 4 steps - Pooling in after cloning - fceratto@cumin1003
Start pool of db1229 gradually with 4 steps - Pooling in after cloning - fceratto@cumin1003
Fri, Dec 5
Finished cloning db1233.eqiad.wmnet to db1229.eqiad.wmnet - fceratto@cumin1003
Completed pool of db1233 gradually with 4 steps - Pool db1233.eqiad.wmnet in after cloning - fceratto@cumin1003
Start pool of db1233 gradually with 4 steps - Pool db1233.eqiad.wmnet in after cloning - fceratto@cumin1003
Completed depool of db1233 - Depool db1233.eqiad.wmnet to then clone it to db1229.eqiad.wmnet - fceratto@cumin1003
Started cloning db1233.eqiad.wmnet to db1229.eqiad.wmnet - fceratto@cumin1003
Thu, Dec 4
Cookbook cookbooks.sre.hosts.reimage started by brett@cumin2002 for host lvs1019.eqiad.wmnet with OS bullseye completed:
- lvs1019 (PASS)
- Downtimed on Icinga/Alertmanager
- Disabled Puppet
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced PXE for next reboot
- Host rebooted via IPMI
- Host up (Debian installer)
- Add puppet_version metadata (7) to Debian installer
- Checked BIOS boot parameters are back to normal
- Host up (new fresh bullseye OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga/Alertmanager
- Removed previous downtime on Alertmanager (old OS)
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202512041748_brett_2891956_lvs1019.out
- configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is optimal
- Icinga downtime removed
- Updated Netbox data from PuppetDB
Cookbook cookbooks.sre.hosts.reimage was started by brett@cumin2002 for host lvs1019.eqiad.wmnet with OS bullseye
Cookbook cookbooks.sre.k8s.pool-depool-node started by bking@cumin2002 pool for host dse-k8s-worker2003.codfw.wmnet completed:
- dse-k8s-worker2003.codfw.wmnet (PASS)
- Host dse-k8s-worker2003.codfw.wmnet pooled in dse-codfw
Cookbook cookbooks.sre.k8s.pool-depool-node started by bking@cumin2002 depool for host dse-k8s-worker2003.codfw.wmnet completed:
- dse-k8s-worker2003.codfw.wmnet (PASS)
- Host dse-k8s-worker2003.codfw.wmnet depooled from dse-codfw
Wed, Dec 3
Cookbook cookbooks.sre.hosts.reimage started by brett@cumin2002 for host lvs1020.eqiad.wmnet with OS bullseye completed:
- lvs1020 (WARN)
- Downtimed on Icinga/Alertmanager
- Disabled Puppet
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced PXE for next reboot
- Host rebooted via IPMI
- Host up (Debian installer)
- Add puppet_version metadata (7) to Debian installer
- Checked BIOS boot parameters are back to normal
- Host up (new fresh bullseye OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga/Alertmanager
- Removed previous downtime on Alertmanager (old OS)
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202512031928_brett_2167269_lvs1020.out
- configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is not optimal, downtime not removed
- Updated Netbox data from PuppetDB
Cookbook cookbooks.sre.hosts.reimage was started by brett@cumin2002 for host lvs1020.eqiad.wmnet with OS bullseye
Icinga downtime and Alertmanager silence (ID=75dbea13-fb1a-488c-9eb3-b67933f2ebaf) set by jynus@cumin1003 for 2 days, 0:00:00 on 1 host(s) and their services with reason: crashed
db1229.eqiad.wmnet
Completed pool of db1169 gradually with 4 steps - Repooling db1169 - marostegui@cumin1003
Start pool of db1169 gradually with 4 steps - Repooling db1169 - marostegui@cumin1003
Completed depool of db1169 - Depooling db1169 - marostegui@cumin1003
Cookbook cookbooks.sre.hosts.reimage started by marostegui@cumin1003 for host db1169.eqiad.wmnet with OS trixie completed:
- db1169 (WARN)
- Downtimed on Icinga/Alertmanager
- Disabled Puppet
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced UEFI HTTP Boot for next reboot
- Host rebooted via Redfish
- Host up (Debian installer)
- Add puppet_version metadata (7) to Debian installer
- Host up (new fresh trixie OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga/Alertmanager
- Removed previous downtime on Alertmanager (old OS)
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202512030605_marostegui_501408_db1169.out
- configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is not optimal, downtime not removed
- Updated Netbox data from PuppetDB
Depooled pc1011.eqiad.wmnet and pc2011.codfw.wmnet Schema change - marostegui@cumin1003 - T411497
Cookbook cookbooks.sre.hosts.reimage was started by marostegui@cumin1003 for host db1169.eqiad.wmnet with OS trixie
Tue, Dec 2
Finished cloning db1251.eqiad.wmnet to db1169.eqiad.wmnet - marostegui@cumin1003
Completed pool of db1251 gradually with 4 steps - Pool db1251.eqiad.wmnet in after cloning - marostegui@cumin1003
Start pool of db1251 gradually with 4 steps - Pool db1251.eqiad.wmnet in after cloning - marostegui@cumin1003