Bot managed by SRE for automated interaction with Phabricator from monitoring tools.
User Details
- User Since: Aug 12 2016, 1:45 PM (495 w, 1 d)
- Roles: Bot
- Availability: Available
- LDAP User: Unknown
- MediaWiki User: Unknown
Fri, Feb 6
Thu, Feb 5
Icinga downtime and Alertmanager silence (ID=785b501b-5e53-43b0-b903-5d93372eb8e1) set by cmooney@cumin1003 for 1 day, 0:00:00 on 2 host(s) and their services with reason: fundraising migration eqiad
fasw2-e15a-eqiad,fasw2-e15b-eqiad
cookbooks.sre.hosts.decommission executed by jmm@cumin2002 for hosts: puppetmaster2001.codfw.wmnet
- puppetmaster2001.codfw.wmnet (PASS)
- Downtimed host on Icinga/Alertmanager
- Found physical host
- Downtimed management interface on Alertmanager
- Wiped all swraid, partition-table and filesystem signatures
- Powered off
- [Netbox] Set status to Decommissioning, deleted all non-mgmt IPs, updated switch interfaces (disabled, removed vlans, etc)
- Configured the linked switch interface(s)
- Removed from DebMonitor
- Removed from Puppet master and PuppetDB
Icinga downtime and Alertmanager silence (ID=5da72ec9-7626-47d2-bc98-a871f93d717e) set by cmooney@cumin1003 for 1 day, 0:00:00 on 3 host(s) and their services with reason: fundraising migration eqiad
fasw2-c1a-eqiad,fasw2-c1b-eqiad,pfw1-eqiad
Completed pool of db2204 gradually with 4 steps - After schema change - marostegui@cumin1003
Start pool of db2204 gradually with 4 steps - After schema change - marostegui@cumin1003
Completed pool of db2205 gradually with 4 steps - After schema change - marostegui@cumin1003
Start pool of db2205 gradually with 4 steps - After schema change - marostegui@cumin1003
Wed, Feb 4
Cookbook cookbooks.sre.hosts.reimage started by jclark@cumin1003 for host bast1004.wikimedia.org with OS trixie completed:
- bast1004 (PASS)
- Downtimed on Icinga/Alertmanager
- Disabled Puppet
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced UEFI HTTP Boot for next reboot
- Host rebooted via Redfish
- Host up (Debian installer)
- Host up (new fresh trixie OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga/Alertmanager
- Removed previous downtime on Alertmanager (old OS)
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202602041724_jclark_2648072_bast1004.out
- configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is optimal
- Icinga downtime removed
- Updated Netbox data from PuppetDB
Cookbook cookbooks.sre.hosts.reimage was started by jclark@cumin1003 for host bast1004.wikimedia.org with OS trixie
Completed pool of db1236 gradually with 4 steps - After schema change - marostegui@cumin1003
Cookbook cookbooks.sre.hosts.reimage started by jclark@cumin1003 for host bast1004.wikimedia.org with OS trixie executed with errors:
- bast1004 (FAIL)
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced UEFI HTTP Boot for next reboot
- The reimage failed, see the cookbook logs for the details. You can also try typing "sudo install-console bast1004.wikimedia.org" to get a root shell, but depending on the failure this may not work.
Cookbook cookbooks.sre.hosts.reimage was started by jclark@cumin1003 for host bast1004.wikimedia.org with OS trixie
Start pool of db1236 gradually with 4 steps - After schema change - marostegui@cumin1003
Cookbook cookbooks.sre.hosts.reimage started by jclark@cumin1003 for host bast1004.eqiad.wmnet with OS trixie executed with errors:
- bast1004 (FAIL)
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced UEFI HTTP Boot for next reboot
- The reimage failed, see the cookbook logs for the details. You can also try typing "sudo install-console bast1004.eqiad.wmnet" to get a root shell, but depending on the failure this may not work.
Cookbook cookbooks.sre.hosts.reimage was started by jclark@cumin1003 for host bast1004.eqiad.wmnet with OS trixie
Cookbook cookbooks.sre.hosts.reimage started by jclark@cumin1003 for host ms-fe1022.eqiad.wmnet with OS bullseye completed:
- ms-fe1022 (PASS)
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced UEFI HTTP Boot for next reboot
- Host rebooted via Redfish
- Host up (Debian installer)
- Host up (new fresh bullseye OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga/Alertmanager
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202602041436_jclark_2543935_ms-fe1022.out
- configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is optimal
- Icinga downtime removed
- Updated Netbox data from PuppetDB
- Updated Netbox status planned -> active
- The sre.puppet.sync-netbox-hiera cookbook was run successfully
Cookbook cookbooks.sre.hosts.reimage started by jclark@cumin1003 for host ms-fe1023.eqiad.wmnet with OS bullseye completed:
- ms-fe1023 (PASS)
- Host successfully migrated to the new VLAN
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced UEFI HTTP Boot for next reboot
- Host rebooted via Redfish
- Host up (Debian installer)
- Host up (new fresh bullseye OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga/Alertmanager
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202602041434_jclark_2543945_ms-fe1023.out
- configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is optimal
- Icinga downtime removed
- Updated Netbox data from PuppetDB
- Updated Netbox status planned -> active
- The sre.puppet.sync-netbox-hiera cookbook was run successfully
Cookbook cookbooks.sre.hosts.reimage was started by jclark@cumin1003 for host ms-fe1023.eqiad.wmnet with OS bullseye
Cookbook cookbooks.sre.hosts.reimage was started by jclark@cumin1003 for host ms-fe1022.eqiad.wmnet with OS bullseye
Cookbook cookbooks.sre.hosts.reimage started by jclark@cumin1003 for host ms-fe1023.eqiad.wmnet with OS bullseye executed with errors:
- ms-fe1023 (FAIL)
- Host successfully migrated to the new VLAN
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced UEFI HTTP Boot for next reboot
- Host rebooted via Redfish
- Host up (Debian installer)
- The reimage failed, see the cookbook logs for the details. You can also try typing "sudo install-console ms-fe1023.eqiad.wmnet" to get a root shell, but depending on the failure this may not work.
Cookbook cookbooks.sre.hosts.reimage started by jclark@cumin1003 for host ms-fe1022.eqiad.wmnet with OS bullseye executed with errors:
- ms-fe1022 (FAIL)
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced UEFI HTTP Boot for next reboot
- Host rebooted via Redfish
- Host up (Debian installer)
- The reimage failed, see the cookbook logs for the details. You can also try typing "sudo install-console ms-fe1022.eqiad.wmnet" to get a root shell, but depending on the failure this may not work.
Cookbook cookbooks.sre.hosts.reimage was started by jclark@cumin1003 for host ms-fe1022.eqiad.wmnet with OS bullseye
Cookbook cookbooks.sre.hosts.reimage started by jclark@cumin1003 for host ms-fe1024.eqiad.wmnet with OS bullseye completed:
- ms-fe1024 (WARN)
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced UEFI HTTP Boot for next reboot
- Host rebooted via Redfish
- Host up (Debian installer)
- Host up (new fresh bullseye OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga/Alertmanager
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202602041305_jclark_2528735_ms-fe1024.out
- configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is not optimal, downtime not removed
- Updated Netbox data from PuppetDB
- Updated Netbox status planned -> active
- The sre.puppet.sync-netbox-hiera cookbook was run successfully
Cookbook cookbooks.sre.hosts.reimage started by jclark@cumin1003 for host ms-fe1021.eqiad.wmnet with OS bullseye completed:
- ms-fe1021 (WARN)
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced UEFI HTTP Boot for next reboot
- Host rebooted via Redfish
- Host up (Debian installer)
- Host up (new fresh bullseye OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga/Alertmanager
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202602041244_jclark_2527779_ms-fe1021.out
- configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is not optimal, downtime not removed
- Updated Netbox data from PuppetDB
- Updated Netbox status planned -> active
- The sre.puppet.sync-netbox-hiera cookbook was run successfully
Cookbook cookbooks.sre.hosts.reimage was started by jclark@cumin1003 for host ms-fe1023.eqiad.wmnet with OS bullseye
Cookbook cookbooks.sre.hosts.reimage started by jclark@cumin1003 for host ms-fe1023.eqiad.wmnet with OS bullseye executed with errors:
- ms-fe1023 (FAIL)
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced UEFI HTTP Boot for next reboot
- Host rebooted via Redfish
- Host up (Debian installer)
- The reimage failed, see the cookbook logs for the details. You can also try typing "sudo install-console ms-fe1023.eqiad.wmnet" to get a root shell, but depending on the failure this may not work.
Cookbook cookbooks.sre.hosts.reimage was started by jclark@cumin1003 for host ms-fe1024.eqiad.wmnet with OS bullseye
Cookbook cookbooks.sre.hosts.reimage was started by jclark@cumin1003 for host ms-fe1023.eqiad.wmnet with OS bullseye
Cookbook cookbooks.sre.hosts.reimage was started by jclark@cumin1003 for host ms-fe1021.eqiad.wmnet with OS bullseye
Cookbook cookbooks.sre.hosts.reimage started by jynus@cumin1003 for host backup1015.eqiad.wmnet with OS trixie completed:
- backup1015 (WARN)
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced UEFI HTTP Boot for next reboot
- Host rebooted via Redfish
- Host up (Debian installer)
- Host up (new fresh trixie OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga/Alertmanager
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202602041145_jynus_2518658_backup1015.out
- configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is not optimal, downtime not removed
- Updated Netbox data from PuppetDB
- Updated Netbox status planned -> active
- The sre.puppet.sync-netbox-hiera cookbook was run successfully
Cookbook cookbooks.sre.hosts.reimage was started by jynus@cumin1003 for host backup1015.eqiad.wmnet with OS trixie
Cookbook cookbooks.sre.hosts.reimage started by jclark@cumin1003 for host tools-k8s-worker1006.eqiad.wmnet with OS trixie completed:
- tools-k8s-worker1006 (PASS)
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced UEFI HTTP Boot for next reboot
- Host rebooted via Redfish
- Host up (Debian installer)
- Host up (new fresh trixie OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga/Alertmanager
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202602040110_jclark_2437768_tools-k8s-worker1006.out
- configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is optimal
- Icinga downtime removed
- Updated Netbox data from PuppetDB
- Updated Netbox status planned -> active
- The sre.puppet.sync-netbox-hiera cookbook was run successfully
Cookbook cookbooks.sre.hosts.reimage started by jclark@cumin1003 for host tools-k8s-worker1008.eqiad.wmnet with OS trixie completed:
- tools-k8s-worker1008 (PASS)
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced UEFI HTTP Boot for next reboot
- Host rebooted via Redfish
- Host up (Debian installer)
- Host up (new fresh trixie OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga/Alertmanager
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202602040107_jclark_2438571_tools-k8s-worker1008.out
- configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is optimal
- Icinga downtime removed
- Updated Netbox data from PuppetDB
- Updated Netbox status planned -> active
- The sre.puppet.sync-netbox-hiera cookbook was run successfully
Cookbook cookbooks.sre.hosts.reimage started by jclark@cumin1003 for host tools-k8s-worker1007.eqiad.wmnet with OS trixie completed:
- tools-k8s-worker1007 (PASS)
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced UEFI HTTP Boot for next reboot
- Host rebooted via Redfish
- Host up (Debian installer)
- Host up (new fresh trixie OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga/Alertmanager
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202602040103_jclark_2437842_tools-k8s-worker1007.out
- configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is optimal
- Icinga downtime removed
- Updated Netbox data from PuppetDB
- Updated Netbox status planned -> active
- The sre.puppet.sync-netbox-hiera cookbook was run successfully
Cookbook cookbooks.sre.hosts.reimage started by jclark@cumin1003 for host tools-k8s-worker1005.eqiad.wmnet with OS trixie completed:
- tools-k8s-worker1005 (PASS)
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced UEFI HTTP Boot for next reboot
- Host rebooted via Redfish
- Host up (Debian installer)
- Host up (new fresh trixie OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga/Alertmanager
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202602040059_jclark_2437713_tools-k8s-worker1005.out
- configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is optimal
- Icinga downtime removed
- Updated Netbox data from PuppetDB
- Updated Netbox status planned -> active
- The sre.puppet.sync-netbox-hiera cookbook was run successfully
Cookbook cookbooks.sre.hosts.reimage was started by jclark@cumin1003 for host tools-k8s-worker1008.eqiad.wmnet with OS trixie
Cookbook cookbooks.sre.hosts.reimage started by jclark@cumin1003 for host tools-k8s-worker1004.eqiad.wmnet with OS trixie completed:
- tools-k8s-worker1004 (PASS)
- Downtimed on Icinga/Alertmanager
- Disabled Puppet
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced UEFI HTTP Boot for next reboot
- Host rebooted via Redfish
- Host up (Debian installer)
- Host up (new fresh trixie OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga/Alertmanager
- Removed previous downtime on Alertmanager (old OS)
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202602040033_jclark_2422713_tools-k8s-worker1004.out
- configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is optimal
- Icinga downtime removed
- Updated Netbox data from PuppetDB
Cookbook cookbooks.sre.hosts.reimage was started by jclark@cumin1003 for host tools-k8s-worker1007.eqiad.wmnet with OS trixie
Cookbook cookbooks.sre.hosts.reimage started by jclark@cumin1003 for host tools-k8s-worker1003.eqiad.wmnet with OS trixie completed:
- tools-k8s-worker1003 (PASS)
- Downtimed on Icinga/Alertmanager
- Disabled Puppet
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced UEFI HTTP Boot for next reboot
- Host rebooted via Redfish
- Host up (Debian installer)
- Host up (new fresh trixie OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga/Alertmanager
- Removed previous downtime on Alertmanager (old OS)
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202602040029_jclark_2422762_tools-k8s-worker1003.out
- configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is optimal
- Icinga downtime removed
- Updated Netbox data from PuppetDB
Cookbook cookbooks.sre.hosts.reimage was started by jclark@cumin1003 for host tools-k8s-worker1006.eqiad.wmnet with OS trixie
Cookbook cookbooks.sre.hosts.reimage was started by jclark@cumin1003 for host tools-k8s-worker1005.eqiad.wmnet with OS trixie
Cookbook cookbooks.sre.hosts.reimage started by jclark@cumin1003 for host tools-k8s-worker1002.eqiad.wmnet with OS trixie completed:
- tools-k8s-worker1002 (PASS)
- Downtimed on Icinga/Alertmanager
- Disabled Puppet
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced UEFI HTTP Boot for next reboot
- Host rebooted via Redfish
- Host up (Debian installer)
- Host up (new fresh trixie OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga/Alertmanager
- Removed previous downtime on Alertmanager (old OS)
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202602040025_jclark_2422203_tools-k8s-worker1002.out
- configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is optimal
- Icinga downtime removed
- Updated Netbox data from PuppetDB
Cookbook cookbooks.sre.hosts.reimage started by jclark@cumin1003 for host tools-k8s-worker1001.eqiad.wmnet with OS trixie completed:
- tools-k8s-worker1001 (PASS)
- Downtimed on Icinga/Alertmanager
- Disabled Puppet
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced UEFI HTTP Boot for next reboot
- Host rebooted via Redfish
- Host up (Debian installer)
- Host up (new fresh trixie OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga/Alertmanager
- Removed previous downtime on Alertmanager (old OS)
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202602040021_jclark_2422123_tools-k8s-worker1001.out
- configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is optimal
- Icinga downtime removed
- Updated Netbox data from PuppetDB
Cookbook cookbooks.sre.hosts.reimage started by jclark@cumin1003 for host tools-k8s-ctrl1001.eqiad.wmnet with OS trixie completed:
- tools-k8s-ctrl1001 (PASS)
- Downtimed on Icinga/Alertmanager
- Disabled Puppet
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced UEFI HTTP Boot for next reboot
- Host rebooted via Redfish
- Host up (Debian installer)
- Host up (new fresh trixie OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga/Alertmanager
- Removed previous downtime on Alertmanager (old OS)
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202602040017_jclark_2421485_tools-k8s-ctrl1001.out
- configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is optimal
- Icinga downtime removed
- Updated Netbox data from PuppetDB
Cookbook cookbooks.sre.hosts.reimage started by jclark@cumin1003 for host tools-k8s-ctrl1002.eqiad.wmnet with OS trixie completed:
- tools-k8s-ctrl1002 (PASS)
- Downtimed on Icinga/Alertmanager
- Disabled Puppet
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced UEFI HTTP Boot for next reboot
- Host rebooted via Redfish
- Host up (Debian installer)
- Host up (new fresh trixie OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga/Alertmanager
- Removed previous downtime on Alertmanager (old OS)
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202602040013_jclark_2421475_tools-k8s-ctrl1002.out
- configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is optimal
- Icinga downtime removed
- Updated Netbox data from PuppetDB
Cookbook cookbooks.sre.hosts.reimage was started by jclark@cumin1003 for host tools-k8s-worker1003.eqiad.wmnet with OS trixie
Cookbook cookbooks.sre.hosts.reimage was started by jclark@cumin1003 for host tools-k8s-worker1004.eqiad.wmnet with OS trixie
Cookbook cookbooks.sre.hosts.reimage was started by jclark@cumin1003 for host tools-k8s-worker1002.eqiad.wmnet with OS trixie
Cookbook cookbooks.sre.hosts.reimage was started by jclark@cumin1003 for host tools-k8s-worker1001.eqiad.wmnet with OS trixie
Tue, Feb 3
Cookbook cookbooks.sre.hosts.reimage was started by jclark@cumin1003 for host tools-k8s-ctrl1002.eqiad.wmnet with OS trixie
Cookbook cookbooks.sre.hosts.reimage was started by jclark@cumin1003 for host tools-k8s-ctrl1001.eqiad.wmnet with OS trixie
Cookbook cookbooks.sre.hosts.reimage started by jclark@cumin1003 for host backup1015.eqiad.wmnet with OS bookworm executed with errors:
- backup1015 (FAIL)
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced UEFI HTTP Boot for next reboot
- Host rebooted via Redfish
- Host up (Debian installer)
- The reimage failed, see the cookbook logs for the details. You can also try typing "sudo install-console backup1015.eqiad.wmnet" to get a root shell, but depending on the failure this may not work.
Cookbook cookbooks.sre.hosts.reimage started by cmooney@cumin1003 for host ms-fe1024.eqiad.wmnet with OS bullseye executed with errors:
- ms-fe1024 (FAIL)
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced UEFI HTTP Boot for next reboot
- Host rebooted via Redfish
- Host up (Debian installer)
- The reimage failed, see the cookbook logs for the details. You can also try typing "sudo install-console ms-fe1024.eqiad.wmnet" to get a root shell, but depending on the failure this may not work.
Cookbook cookbooks.sre.hosts.reimage was started by jclark@cumin1003 for host backup1015.eqiad.wmnet with OS bookworm
Cookbook cookbooks.sre.hosts.reimage was started by cmooney@cumin1003 for host ms-fe1024.eqiad.wmnet with OS bullseye
Cookbook cookbooks.sre.hosts.reimage started by cmooney@cumin1003 for host ms-fe1024.eqiad.wmnet with OS bullseye executed with errors:
- ms-fe1024 (FAIL)
- Host successfully migrated to the new VLAN
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced UEFI HTTP Boot for next reboot
- Host rebooted via Redfish
- The reimage failed, see the cookbook logs for the details. You can also try typing "sudo install-console ms-fe1024.eqiad.wmnet" to get a root shell, but depending on the failure this may not work.
Cookbook cookbooks.sre.hosts.reimage started by jclark@cumin1003 for host ms-fe1021.eqiad.wmnet with OS bullseye executed with errors:
- ms-fe1021 (FAIL)
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced UEFI HTTP Boot for next reboot
- Host rebooted via Redfish
- Host up (Debian installer)
- The reimage failed, see the cookbook logs for the details. You can also try typing "sudo install-console ms-fe1021.eqiad.wmnet" to get a root shell, but depending on the failure this may not work.
Cookbook cookbooks.sre.hosts.reimage was started by cmooney@cumin1003 for host ms-fe1024.eqiad.wmnet with OS bullseye
Cookbook cookbooks.sre.hosts.reimage started by jclark@cumin1003 for host backup1015.eqiad.wmnet with OS bookworm executed with errors:
- backup1015 (FAIL)
- Host successfully migrated to the new VLAN
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced UEFI HTTP Boot for next reboot
- Host rebooted via Redfish
- The reimage failed, see the cookbook logs for the details. You can also try typing "sudo install-console backup1015.eqiad.wmnet" to get a root shell, but depending on the failure this may not work.
Cookbook cookbooks.sre.hosts.reimage was started by jclark@cumin1003 for host ms-fe1023.eqiad.wmnet with OS bullseye
Cookbook cookbooks.sre.hosts.reimage was started by jclark@cumin1003 for host ms-fe1021.eqiad.wmnet with OS bullseye
Cookbook cookbooks.sre.hosts.reimage was started by jclark@cumin1003 for host backup1015.eqiad.wmnet with OS bookworm
Mon, Feb 2
Completed pooling of db1193 by marostegui@cumin1003: After schema change
Starting pool of db1193 by marostegui@cumin1003: After schema change
Completed pooling of db1193 by marostegui@cumin1003: After schema change
Completed pooling of db1222 by marostegui@cumin1003: After schema change
Starting pool of db1193 by marostegui@cumin1003: After schema change
Starting pool of db1222 by marostegui@cumin1003: After schema change
Starting pool of db1222 by marostegui@cumin1003: After schema change
Completed pooling of db2249 by marostegui@cumin1003: After reimage
Starting pool of db2249 by marostegui@cumin1003: After reimage
Cookbook cookbooks.sre.hosts.reimage started by marostegui@cumin1003 for host db2249.codfw.wmnet with OS trixie completed:
- db2249 (WARN)
- Downtimed on Icinga/Alertmanager
- Disabled Puppet
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced UEFI HTTP Boot for next reboot
- Host rebooted via Redfish
- Host up (Debian installer)
- Host up (new fresh trixie OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga/Alertmanager
- Removed previous downtime on Alertmanager (old OS)
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202602020924_marostegui_898222_db2249.out
- configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is not optimal, downtime not removed
- Updated Netbox data from PuppetDB
Cookbook cookbooks.sre.hosts.reimage was started by marostegui@cumin1003 for host db2249.codfw.wmnet with OS trixie
Sat, Jan 31
Fri, Jan 30
Section s5: Wikis pplwiki set up on clouddb - marostegui@cumin1003
Section s5: Wikis pplwiki redacted - marostegui@cumin1003
Thu, Jan 29
Completed pooling of db1201 by marostegui@cumin1003: After schema change
Completed pooling of db1210 by marostegui@cumin1003: After schema change
Starting pool of db1201 by marostegui@cumin1003: After schema change
Starting pool of db1210 by marostegui@cumin1003: After schema change
Completed pooling of db2212 by marostegui@cumin1003: After schema change
Starting pool of db2212 by marostegui@cumin1003: After schema change
Wed, Jan 28
Completed pooling of db1163 by marostegui@cumin1003: After schema change
Starting pool of db1163 by marostegui@cumin1003: After schema change
Tue, Jan 27
Completed pooling of db2248 by marostegui@cumin1003: After reimage
Starting pool of db2248 by marostegui@cumin1003: After reimage
Cookbook cookbooks.sre.hosts.reimage started by marostegui@cumin1003 for host db2248.codfw.wmnet with OS trixie completed:
- db2248 (WARN)
- Downtimed on Icinga/Alertmanager
- Disabled Puppet
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced PXE for next reboot
- Host rebooted via IPMI
- Host up (Debian installer)
- Checked BIOS boot parameters are back to normal
- Host up (new fresh trixie OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga/Alertmanager
- Removed previous downtime on Alertmanager (old OS)
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202601270739_marostegui_3854760_db2248.out
- configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is not optimal, downtime not removed
- Updated Netbox data from PuppetDB
Cookbook cookbooks.sre.hosts.reimage was started by marostegui@cumin1003 for host db2248.codfw.wmnet with OS trixie
Completed depooling of db2248 by marostegui@cumin1003: Reimage
Mon, Jan 26
Finished cloning db1224.eqiad.wmnet to db1264.eqiad.wmnet - marostegui@cumin1003
Completed pool of db1264 gradually with 4 steps - Pool db1264.eqiad.wmnet in after cloning - marostegui@cumin1003
Start pool of db1264 gradually with 4 steps - Pool db1264.eqiad.wmnet in after cloning - marostegui@cumin1003
Completed pool of db1224 gradually with 4 steps - Pool db1224.eqiad.wmnet in after cloning - marostegui@cumin1003
Section s5: Wikis kajwiki set up on clouddb - marostegui@cumin1003
Section s5: Wikis kajwiki redacted - marostegui@cumin1003
Start pool of db1224 gradually with 4 steps - Pool db1224.eqiad.wmnet in after cloning - marostegui@cumin1003
Cookbook cookbooks.sre.hosts.reimage started by marostegui@cumin1003 for host db1264.eqiad.wmnet with OS trixie completed:
- db1264 (WARN)
- Downtimed on Icinga/Alertmanager
- Disabled Puppet
- Removed from Puppet and PuppetDB if present and deleted any certificates
- Removed from Debmonitor if present
- Forced PXE for next reboot
- Host rebooted via IPMI
- Host up (Debian installer)
- Checked BIOS boot parameters are back to normal
- Host up (new fresh trixie OS)
- Generated Puppet certificate
- Signed new Puppet certificate
- Run Puppet in NOOP mode to populate exported resources in PuppetDB
- Found Nagios_host resource for this host in PuppetDB
- Downtimed the new host on Icinga/Alertmanager
- Removed previous downtime on Alertmanager (old OS)
- First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202601260908_marostegui_3710388_db1264.out
- configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
- Rebooted
- Automatic Puppet run was successful
- Forced a re-check of all Icinga services for the host
- Icinga status is not optimal, downtime not removed
- Updated Netbox data from PuppetDB
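The finished reimage entries in this feed share a regular shape: a header line ending in "completed:" or "executed with errors:", followed by a "- host (PASS|WARN|FAIL)" line and then the individual steps. As a minimal sketch, assuming the feed has been saved as plain text lines in exactly that shape, the outcomes could be tallied as follows. The names `HEADER`, `STATUS`, and `summarize` are hypothetical helpers for illustration and are not part of Spicerack or the cookbooks.

```python
import re
from collections import Counter

# Header of a finished reimage entry (assumed format, taken from the feed text), e.g.
# "Cookbook cookbooks.sre.hosts.reimage started by USER@CUMIN for host FQDN with OS NAME completed:"
HEADER = re.compile(
    r"^Cookbook cookbooks\.sre\.hosts\.reimage started by (?P<user>\S+)@\S+ "
    r"for host (?P<host>\S+) with OS (?P<os>\S+) "
    r"(?P<outcome>completed|executed with errors):$"
)
# Status line that directly follows the header, e.g. "- bast1004 (PASS)"
STATUS = re.compile(r"^- (?P<name>\S+) \((?P<status>PASS|WARN|FAIL)\)$")


def summarize(lines):
    """Return (host, os, status) for each finished reimage entry."""
    results = []
    pending = None  # header match still waiting for its status line
    for line in lines:
        line = line.strip()
        header = HEADER.match(line)
        if header:
            pending = header
            continue
        status = STATUS.match(line)
        if status and pending:
            results.append((pending["host"], pending["os"], status["status"]))
        # Any non-header line ends the wait for a status line.
        pending = None
    return results


sample = [
    "Cookbook cookbooks.sre.hosts.reimage started by jclark@cumin1003 for host bast1004.wikimedia.org with OS trixie completed:",
    "- bast1004 (PASS)",
    "- Downtimed on Icinga/Alertmanager",
    "Cookbook cookbooks.sre.hosts.reimage started by jclark@cumin1003 for host ms-fe1022.eqiad.wmnet with OS bullseye executed with errors:",
    "- ms-fe1022 (FAIL)",
    "Cookbook cookbooks.sre.hosts.reimage was started by jclark@cumin1003 for host ms-fe1022.eqiad.wmnet with OS bullseye",
]
results = summarize(sample)
counts = Counter(status for _, _, status in results)
print(counts)
```

The "was started by" lines are skipped on purpose, since they only announce a run and carry no outcome.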