
ops-monitoring-bot (Operations Monitoring Bot)
User · Bot

User Details

User Since: Aug 12 2016, 1:45 PM (487 w, 4 d)
Roles: Bot
Availability: Available
LDAP User: Unknown
MediaWiki User: Unknown

Bot managed by SRE for automated interaction with Phabricator from monitoring tools.

Recent Activity

Tue, Dec 16

ops-monitoring-bot added a comment to T412807: Dell R740xd reimage fails in debian-installer, configures IP on incorrect interface.

Cookbook cookbooks.sre.hosts.reimage started by cmooney@cumin1003 for host es2028.codfw.wmnet with OS trixie executed with errors:

  • es2028 (FAIL)
    • Host successfully migrated to the new VLAN
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced UEFI HTTP Boot for next reboot
    • Host rebooted via Redfish
    • The reimage failed, see the cookbook logs for the details. You can also try typing "sudo install-console es2028.codfw.wmnet" to get a root shell, but depending on the failure this may not work.
Tue, Dec 16, 6:38 PM · Infrastructure-Foundations, SRE
ops-monitoring-bot added a comment to T412807: Dell R740xd reimage fails in debian-installer, configures IP on incorrect interface.

Cookbook cookbooks.sre.hosts.reimage was started by cmooney@cumin1003 for host es2028.codfw.wmnet with OS trixie

Tue, Dec 16, 5:14 PM · Infrastructure-Foundations, SRE
ops-monitoring-bot added a comment to T412825: Deploy Phab/Phorge 2025-12-16.

Icinga downtime and Alertmanager silence (ID=6fbc002b-f58a-4a1e-8c1b-328bb8dc684d) set by jelto@cumin1003 for 0:30:00 on 1 host(s) and their services with reason: Phabricator deploy

phab2002.codfw.wmnet
Tue, Dec 16, 4:02 PM · Release-Engineering-Team (Doing 😎), User-brennen, collaboration-services, Phabricator (2025-12-16)
ops-monitoring-bot added a comment to T412825: Deploy Phab/Phorge 2025-12-16.

Icinga downtime and Alertmanager silence (ID=767d0770-34be-4080-b717-6fcd10af3517) set by jelto@cumin1003 for 0:30:00 on 1 host(s) and their services with reason: Phabricator deploy

phab1004.eqiad.wmnet
Tue, Dec 16, 4:02 PM · Release-Engineering-Team (Doing 😎), User-brennen, collaboration-services, Phabricator (2025-12-16)
ops-monitoring-bot added a comment to T407472: Install a testing db with Debian Trixie.

Cookbook cookbooks.sre.hosts.reimage started by marostegui@cumin1003 for host es2028.codfw.wmnet with OS trixie executed with errors:

  • es2028 (FAIL)
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced UEFI HTTP Boot for next reboot
    • Host rebooted via Redfish
    • The reimage failed, see the cookbook logs for the details. You can also try typing "sudo install-console es2028.codfw.wmnet" to get a root shell, but depending on the failure this may not work.
Tue, Dec 16, 1:55 PM · DBA
ops-monitoring-bot added a comment to T365798: Shutdown of Puppet 5 servers.

cookbooks.sre.hosts.decommission executed by jmm@cumin2002 for hosts: puppetmaster1003.eqiad.wmnet

  • puppetmaster1003.eqiad.wmnet (PASS)
    • Downtimed host on Icinga/Alertmanager
    • Found physical host
    • Downtimed management interface on Alertmanager
    • Wiped all swraid, partition-table and filesystem signatures
    • Powered off
    • [Netbox] Set status to Decommissioning, deleted all non-mgmt IPs, updated switch interfaces (disabled, removed vlans, etc)
    • Configured the linked switch interface(s)
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB
Tue, Dec 16, 12:55 PM · Patch-For-Review, Puppet-Infrastructure, SRE, Infrastructure-Foundations
ops-monitoring-bot added a comment to T407472: Install a testing db with Debian Trixie.

Cookbook cookbooks.sre.hosts.reimage was started by marostegui@cumin1003 for host es2028.codfw.wmnet with OS trixie

Tue, Dec 16, 12:35 PM · DBA
ops-monitoring-bot added a comment to T407472: Install a testing db with Debian Trixie.

Cookbook cookbooks.sre.hosts.reimage started by marostegui@cumin1003 for host es2028.codfw.wmnet with OS trixie executed with errors:

  • es2028 (FAIL)
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • The reimage failed, see the cookbook logs for the details. You can also try typing "sudo install-console es2028.codfw.wmnet" to get a root shell, but depending on the failure this may not work.
Tue, Dec 16, 11:23 AM · DBA
ops-monitoring-bot added a comment to T407472: Install a testing db with Debian Trixie.

Cookbook cookbooks.sre.hosts.reimage was started by marostegui@cumin1003 for host es2028.codfw.wmnet with OS trixie

Tue, Dec 16, 10:46 AM · DBA
ops-monitoring-bot added a comment to T407472: Install a testing db with Debian Trixie.

Cookbook cookbooks.sre.hosts.reimage started by marostegui@cumin1003 for host es2028.codfw.wmnet with OS trixie executed with errors:

  • es2028 (FAIL)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • The reimage failed, see the cookbook logs for the details. You can also try typing "sudo install-console es2028.codfw.wmnet" to get a root shell, but depending on the failure this may not work.
Tue, Dec 16, 10:45 AM · DBA
ops-monitoring-bot added a comment to T407472: Install a testing db with Debian Trixie.

Cookbook cookbooks.sre.hosts.reimage was started by marostegui@cumin1003 for host es2028.codfw.wmnet with OS trixie

Tue, Dec 16, 10:34 AM · DBA
ops-monitoring-bot added a comment to T370018: gitlab2002: wrong network for public IPV4 and IPV6.

Host gitlab1003.wikimedia.org rebooted by jelto@cumin1003 with reason: maintenance reboot for new subnets

Tue, Dec 16, 9:58 AM · Patch-For-Review, collaboration-services, Infrastructure-Foundations, SRE
ops-monitoring-bot added a comment to T370018: gitlab2002: wrong network for public IPV4 and IPV6.

Host gitlab2002.wikimedia.org rebooted by jelto@cumin1003 with reason: maintenance reboot for new subnets

Tue, Dec 16, 9:46 AM · Patch-For-Review, collaboration-services, Infrastructure-Foundations, SRE
ops-monitoring-bot added a comment to T365798: Shutdown of Puppet 5 servers.

cookbooks.sre.hosts.decommission executed by jmm@cumin2002 for hosts: puppetmaster2002.codfw.wmnet

  • puppetmaster2002.codfw.wmnet (PASS)
    • Downtimed host on Icinga/Alertmanager
    • Found physical host
    • Downtimed management interface on Alertmanager
    • Wiped all swraid, partition-table and filesystem signatures
    • Powered off
    • [Netbox] Set status to Decommissioning, deleted all non-mgmt IPs, updated switch interfaces (disabled, removed vlans, etc)
    • Configured the linked switch interface(s)
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB
Tue, Dec 16, 9:12 AM · Patch-For-Review, Puppet-Infrastructure, SRE, Infrastructure-Foundations

Mon, Dec 15

ops-monitoring-bot added a comment to T370018: gitlab2002: wrong network for public IPV4 and IPV6.

Host gitlab2002.wikimedia.org rebooted by jelto@cumin1003 with reason: maintenance reboot for new subnets

Mon, Dec 15, 2:03 PM · Patch-For-Review, collaboration-services, Infrastructure-Foundations, SRE

Fri, Dec 12

ops-monitoring-bot added a comment to T405406: Q2:rack/setup/install dse-k8s-worker200[45].

Cookbook cookbooks.sre.hosts.reimage started by jhancock@cumin1003 for host dse-k8s-worker2005.codfw.wmnet with OS bookworm completed:

  • dse-k8s-worker2005 (PASS)
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced UEFI HTTP Boot for next reboot
    • Host rebooted via Redfish
    • Host up (Debian installer)
    • Add puppet_version metadata (7) to Debian installer
    • Host up (new fresh bookworm OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202512121730_jhancock_4105277_dse-k8s-worker2005.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB
    • Updated Netbox status planned -> active
    • The sre.puppet.sync-netbox-hiera cookbook was run successfully
Fri, Dec 12, 7:36 PM · Essential-Work, Data-Platform-SRE (2025.09.26 - 2025.10.17), SRE, ops-codfw, DC-Ops
ops-monitoring-bot added a comment to T405406: Q2:rack/setup/install dse-k8s-worker200[45].

Cookbook cookbooks.sre.hosts.reimage was started by jhancock@cumin1003 for host dse-k8s-worker2005.codfw.wmnet with OS bookworm

Fri, Dec 12, 5:16 PM · Essential-Work, Data-Platform-SRE (2025.09.26 - 2025.10.17), SRE, ops-codfw, DC-Ops
ops-monitoring-bot added a comment to T405966: Q2:rack/setup/install ganeti-jumbo100[1-3].

Cookbook cookbooks.sre.hosts.reimage started by jclark@cumin1003 for host ganeti-jumbo1002.eqiad.wmnet with OS trixie completed:

  • ganeti-jumbo1002 (WARN)
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced UEFI HTTP Boot for next reboot
    • Host rebooted via Redfish
    • Host up (Debian installer)
    • Add puppet_version metadata (7) to Debian installer
    • Host up (new fresh trixie OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202512121404_jclark_4010144_ganeti-jumbo1002.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB
    • Updated Netbox status planned -> active
    • Failed to run the sre.puppet.sync-netbox-hiera cookbook, run it manually
Fri, Dec 12, 2:51 PM · Data-Platform-SRE (2025.11.07 - 2025.11.28), Essential-Work, SRE, ops-eqiad, DC-Ops
ops-monitoring-bot added a comment to T405966: Q2:rack/setup/install ganeti-jumbo100[1-3].

Cookbook cookbooks.sre.hosts.reimage started by jclark@cumin1003 for host ganeti-jumbo1003.eqiad.wmnet with OS trixie completed:

  • ganeti-jumbo1003 (PASS)
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced UEFI HTTP Boot for next reboot
    • Host rebooted via Redfish
    • Host up (Debian installer)
    • Add puppet_version metadata (7) to Debian installer
    • Host up (new fresh trixie OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202512121418_jclark_4024268_ganeti-jumbo1003.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB
    • Updated Netbox status planned -> active
    • The sre.puppet.sync-netbox-hiera cookbook was run successfully
Fri, Dec 12, 2:51 PM · Data-Platform-SRE (2025.11.07 - 2025.11.28), Essential-Work, SRE, ops-eqiad, DC-Ops
ops-monitoring-bot added a comment to T405966: Q2:rack/setup/install ganeti-jumbo100[1-3].

Cookbook cookbooks.sre.hosts.reimage was started by jclark@cumin1003 for host ganeti-jumbo1003.eqiad.wmnet with OS trixie

Fri, Dec 12, 2:03 PM · Data-Platform-SRE (2025.11.07 - 2025.11.28), Essential-Work, SRE, ops-eqiad, DC-Ops
ops-monitoring-bot added a comment to T405966: Q2:rack/setup/install ganeti-jumbo100[1-3].

Cookbook cookbooks.sre.hosts.reimage was started by jclark@cumin1003 for host ganeti-jumbo1002.eqiad.wmnet with OS trixie

Fri, Dec 12, 1:50 PM · Data-Platform-SRE (2025.11.07 - 2025.11.28), Essential-Work, SRE, ops-eqiad, DC-Ops
ops-monitoring-bot added a comment to T412235: Dedicated puppet role to support testing of alternatives to Blazegraph.

Cookbook cookbooks.sre.hosts.reimage started by gehel@cumin1003 for host wdqs1029.eqiad.wmnet with OS trixie executed with errors:

  • wdqs1029 (FAIL)
    • The reimage failed, see the cookbook logs for the details. You can also try typing "sudo install-console wdqs1029.eqiad.wmnet" to get a root shell, but depending on the failure this may not work.
Fri, Dec 12, 1:22 PM · Wikidata, Wikidata-Query-Service, Data-Platform-SRE (2025.11.07 - 2025.11.28)
ops-monitoring-bot added a comment to T412235: Dedicated puppet role to support testing of alternatives to Blazegraph.

Cookbook cookbooks.sre.hosts.reimage was started by gehel@cumin1003 for host wdqs1029.eqiad.wmnet with OS trixie

Fri, Dec 12, 1:22 PM · Wikidata, Wikidata-Query-Service, Data-Platform-SRE (2025.11.07 - 2025.11.28)
ops-monitoring-bot added a comment to T412235: Dedicated puppet role to support testing of alternatives to Blazegraph.

Cookbook cookbooks.sre.hosts.reimage started by gehel@cumin1003 for host wdqs1032.eqiad.wmnet with OS trixie executed with errors:

  • wdqs1032 (FAIL)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced UEFI HTTP Boot for next reboot
    • Host rebooted via Redfish
    • Host up (Debian installer)
    • Add puppet_version metadata (7) to Debian installer
    • Host up (new fresh trixie OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • Removed previous downtime on Alertmanager (old OS)
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202512120914_gehel_3706738_wdqs1032.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • The reimage failed, see the cookbook logs for the details. You can also try typing "sudo install-console wdqs1032.eqiad.wmnet" to get a root shell, but depending on the failure this may not work.
Fri, Dec 12, 10:47 AM · Wikidata, Wikidata-Query-Service, Data-Platform-SRE (2025.11.07 - 2025.11.28)
ops-monitoring-bot added a comment to T412235: Dedicated puppet role to support testing of alternatives to Blazegraph.

Cookbook cookbooks.sre.hosts.reimage was started by gehel@cumin1003 for host wdqs1032.eqiad.wmnet with OS trixie

Fri, Dec 12, 8:51 AM · Wikidata, Wikidata-Query-Service, Data-Platform-SRE (2025.11.07 - 2025.11.28)
ops-monitoring-bot created T412497: Degraded RAID on db2166.
Fri, Dec 12, 7:26 AM · DBA, DC-Ops, SRE, ops-codfw

Thu, Dec 11

ops-monitoring-bot added a comment to T405406: Q2:rack/setup/install dse-k8s-worker200[45].

Cookbook cookbooks.sre.hosts.reimage started by jhancock@cumin1003 for host dse-k8s-worker2005.codfw.wmnet with OS bookworm executed with errors:

  • dse-k8s-worker2005 (FAIL)
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced UEFI HTTP Boot for next reboot
    • Host rebooted via Redfish
    • The reimage failed, see the cookbook logs for the details. You can also try typing "sudo install-console dse-k8s-worker2005.codfw.wmnet" to get a root shell, but depending on the failure this may not work.
Thu, Dec 11, 11:25 PM · Essential-Work, Data-Platform-SRE (2025.09.26 - 2025.10.17), SRE, ops-codfw, DC-Ops
ops-monitoring-bot added a comment to T405406: Q2:rack/setup/install dse-k8s-worker200[45].

Cookbook cookbooks.sre.hosts.reimage started by jhancock@cumin1003 for host dse-k8s-worker2004.codfw.wmnet with OS bookworm completed:

  • dse-k8s-worker2004 (PASS)
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced UEFI HTTP Boot for next reboot
    • Host rebooted via Redfish
    • Host up (Debian installer)
    • Add puppet_version metadata (7) to Debian installer
    • Host up (new fresh bookworm OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202512112219_jhancock_3621874_dse-k8s-worker2004.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB
    • Updated Netbox status planned -> active
    • The sre.puppet.sync-netbox-hiera cookbook was run successfully
Thu, Dec 11, 10:40 PM · Essential-Work, Data-Platform-SRE (2025.09.26 - 2025.10.17), SRE, ops-codfw, DC-Ops
ops-monitoring-bot added a comment to T405406: Q2:rack/setup/install dse-k8s-worker200[45].

Cookbook cookbooks.sre.hosts.reimage was started by jhancock@cumin1003 for host dse-k8s-worker2005.codfw.wmnet with OS bookworm

Thu, Dec 11, 10:05 PM · Essential-Work, Data-Platform-SRE (2025.09.26 - 2025.10.17), SRE, ops-codfw, DC-Ops
ops-monitoring-bot added a comment to T405406: Q2:rack/setup/install dse-k8s-worker200[45].

Cookbook cookbooks.sre.hosts.reimage was started by jhancock@cumin1003 for host dse-k8s-worker2004.codfw.wmnet with OS bookworm

Thu, Dec 11, 10:05 PM · Essential-Work, Data-Platform-SRE (2025.09.26 - 2025.10.17), SRE, ops-codfw, DC-Ops
ops-monitoring-bot added a comment to T412235: Dedicated puppet role to support testing of alternatives to Blazegraph.

Cookbook cookbooks.sre.hosts.reimage started by gehel@cumin1003 for host wdqs1031.eqiad.wmnet with OS trixie executed with errors:

  • wdqs1031 (FAIL)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced UEFI HTTP Boot for next reboot
    • Host rebooted via Redfish
    • Host up (Debian installer)
    • Add puppet_version metadata (7) to Debian installer
    • Host up (new fresh trixie OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • Removed previous downtime on Alertmanager (old OS)
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202512112034_gehel_3605396_wdqs1031.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • The reimage failed, see the cookbook logs for the details. You can also try typing "sudo install-console wdqs1031.eqiad.wmnet" to get a root shell, but depending on the failure this may not work.
Thu, Dec 11, 9:52 PM · Wikidata, Wikidata-Query-Service, Data-Platform-SRE (2025.11.07 - 2025.11.28)
ops-monitoring-bot added a comment to T405966: Q2:rack/setup/install ganeti-jumbo100[1-3].

Cookbook cookbooks.sre.hosts.reimage started by jclark@cumin1003 for host ganeti-jumbo1002.eqiad.wmnet with OS trixie executed with errors:

  • ganeti-jumbo1002 (FAIL)
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced UEFI HTTP Boot for next reboot
    • Host rebooted via Redfish
    • The reimage failed, see the cookbook logs for the details. You can also try typing "sudo install-console ganeti-jumbo1002.eqiad.wmnet" to get a root shell, but depending on the failure this may not work.
Thu, Dec 11, 9:14 PM · Data-Platform-SRE (2025.11.07 - 2025.11.28), Essential-Work, SRE, ops-eqiad, DC-Ops
ops-monitoring-bot added a comment to T405966: Q2:rack/setup/install ganeti-jumbo100[1-3].

Cookbook cookbooks.sre.hosts.reimage started by jclark@cumin1003 for host ganeti-jumbo1001.eqiad.wmnet with OS trixie completed:

  • ganeti-jumbo1001 (PASS)
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced UEFI HTTP Boot for next reboot
    • Host rebooted via Redfish
    • Host up (Debian installer)
    • Add puppet_version metadata (7) to Debian installer
    • Host up (new fresh trixie OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202512112009_jclark_3604593_ganeti-jumbo1001.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB
    • Updated Netbox status planned -> active
    • The sre.puppet.sync-netbox-hiera cookbook was run successfully
Thu, Dec 11, 8:50 PM · Data-Platform-SRE (2025.11.07 - 2025.11.28), Essential-Work, SRE, ops-eqiad, DC-Ops
ops-monitoring-bot added a comment to T412235: Dedicated puppet role to support testing of alternatives to Blazegraph.

Cookbook cookbooks.sre.hosts.reimage was started by gehel@cumin1003 for host wdqs1031.eqiad.wmnet with OS trixie

Thu, Dec 11, 8:11 PM · Wikidata, Wikidata-Query-Service, Data-Platform-SRE (2025.11.07 - 2025.11.28)
ops-monitoring-bot added a comment to T405966: Q2:rack/setup/install ganeti-jumbo100[1-3].

Cookbook cookbooks.sre.hosts.reimage was started by jclark@cumin1003 for host ganeti-jumbo1002.eqiad.wmnet with OS trixie

Thu, Dec 11, 7:54 PM · Data-Platform-SRE (2025.11.07 - 2025.11.28), Essential-Work, SRE, ops-eqiad, DC-Ops
ops-monitoring-bot added a comment to T405966: Q2:rack/setup/install ganeti-jumbo100[1-3].

Cookbook cookbooks.sre.hosts.reimage was started by jclark@cumin1003 for host ganeti-jumbo1001.eqiad.wmnet with OS trixie

Thu, Dec 11, 7:52 PM · Data-Platform-SRE (2025.11.07 - 2025.11.28), Essential-Work, SRE, ops-eqiad, DC-Ops
ops-monitoring-bot added a comment to T412235: Dedicated puppet role to support testing of alternatives to Blazegraph.

Cookbook cookbooks.sre.hosts.reimage started by gehel@cumin1003 for host wdqs1030.eqiad.wmnet with OS trixie executed with errors:

  • wdqs1030 (FAIL)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced UEFI HTTP Boot for next reboot
    • Host rebooted via Redfish
    • Host up (Debian installer)
    • Add puppet_version metadata (7) to Debian installer
    • Host up (new fresh trixie OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • Removed previous downtime on Alertmanager (old OS)
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202512111547_gehel_3532056_wdqs1030.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • The reimage failed, see the cookbook logs for the details. You can also try typing "sudo install-console wdqs1030.eqiad.wmnet" to get a root shell, but depending on the failure this may not work.
Thu, Dec 11, 5:20 PM · Wikidata, Wikidata-Query-Service, Data-Platform-SRE (2025.11.07 - 2025.11.28)
ops-monitoring-bot added a comment to T412235: Dedicated puppet role to support testing of alternatives to Blazegraph.

Cookbook cookbooks.sre.hosts.reimage started by gehel@cumin1003 for host wdqs1029.eqiad.wmnet with OS trixie executed with errors:

  • wdqs1029 (FAIL)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced UEFI HTTP Boot for next reboot
    • Host rebooted via Redfish
    • Host up (Debian installer)
    • Add puppet_version metadata (7) to Debian installer
    • Host up (new fresh trixie OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • Removed previous downtime on Alertmanager (old OS)
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202512111543_gehel_3531760_wdqs1029.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • The reimage failed, see the cookbook logs for the details. You can also try typing "sudo install-console wdqs1029.eqiad.wmnet" to get a root shell, but depending on the failure this may not work.
Thu, Dec 11, 5:17 PM · Wikidata, Wikidata-Query-Service, Data-Platform-SRE (2025.11.07 - 2025.11.28)
ops-monitoring-bot added a comment to T412235: Dedicated puppet role to support testing of alternatives to Blazegraph.

Cookbook cookbooks.sre.hosts.reimage was started by gehel@cumin1003 for host wdqs1030.eqiad.wmnet with OS trixie

Thu, Dec 11, 3:24 PM · Wikidata, Wikidata-Query-Service, Data-Platform-SRE (2025.11.07 - 2025.11.28)
ops-monitoring-bot added a comment to T412235: Dedicated puppet role to support testing of alternatives to Blazegraph.

Cookbook cookbooks.sre.hosts.reimage was started by gehel@cumin1003 for host wdqs1029.eqiad.wmnet with OS trixie

Thu, Dec 11, 3:22 PM · Wikidata, Wikidata-Query-Service, Data-Platform-SRE (2025.11.07 - 2025.11.28)
ops-monitoring-bot added a comment to T412235: Dedicated puppet role to support testing of alternatives to Blazegraph.

Cookbook cookbooks.sre.hosts.reimage started by gehel@cumin1003 for host wdqs1028.eqiad.wmnet with OS trixie completed:

  • wdqs1028 (PASS)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced UEFI HTTP Boot for next reboot
    • Host rebooted via Redfish
    • Host up (Debian installer)
    • Add puppet_version metadata (7) to Debian installer
    • Host up (new fresh trixie OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • Removed previous downtime on Alertmanager (old OS)
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202512111454_gehel_3523474_wdqs1028.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB
Thu, Dec 11, 3:13 PM · Wikidata, Wikidata-Query-Service, Data-Platform-SRE (2025.11.07 - 2025.11.28)
ops-monitoring-bot added a comment to T412235: Dedicated puppet role to support testing of alternatives to Blazegraph.

Cookbook cookbooks.sre.hosts.reimage was started by gehel@cumin1003 for host wdqs1028.eqiad.wmnet with OS trixie

Thu, Dec 11, 2:33 PM · Wikidata, Wikidata-Query-Service, Data-Platform-SRE (2025.11.07 - 2025.11.28)
ops-monitoring-bot added a comment to T406796: Q2:rack/setup/install logging-sd100[567].

Cookbook cookbooks.sre.hosts.reimage started by jclark@cumin1003 for host logging-sd1007.eqiad.wmnet with OS bookworm completed:

  • logging-sd1007 (PASS)
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced UEFI HTTP Boot for next reboot
    • Host rebooted via Redfish
    • Host up (Debian installer)
    • Add puppet_version metadata (7) to Debian installer
    • Host up (new fresh bookworm OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202512111350_jclark_3507488_logging-sd1007.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB
    • Updated Netbox status planned -> active
    • The sre.puppet.sync-netbox-hiera cookbook was run successfully
Thu, Dec 11, 2:08 PM · Observability-Logging, SRE, ops-eqiad, DC-Ops
ops-monitoring-bot added a comment to T406796: Q2:rack/setup/install logging-sd100[567].

Cookbook cookbooks.sre.hosts.reimage started by jclark@cumin1003 for host logging-sd1006.eqiad.wmnet with OS bookworm completed:

  • logging-sd1006 (PASS)
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced UEFI HTTP Boot for next reboot
    • Host rebooted via Redfish
    • Host up (Debian installer)
    • Add puppet_version metadata (7) to Debian installer
    • Host up (new fresh bookworm OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202512111347_jclark_3507482_logging-sd1006.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB
    • Updated Netbox status planned -> active
    • The sre.puppet.sync-netbox-hiera cookbook was run successfully
Thu, Dec 11, 2:04 PM · Observability-Logging, SRE, ops-eqiad, DC-Ops
ops-monitoring-bot added a comment to T406796: Q2:rack/setup/install logging-sd100[567].

Cookbook cookbooks.sre.hosts.reimage started by jclark@cumin1003 for host logging-sd1005.eqiad.wmnet with OS bookworm completed:

  • logging-sd1005 (PASS)
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced UEFI HTTP Boot for next reboot
    • Host rebooted via Redfish
    • Host up (Debian installer)
    • Add puppet_version metadata (7) to Debian installer
    • Host up (new fresh bookworm OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202512111343_jclark_3507951_logging-sd1005.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB
    • Updated Netbox status planned -> active
    • The sre.puppet.sync-netbox-hiera cookbook was run successfully
Thu, Dec 11, 2:03 PM · Observability-Logging, SRE, ops-eqiad, DC-Ops
ops-monitoring-bot added a comment to T406796: Q2:rack/setup/install logging-sd100[567].

Cookbook cookbooks.sre.hosts.reimage was started by jclark@cumin1003 for host logging-sd1005.eqiad.wmnet with OS bookworm

Thu, Dec 11, 1:01 PM · Observability-Logging, SRE, ops-eqiad, DC-Ops
ops-monitoring-bot added a comment to T407032: Q2:rack/setup/install aqs102[3-7].

Cookbook cookbooks.sre.hosts.reimage started by jclark@cumin1003 for host aqs1025.eqiad.wmnet with OS bullseye completed:

  • aqs1025 (PASS)
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced UEFI HTTP Boot for next reboot
    • Host rebooted via Redfish
    • Host up (Debian installer)
    • Add puppet_version metadata (7) to Debian installer
    • Host up (new fresh bullseye OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202512111244_jclark_3502152_aqs1025.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB
    • Updated Netbox status planned -> active
    • The sre.puppet.sync-netbox-hiera cookbook was run successfully
Thu, Dec 11, 1:00 PM · SRE, Data-Persistence, ops-eqiad, DC-Ops
ops-monitoring-bot added a comment to T406796: Q2:rack/setup/install logging-sd100[567].

Cookbook cookbooks.sre.hosts.reimage was started by jclark@cumin1003 for host logging-sd1006.eqiad.wmnet with OS bookworm

Thu, Dec 11, 12:54 PM · Observability-Logging, SRE, ops-eqiad, DC-Ops
ops-monitoring-bot added a comment to T406796: Q2:rack/setup/install logging-sd100[567].

Cookbook cookbooks.sre.hosts.reimage was started by jclark@cumin1003 for host logging-sd1007.eqiad.wmnet with OS bookworm

Thu, Dec 11, 12:54 PM · Observability-Logging, SRE, ops-eqiad, DC-Ops
ops-monitoring-bot added a comment to T407032: Q2:rack/setup/install aqs102[3-7].

Cookbook cookbooks.sre.hosts.reimage was started by jclark@cumin1003 for host aqs1025.eqiad.wmnet with OS bullseye

Thu, Dec 11, 12:30 PM · SRE, Data-Persistence, ops-eqiad, DC-Ops

Wed, Dec 10

ops-monitoring-bot added a comment to T405964: Q2:rack/setup/install ganeti-jumbo200[1-3].

Cookbook cookbooks.sre.hosts.reimage started by jhancock@cumin1003 for host ganeti-jumbo2001.codfw.wmnet with OS trixie completed:

  • ganeti-jumbo2001 (PASS)
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced UEFI HTTP Boot for next reboot
    • Host rebooted via Redfish
    • Host up (Debian installer)
    • Add puppet_version metadata (7) to Debian installer
    • Host up (new fresh trixie OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202512102202_jhancock_3400817_ganeti-jumbo2001.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB
    • Updated Netbox status planned -> active
    • The sre.puppet.sync-netbox-hiera cookbook was run successfully
Wed, Dec 10, 10:23 PM · Data-Platform-SRE (2025.11.07 - 2025.11.28), Essential-Work, SRE, ops-codfw, DC-Ops
ops-monitoring-bot added a comment to T405964: Q2:rack/setup/install ganeti-jumbo200[1-3].

Cookbook cookbooks.sre.hosts.reimage started by jhancock@cumin1003 for host ganeti-jumbo2003.codfw.wmnet with OS trixie completed:

  • ganeti-jumbo2003 (PASS)
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced UEFI HTTP Boot for next reboot
    • Host rebooted via Redfish
    • Host up (Debian installer)
    • Add puppet_version metadata (7) to Debian installer
    • Host up (new fresh trixie OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202512102158_jhancock_3400851_ganeti-jumbo2003.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB
    • Updated Netbox status planned -> active
    • The sre.puppet.sync-netbox-hiera cookbook was run successfully
Wed, Dec 10, 10:15 PM · Data-Platform-SRE (2025.11.07 - 2025.11.28), Essential-Work, SRE, ops-codfw, DC-Ops
ops-monitoring-bot added a comment to T405964: Q2:rack/setup/install ganeti-jumbo200[1-3].

Cookbook cookbooks.sre.hosts.reimage started by jhancock@cumin1003 for host ganeti-jumbo2002.codfw.wmnet with OS trixie completed:

  • ganeti-jumbo2002 (PASS)
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced UEFI HTTP Boot for next reboot
    • Host rebooted via Redfish
    • Host up (Debian installer)
    • Add puppet_version metadata (7) to Debian installer
    • Host up (new fresh trixie OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202512102154_jhancock_3400827_ganeti-jumbo2002.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB
    • Updated Netbox status planned -> active
    • The sre.puppet.sync-netbox-hiera cookbook was run successfully
Wed, Dec 10, 10:11 PM · Data-Platform-SRE (2025.11.07 - 2025.11.28), Essential-Work, SRE, ops-codfw, DC-Ops
ops-monitoring-bot added a comment to T405964: Q2:rack/setup/install ganeti-jumbo200[1-3].

Cookbook cookbooks.sre.hosts.reimage was started by jhancock@cumin1003 for host ganeti-jumbo2003.codfw.wmnet with OS trixie

Wed, Dec 10, 9:38 PM · Data-Platform-SRE (2025.11.07 - 2025.11.28), Essential-Work, SRE, ops-codfw, DC-Ops
ops-monitoring-bot added a comment to T405964: Q2:rack/setup/install ganeti-jumbo200[1-3].

Cookbook cookbooks.sre.hosts.reimage was started by jhancock@cumin1003 for host ganeti-jumbo2002.codfw.wmnet with OS trixie

Wed, Dec 10, 9:38 PM · Data-Platform-SRE (2025.11.07 - 2025.11.28), Essential-Work, SRE, ops-codfw, DC-Ops
ops-monitoring-bot added a comment to T405964: Q2:rack/setup/install ganeti-jumbo200[1-3].

Cookbook cookbooks.sre.hosts.reimage was started by jhancock@cumin1003 for host ganeti-jumbo2001.codfw.wmnet with OS trixie

Wed, Dec 10, 9:38 PM · Data-Platform-SRE (2025.11.07 - 2025.11.28), Essential-Work, SRE, ops-codfw, DC-Ops
ops-monitoring-bot added a comment to T407032: Q2:rack/setup/install aqs102[3-7].

Cookbook cookbooks.sre.hosts.reimage started by jclark@cumin1003 for host aqs1026.eqiad.wmnet with OS bullseye completed:

  • aqs1026 (PASS)
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced UEFI HTTP Boot for next reboot
    • Host rebooted via Redfish
    • Host up (Debian installer)
    • Add puppet_version metadata (7) to Debian installer
    • Host up (new fresh bullseye OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202512102044_jclark_3385285_aqs1026.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB
    • Updated Netbox status planned -> active
    • The sre.puppet.sync-netbox-hiera cookbook was run successfully
Wed, Dec 10, 9:02 PM · SRE, Data-Persistence, ops-eqiad, DC-Ops
ops-monitoring-bot added a comment to T407032: Q2:rack/setup/install aqs102[3-7].

Cookbook cookbooks.sre.hosts.reimage started by jclark@cumin1003 for host aqs1024.eqiad.wmnet with OS bullseye completed:

  • aqs1024 (PASS)
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced UEFI HTTP Boot for next reboot
    • Host rebooted via Redfish
    • Host up (Debian installer)
    • Add puppet_version metadata (7) to Debian installer
    • Host up (new fresh bullseye OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202512102037_jclark_3385277_aqs1024.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB
    • Updated Netbox status planned -> active
    • The sre.puppet.sync-netbox-hiera cookbook was run successfully
Wed, Dec 10, 9:00 PM · SRE, Data-Persistence, ops-eqiad, DC-Ops
ops-monitoring-bot added a comment to T407032: Q2:rack/setup/install aqs102[3-7].

Cookbook cookbooks.sre.hosts.reimage started by jclark@cumin1003 for host aqs1027.eqiad.wmnet with OS bullseye completed:

  • aqs1027 (WARN)
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced UEFI HTTP Boot for next reboot
    • Host rebooted via Redfish
    • Host up (Debian installer)
    • Add puppet_version metadata (7) to Debian installer
    • Host up (new fresh bullseye OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202512102041_jclark_3385290_aqs1027.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB
    • Updated Netbox status planned -> active
    • Failed to run the sre.puppet.sync-netbox-hiera cookbook, run it manually
Wed, Dec 10, 9:00 PM · SRE, Data-Persistence, ops-eqiad, DC-Ops
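
A WARN result such as the one above usually comes down to a single step in an otherwise clean list. The sketch below shows one way to pull those steps out of a pasted comment. The marker phrases are a heuristic taken from wording seen in this feed ("Failed to", "Unable to", "not optimal"); they are an assumption, not a list published by the cookbook, and `problem_steps` is a hypothetical helper name.

```python
# Phrases observed in non-PASS steps on this page; an assumed, incomplete list.
PROBLEM_MARKERS = ("Failed to", "Unable to", "not optimal")


def problem_steps(comment: str) -> list[str]:
    """Return the bullet lines of a cookbook comment that look like problems."""
    steps = []
    for line in comment.splitlines():
        # Strip indentation and the leading bullet character.
        text = line.strip().lstrip("•").strip()
        if any(marker in text for marker in PROBLEM_MARKERS):
            steps.append(text)
    return steps


sample = """\
  • aqs1027 (WARN)
    • Updated Netbox data from PuppetDB
    • Updated Netbox status planned -> active
    • Failed to run the sre.puppet.sync-netbox-hiera cookbook, run it manually
"""
result = problem_steps(sample)
print(result)
```

Because matching is by substring, a new failure wording would be missed until it is added to `PROBLEM_MARKERS`.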
ops-monitoring-bot added a comment to T407032: Q2:rack/setup/install aqs102[3-7].

Cookbook cookbooks.sre.hosts.reimage started by jclark@cumin1003 for host aqs1023.eqiad.wmnet with OS bullseye completed:

  • aqs1023 (PASS)
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced UEFI HTTP Boot for next reboot
    • Host rebooted via Redfish
    • Host up (Debian installer)
    • Add puppet_version metadata (7) to Debian installer
    • Host up (new fresh bullseye OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202512102033_jclark_3383037_aqs1023.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB
    • Updated Netbox status planned -> active
    • The sre.puppet.sync-netbox-hiera cookbook was run successfully
Wed, Dec 10, 8:49 PM · SRE, Data-Persistence, ops-eqiad, DC-Ops
ops-monitoring-bot added a comment to T407032: Q2:rack/setup/install aqs102[3-7].

Cookbook cookbooks.sre.hosts.reimage was started by jclark@cumin1003 for host aqs1027.eqiad.wmnet with OS bullseye

Wed, Dec 10, 8:23 PM · SRE, Data-Persistence, ops-eqiad, DC-Ops
ops-monitoring-bot added a comment to T407032: Q2:rack/setup/install aqs102[3-7].

Cookbook cookbooks.sre.hosts.reimage was started by jclark@cumin1003 for host aqs1026.eqiad.wmnet with OS bullseye

Wed, Dec 10, 8:23 PM · SRE, Data-Persistence, ops-eqiad, DC-Ops
ops-monitoring-bot added a comment to T407032: Q2:rack/setup/install aqs102[3-7].

Cookbook cookbooks.sre.hosts.reimage was started by jclark@cumin1003 for host aqs1024.eqiad.wmnet with OS bullseye

Wed, Dec 10, 8:23 PM · SRE, Data-Persistence, ops-eqiad, DC-Ops
ops-monitoring-bot added a comment to T407032: Q2:rack/setup/install aqs102[3-7].

Cookbook cookbooks.sre.hosts.reimage was started by jclark@cumin1003 for host aqs1023.eqiad.wmnet with OS bullseye

Wed, Dec 10, 8:19 PM · SRE, Data-Persistence, ops-eqiad, DC-Ops
ops-monitoring-bot added a comment to T411781: lvs1018: remove cross-rack links to rows A, C and D.

Cookbook cookbooks.sre.hosts.reimage started by brett@cumin2002 for host lvs1018.eqiad.wmnet with OS bullseye completed:

  • lvs1018 (PASS)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Add puppet_version metadata (7) to Debian installer
    • Checked BIOS boot parameters are back to normal
    • Host up (new fresh bullseye OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • Removed previous downtime on Alertmanager (old OS)
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202512101948_brett_2964783_lvs1018.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB
Wed, Dec 10, 8:05 PM · DC-Ops, ops-eqiad, Infrastructure-Foundations, netops, SRE
ops-monitoring-bot added a comment to T411781: lvs1018: remove cross-rack links to rows A, C and D.

Cookbook cookbooks.sre.hosts.reimage was started by brett@cumin2002 for host lvs1018.eqiad.wmnet with OS bullseye

Wed, Dec 10, 7:29 PM · DC-Ops, ops-eqiad, Infrastructure-Foundations, netops, SRE
ops-monitoring-bot added a comment to T406795: Q2:rack/setup/install logging-sd200[567].

Cookbook cookbooks.sre.hosts.reimage started by jhancock@cumin1003 for host logging-sd2005.codfw.wmnet with OS bookworm completed:

  • logging-sd2005 (WARN)
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced UEFI HTTP Boot for next reboot
    • Host rebooted via Redfish
    • Host up (Debian installer)
    • Add puppet_version metadata (7) to Debian installer
    • Host up (new fresh bookworm OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202512101748_jhancock_3318276_logging-sd2005.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB
    • Updated Netbox status planned -> active
    • Failed to run the sre.puppet.sync-netbox-hiera cookbook, run it manually
Wed, Dec 10, 6:27 PM · Observability-Logging, SRE, ops-codfw, DC-Ops
ops-monitoring-bot added a comment to T406795: Q2:rack/setup/install logging-sd200[567].

Cookbook cookbooks.sre.hosts.reimage started by jhancock@cumin1003 for host logging-sd2007.codfw.wmnet with OS bookworm completed:

  • logging-sd2007 (PASS)
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced UEFI HTTP Boot for next reboot
    • Host rebooted via Redfish
    • Host up (Debian installer)
    • Add puppet_version metadata (7) to Debian installer
    • Host up (new fresh bookworm OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202512101717_jhancock_3318763_logging-sd2007.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB
    • Updated Netbox status planned -> active
    • The sre.puppet.sync-netbox-hiera cookbook was run successfully
Wed, Dec 10, 6:26 PM · Observability-Logging, SRE, ops-codfw, DC-Ops
ops-monitoring-bot added a comment to T406795: Q2:rack/setup/install logging-sd200[567].

Cookbook cookbooks.sre.hosts.reimage started by jhancock@cumin1003 for host logging-sd2006.codfw.wmnet with OS bookworm completed:

  • logging-sd2006 (PASS)
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced UEFI HTTP Boot for next reboot
    • Host rebooted via Redfish
    • Host up (Debian installer)
    • Add puppet_version metadata (7) to Debian installer
    • Host up (new fresh bookworm OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202512101659_jhancock_3295799_logging-sd2006.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB
    • Updated Netbox status planned -> active
    • The sre.puppet.sync-netbox-hiera cookbook was run successfully
Wed, Dec 10, 5:18 PM · Observability-Logging, SRE, ops-codfw, DC-Ops
ops-monitoring-bot added a comment to T406795: Q2:rack/setup/install logging-sd200[567].

Cookbook cookbooks.sre.hosts.reimage was started by jhancock@cumin1003 for host logging-sd2007.codfw.wmnet with OS bookworm

Wed, Dec 10, 5:02 PM · Observability-Logging, SRE, ops-codfw, DC-Ops
ops-monitoring-bot added a comment to T406795: Q2:rack/setup/install logging-sd200[567].

Cookbook cookbooks.sre.hosts.reimage was started by jhancock@cumin1003 for host logging-sd2005.codfw.wmnet with OS bookworm

Wed, Dec 10, 5:02 PM · Observability-Logging, SRE, ops-codfw, DC-Ops
ops-monitoring-bot added a comment to T406795: Q2:rack/setup/install logging-sd200[567].

Cookbook cookbooks.sre.hosts.reimage was started by jhancock@cumin1003 for host logging-sd2006.codfw.wmnet with OS bookworm

Wed, Dec 10, 4:42 PM · Observability-Logging, SRE, ops-codfw, DC-Ops

Tue, Dec 9

ops-monitoring-bot added a comment to T404115: sretest2009 test in nokia rack.

Cookbook cookbooks.sre.hosts.reimage started by cmooney@cumin1003 for host sretest2009.codfw.wmnet with OS trixie completed:

  • sretest2009 (WARN)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced UEFI HTTP Boot for next reboot
    • Host rebooted via Redfish
    • Host up (Debian installer)
    • Add puppet_version metadata (7) to Debian installer
    • Host up (new fresh trixie OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • Removed previous downtime on Alertmanager (old OS)
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202512091954_cmooney_2929373_sretest2009.out
    • Unable to run puppet on config-master2001.codfw.wmnet,config-master1001.eqiad.wmnet to update configmaster.wikimedia.org with the new host SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB
Tue, Dec 9, 8:11 PM · SRE, DC-Ops, ops-codfw
ops-monitoring-bot added a comment to T404115: sretest2009 test in nokia rack.

Cookbook cookbooks.sre.hosts.reimage was started by cmooney@cumin1003 for host sretest2009.codfw.wmnet with OS trixie

Tue, Dec 9, 7:20 PM · SRE, DC-Ops, ops-codfw
ops-monitoring-bot added a comment to T409178: Nokia SR-Linux ARP resolution bug on v24.10.x+.

Icinga downtime and Alertmanager silence (ID=1aee9e7e-d36b-4c56-8cac-746f48098c6f) set by cmooney@cumin1003 for 2:00:00 on 2 host(s) and their services with reason: upgrading sr-linux on Nokia switches codfw

ssw1-e1-codfw.mgmt,ssw1-f1-codfw.mgmt
Tue, Dec 9, 6:48 PM · Infrastructure-Foundations, netops, SRE
ops-monitoring-bot added a comment to T409178: Nokia SR-Linux ARP resolution bug on v24.10.x+.

Icinga downtime and Alertmanager silence (ID=2a98251c-6798-469c-a3de-57fcfb13969f) set by cmooney@cumin1003 for 2:00:00 on 17 host(s) and their services with reason: upgrading sr-linux on Nokia switches codfw

lsw1-e[2,4-5]-codfw,lsw1-e[2,4-5]-codfw IPv6,lsw1-e[2,4-5]-codfw.mgmt,lsw1-f[2,4]-codfw,lsw1-f[2,4]-codfw IPv6,lsw1-f[2,4]-codfw.mgmt,ssw1-e1-codfw,ssw1-f1-codfw
Tue, Dec 9, 6:32 PM · Infrastructure-Foundations, netops, SRE
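
The target list above uses a compact bracket notation (for example `lsw1-e[2,4-5]-codfw`). A minimal sketch of expanding it and checking the result against the "17 host(s)" in the comment follows. This is an illustrative parser written for this page, not the tooling's own implementation: it assumes at most one bracket group per name, plain integer ranges, and no zero padding, and the function names are hypothetical.

```python
import re


def split_top_level(expr: str) -> list[str]:
    """Split on commas that are not inside square brackets."""
    parts, depth, current = [], 0, []
    for ch in expr:
        if ch == "[":
            depth += 1
        elif ch == "]":
            depth -= 1
        if ch == "," and depth == 0:
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
    parts.append("".join(current))
    return parts


def expand(pattern: str) -> list[str]:
    """Expand one bracket group, e.g. e[2,4-5] becomes e2, e4, e5."""
    match = re.search(r"\[([^\]]+)\]", pattern)
    if match is None:
        return [pattern]
    prefix, suffix = pattern[:match.start()], pattern[match.end():]
    names = []
    for item in match.group(1).split(","):
        low, _, high = item.partition("-")
        # A bare number such as "2" is treated as the range 2-2.
        for n in range(int(low), int(high or low) + 1):
            names.append(f"{prefix}{n}{suffix}")
    return names


targets = (
    "lsw1-e[2,4-5]-codfw,lsw1-e[2,4-5]-codfw IPv6,lsw1-e[2,4-5]-codfw.mgmt,"
    "lsw1-f[2,4]-codfw,lsw1-f[2,4]-codfw IPv6,lsw1-f[2,4]-codfw.mgmt,"
    "ssw1-e1-codfw,ssw1-f1-codfw"
)
hosts = [name for part in split_top_level(targets) for name in expand(part)]
print(len(hosts))  # 17
```

The count matches the comment because the plain, "IPv6", and ".mgmt" entries are counted as separate monitored hosts.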
ops-monitoring-bot added a comment to T411805: Clone and restore db1229.

Completed pool of db1229 gradually with 4 steps - Pooling in after cloning - fceratto@cumin1003

Tue, Dec 9, 2:28 PM · DBA
ops-monitoring-bot added a comment to T411805: Clone and restore db1229.

Start pool of db1229 gradually with 4 steps - Pooling in after cloning - fceratto@cumin1003

Tue, Dec 9, 1:43 PM · DBA

Fri, Dec 5

ops-monitoring-bot added a comment to T411805: Clone and restore db1229.

Finished cloning db1233.eqiad.wmnet to db1229.eqiad.wmnet - fceratto@cumin1003

Fri, Dec 5, 11:42 AM · DBA
ops-monitoring-bot added a comment to T411805: Clone and restore db1229.

Completed pool of db1233 gradually with 4 steps - Pool db1233.eqiad.wmnet in after cloning - fceratto@cumin1003

Fri, Dec 5, 11:26 AM · DBA
ops-monitoring-bot added a comment to T411805: Clone and restore db1229.

Start pool of db1233 gradually with 4 steps - Pool db1233.eqiad.wmnet in after cloning - fceratto@cumin1003

Fri, Dec 5, 10:41 AM · DBA
ops-monitoring-bot added a comment to T411805: Clone and restore db1229.

Completed depool of db1233 - Depool db1233.eqiad.wmnet to then clone it to db1229.eqiad.wmnet - fceratto@cumin1003

Fri, Dec 5, 9:16 AM · DBA
ops-monitoring-bot added a comment to T411805: Clone and restore db1229.

Started cloning db1233.eqiad.wmnet to db1229.eqiad.wmnet - fceratto@cumin1003

Fri, Dec 5, 9:16 AM · DBA

Thu, Dec 4

ops-monitoring-bot added a comment to T405628: lvs1019: move primary uplink from asw2-c7-eqiad to lsw1-c7-eqiad and remove link to asw2-d2-eqiad.

Cookbook cookbooks.sre.hosts.reimage started by brett@cumin2002 for host lvs1019.eqiad.wmnet with OS bullseye completed:

  • lvs1019 (PASS)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Add puppet_version metadata (7) to Debian installer
    • Checked BIOS boot parameters are back to normal
    • Host up (new fresh bullseye OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • Removed previous downtime on Alertmanager (old OS)
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202512041748_brett_2891956_lvs1019.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB
Thu, Dec 4, 6:05 PM · DC-Ops, Traffic, ops-eqiad, netops, Infrastructure-Foundations, SRE
ops-monitoring-bot added a comment to T405628: lvs1019: move primary uplink from asw2-c7-eqiad to lsw1-c7-eqiad and remove link to asw2-d2-eqiad.

Cookbook cookbooks.sre.hosts.reimage was started by brett@cumin2002 for host lvs1019.eqiad.wmnet with OS bullseye

Thu, Dec 4, 5:30 PM · DC-Ops, Traffic, ops-eqiad, netops, Infrastructure-Foundations, SRE
ops-monitoring-bot added a comment to T408643: OpenSearch on K8s: Deploy an OpenSearch cluster in dse-k8s-codfw.

Cookbook cookbooks.sre.k8s.pool-depool-node started by bking@cumin2002 pool for host dse-k8s-worker2003.codfw.wmnet completed:

  • dse-k8s-worker2003.codfw.wmnet (PASS)
    • Host dse-k8s-worker2003.codfw.wmnet pooled in dse-codfw
Thu, Dec 4, 3:45 PM · Data-Platform-SRE (2025.11.07 - 2025.11.28), OKR-Work
ops-monitoring-bot added a comment to T408643: OpenSearch on K8s: Deploy an OpenSearch cluster in dse-k8s-codfw.

Cookbook cookbooks.sre.k8s.pool-depool-node started by bking@cumin2002 depool for host dse-k8s-worker2003.codfw.wmnet completed:

  • dse-k8s-worker2003.codfw.wmnet (PASS)
    • Host dse-k8s-worker2003.codfw.wmnet depooled from dse-codfw
Thu, Dec 4, 3:35 PM · Data-Platform-SRE (2025.11.07 - 2025.11.28), OKR-Work

Wed, Dec 3

ops-monitoring-bot added a comment to T405609: lvs1020: move primary uplink from asw2-d7-eqiad to lsw1-d7-eqiad and remove link to asw2-c2-eqiad.

Cookbook cookbooks.sre.hosts.reimage started by brett@cumin2002 for host lvs1020.eqiad.wmnet with OS bullseye completed:

  • lvs1020 (WARN)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Add puppet_version metadata (7) to Debian installer
    • Checked BIOS boot parameters are back to normal
    • Host up (new fresh bullseye OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • Removed previous downtime on Alertmanager (old OS)
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202512031928_brett_2167269_lvs1020.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is not optimal, downtime not removed
    • Updated Netbox data from PuppetDB
Wed, Dec 3, 7:51 PM · Traffic, ops-eqiad, netops, Infrastructure-Foundations, SRE, DC-Ops
ops-monitoring-bot added a comment to T405609: lvs1020: move primary uplink from asw2-d7-eqiad to lsw1-d7-eqiad and remove link to asw2-c2-eqiad.

Cookbook cookbooks.sre.hosts.reimage was started by brett@cumin2002 for host lvs1020.eqiad.wmnet with OS bullseye

Wed, Dec 3, 7:06 PM · Traffic, ops-eqiad, netops, Infrastructure-Foundations, SRE, DC-Ops
ops-monitoring-bot added a comment to T411652: db1229 crashed - Broken memory module at B7.

Icinga downtime and Alertmanager silence (ID=75dbea13-fb1a-488c-9eb3-b67933f2ebaf) set by jynus@cumin1003 for 2 days, 0:00:00 on 1 host(s) and their services with reason: crashed

db1229.eqiad.wmnet
Wed, Dec 3, 5:11 PM · SRE, DC-Ops, ops-eqiad, DBA
ops-monitoring-bot added a comment to T411498: Reclone db1169 (s1).

Completed pool of db1169 gradually with 4 steps - Repooling db1169 - marostegui@cumin1003

Wed, Dec 3, 7:26 AM · DBA
ops-monitoring-bot added a comment to T411498: Reclone db1169 (s1).

Start pool of db1169 gradually with 4 steps - Repooling db1169 - marostegui@cumin1003

Wed, Dec 3, 6:40 AM · DBA
ops-monitoring-bot added a comment to T411498: Reclone db1169 (s1).

Completed depool of db1169 - Depooling db1169 - marostegui@cumin1003

Wed, Dec 3, 6:29 AM · DBA
ops-monitoring-bot added a comment to T411498: Reclone db1169 (s1).

Cookbook cookbooks.sre.hosts.reimage started by marostegui@cumin1003 for host db1169.eqiad.wmnet with OS trixie completed:

  • db1169 (WARN)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced UEFI HTTP Boot for next reboot
    • Host rebooted via Redfish
    • Host up (Debian installer)
    • Add puppet_version metadata (7) to Debian installer
    • Host up (new fresh trixie OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • Removed previous downtime on Alertmanager (old OS)
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202512030605_marostegui_501408_db1169.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is not optimal, downtime not removed
    • Updated Netbox data from PuppetDB
Wed, Dec 3, 6:26 AM · DBA
ops-monitoring-bot added a comment to T411497: Drop modtoken and flags from cache tables.

Depooled pc1011.eqiad.wmnet and pc2011.codfw.wmnet Schema change - marostegui@cumin1003 - T411497

Wed, Dec 3, 6:15 AM · Schema-change-in-production, DBA, Data-Engineering
ops-monitoring-bot added a comment to T411498: Reclone db1169 (s1).

Cookbook cookbooks.sre.hosts.reimage was started by marostegui@cumin1003 for host db1169.eqiad.wmnet with OS trixie

Wed, Dec 3, 5:41 AM · DBA

Tue, Dec 2

ops-monitoring-bot added a comment to T411498: Reclone db1169 (s1).

Finished cloning db1251.eqiad.wmnet to db1169.eqiad.wmnet - marostegui@cumin1003

Tue, Dec 2, 5:44 PM · DBA
ops-monitoring-bot added a comment to T411498: Reclone db1169 (s1).

Completed pool of db1251 gradually with 4 steps - Pool db1251.eqiad.wmnet in after cloning - marostegui@cumin1003

Tue, Dec 2, 4:36 PM · DBA
ops-monitoring-bot added a comment to T411498: Reclone db1169 (s1).

Start pool of db1251 gradually with 4 steps - Pool db1251.eqiad.wmnet in after cloning - marostegui@cumin1003

Tue, Dec 2, 3:50 PM · DBA