ops-monitoring-bot (Operations Monitoring Bot)
UserBot

User Details

User Since
Aug 12 2016, 1:45 PM (396 w, 3 d)
Roles
Bot
Availability
Available
LDAP User
Unknown
MediaWiki User
Unknown

Bot managed by SRE for automated interaction with Phabricator from monitoring tools.

Recent Activity

Yesterday

ops-monitoring-bot added a comment to T357748: Migrate CAS to Bookworm.

Cookbook cookbooks.sre.hosts.reimage started by slyngshede@cumin1002 for host idp-test1003.wikimedia.org with OS bookworm completed:

  • idp-test1003 (PASS)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via gnt-instance
    • Host up (Debian installer)
    • Add puppet_version metadata to Debian installer
    • Set boot media to disk
    • Host up (new fresh bookworm OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • Removed previous downtime on Alertmanager (old OS)
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202403181324_slyngshede_1731844_idp-test1003.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB
Mon, Mar 18, 1:40 PM · CAS-SSO, Infrastructure-Foundations
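
The reimage summaries in this feed share one layout: a host bullet carrying a PASS, WARN or FAIL marker, followed by one bullet per completed step. As a minimal sketch, assuming that layout holds for every comment, the following Python turns the comment text into structured records. The names `HostResult` and `parse_summary` are hypothetical and are not part of Spicerack or the cookbooks.

```python
import re
from dataclasses import dataclass, field

# Per-host header bullet, e.g. "• db1246 (FAIL)".
HOST_LINE = re.compile(r"^\s*•\s+(?P<host>\S+)\s+\((?P<status>PASS|WARN|FAIL)\)\s*$")
# Any other bullet is treated as one reported step.
STEP_LINE = re.compile(r"^\s*•\s+(?P<step>.+?)\s*$")


@dataclass
class HostResult:
    """Hypothetical container for one host's summary."""
    host: str
    status: str
    steps: list[str] = field(default_factory=list)


def parse_summary(comment: str) -> list[HostResult]:
    """Group step bullets under the host bullet that precedes them."""
    results: list[HostResult] = []
    for line in comment.splitlines():
        match = HOST_LINE.match(line)
        if match:
            results.append(HostResult(match["host"], match["status"]))
            continue
        match = STEP_LINE.match(line)
        if match and results:
            # Steps seen before any host bullet are ignored.
            results[-1].steps.append(match["step"])
    return results


if __name__ == "__main__":
    text = (
        "  • db1246 (FAIL)\n"
        "    • Downtimed on Icinga/Alertmanager\n"
        "    • Host up (Debian installer)\n"
    )
    for result in parse_summary(text):
        print(result.host, result.status, len(result.steps))  # db1246 FAIL 2
```

Filtering the parsed records for WARN or FAIL is one way to find the runs that need follow-up, such as hosts whose Icinga downtime was not removed.
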
ops-monitoring-bot added a comment to T357748: Migrate CAS to Bookworm.

Cookbook cookbooks.sre.hosts.reimage was started by slyngshede@cumin1002 for host idp-test1003.wikimedia.org with OS bookworm

Mon, Mar 18, 1:12 PM · CAS-SSO, Infrastructure-Foundations

Sun, Mar 17

ops-monitoring-bot added a comment to T354561: Decommission restbase10[19-27].

Icinga downtime and Alertmanager silence (ID=d3a5c1f5-1151-488d-9865-6d4a78317d3e) set by eevans@cumin1002 for 30 days, 0:00:00 on 1 host(s) and their services with reason: Decommissioning — T354561

restbase1023.eqiad.wmnet
Sun, Mar 17, 7:04 PM · Cassandra
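
The downtime durations in these comments ("30 days, 0:00:00", "1 day, 0:00:00") look like the string form of a Python `datetime.timedelta`. That is an inference from the text, not something the feed confirms. Under that assumption, a minimal sketch for converting the duration back into a `timedelta`, for example to work out when a silence expires, could look like this. The name `parse_duration` is hypothetical.

```python
import re
from datetime import timedelta

# Matches "30 days, 0:00:00", "1 day, 0:00:00" or a bare "4:30:00".
DURATION = re.compile(
    r"^(?:(?P<days>\d+) days?, )?(?P<hours>\d+):(?P<minutes>\d{2}):(?P<seconds>\d{2})$"
)


def parse_duration(text: str) -> timedelta:
    """Convert a duration string from a downtime comment into a timedelta."""
    match = DURATION.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized duration: {text!r}")
    return timedelta(
        days=int(match["days"] or 0),
        hours=int(match["hours"]),
        minutes=int(match["minutes"]),
        seconds=int(match["seconds"]),
    )


if __name__ == "__main__":
    print(parse_duration("30 days, 0:00:00").days)  # 30
```

Adding the result to the comment's timestamp gives an approximate expiry time for the silence.
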

Fri, Mar 15

ops-monitoring-bot added a comment to T354561: Decommission restbase10[19-27].

Icinga downtime and Alertmanager silence (ID=be085aec-234a-4a77-9925-9a904e432a89) set by eevans@cumin1002 for 30 days, 0:00:00 on 1 host(s) and their services with reason: Decommissioning — T354561

restbase1022.eqiad.wmnet
Fri, Mar 15, 9:45 AM · Cassandra

Thu, Mar 14

ops-monitoring-bot added a comment to T358642: Upgrade x1 to MariaDB 10.6.

Cookbook cookbooks.sre.hosts.reimage started by arnaudb@cumin1002 for host db2115.codfw.wmnet with OS bookworm completed:

  • db2115 (WARN)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Add puppet_version metadata to Debian installer
    • Checked BIOS boot parameters are back to normal
    • Host up (new fresh bookworm OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • Removed previous downtime on Alertmanager (old OS)
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202403140738_arnaudb_958365_db2115.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is not optimal, downtime not removed
    • Updated Netbox data from PuppetDB
Thu, Mar 14, 7:59 AM · DBA
ops-monitoring-bot added a comment to T358642: Upgrade x1 to MariaDB 10.6.

Cookbook cookbooks.sre.hosts.reimage was started by arnaudb@cumin1002 for host db2115.codfw.wmnet with OS bookworm

Thu, Mar 14, 7:20 AM · DBA

Wed, Mar 13

ops-monitoring-bot added a comment to T354561: Decommission restbase10[19-27].

Icinga downtime and Alertmanager silence (ID=900e92f1-66a1-4f63-94ae-fed3ae46a79c) set by eevans@cumin1002 for 30 days, 0:00:00 on 1 host(s) and their services with reason: Decommissioning — T354561

restbase1021.eqiad.wmnet
Wed, Mar 13, 4:34 PM · Cassandra
ops-monitoring-bot added a comment to T359940: hw troubleshooting: Unidentified for db1246.eqiad.wmnet.

Cookbook cookbooks.sre.hosts.reimage started by marostegui@cumin1002 for host db1246.eqiad.wmnet with OS bookworm completed:

  • db1246 (WARN)
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Add puppet_version metadata to Debian installer
    • Checked BIOS boot parameters are back to normal
    • Host up (new fresh bookworm OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202403131504_marostegui_837493_db1246.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is not optimal, downtime not removed
    • Updated Netbox data from PuppetDB
    • Updated Netbox status failed -> active
    • The sre.puppet.sync-netbox-hiera cookbook was run successfully
Wed, Mar 13, 3:25 PM · DBA, SRE, ops-eqiad, DC-Ops
ops-monitoring-bot added a comment to T359940: hw troubleshooting: Unidentified for db1246.eqiad.wmnet.

Cookbook cookbooks.sre.hosts.reimage was started by marostegui@cumin1002 for host db1246.eqiad.wmnet with OS bookworm

Wed, Mar 13, 2:49 PM · DBA, SRE, ops-eqiad, DC-Ops
ops-monitoring-bot added a comment to T359940: hw troubleshooting: Unidentified for db1246.eqiad.wmnet.

Cookbook cookbooks.sre.hosts.reimage started by marostegui@cumin1002 for host db1246.eqiad.wmnet with OS bookworm executed with errors:

  • db1246 (FAIL)
    • Downtimed on Icinga/Alertmanager
    • Unable to disable Puppet, the host may have been unreachable
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Add puppet_version metadata to Debian installer
    • Checked BIOS boot parameters are back to normal
    • The reimage failed, see the cookbook logs for the details. You can also try typing "install-console" db1246.eqiad.wmnet to get a root shell, but depending on the failure this may not work.
Wed, Mar 13, 2:49 PM · DBA, SRE, ops-eqiad, DC-Ops
ops-monitoring-bot added a comment to T359940: hw troubleshooting: Unidentified for db1246.eqiad.wmnet.

Cookbook cookbooks.sre.hosts.reimage was started by marostegui@cumin1002 for host db1246.eqiad.wmnet with OS bookworm

Wed, Mar 13, 2:33 PM · DBA, SRE, ops-eqiad, DC-Ops

Tue, Mar 12

ops-monitoring-bot added a comment to T358559: Switchover gitlab replica (gitlab1004 -> gitlab1003) - March 2024.

Cookbook cookbooks.sre.gitlab.failover (Failover of gitlab from gitlab1004.wikimedia.org to gitlab1003.wikimedia.org) encountered errors. Rollback started

Tue, Mar 12, 2:56 PM · Patch-For-Review, collaboration-services
ops-monitoring-bot added a comment to T358559: Switchover gitlab replica (gitlab1004 -> gitlab1003) - March 2024.

Cookbook cookbooks.sre.gitlab.failover (Failover of gitlab from gitlab1004.wikimedia.org to gitlab1003.wikimedia.org) started

Tue, Mar 12, 12:26 PM · Patch-For-Review, collaboration-services
ops-monitoring-bot added a comment to T354561: Decommission restbase10[19-27].

Icinga downtime and Alertmanager silence (ID=a60145fd-738c-4b55-9a7c-21eabfe44a70) set by eevans@cumin1002 for 30 days, 0:00:00 on 1 host(s) and their services with reason: Decommissioning — T354561

restbase1020.eqiad.wmnet
Tue, Mar 12, 7:06 AM · Cassandra

Mon, Mar 11

ops-monitoring-bot added a comment to T355422: Productionize db2196-db2220.

Icinga downtime and Alertmanager silence (ID=61bc3c58-7fa2-414a-af00-2965bed05d3a) set by arnaudb@cumin1002 for 1 day, 0:00:00 on 1 host(s) and their services with reason: provisionning db2211.codfw.wmnet - T355422

db2211.codfw.wmnet
Mon, Mar 11, 2:59 PM · Patch-For-Review, DBA
ops-monitoring-bot added a comment to T355422: Productionize db2196-db2220.

Icinga downtime and Alertmanager silence (ID=d8f412d8-0f40-4d49-ab70-43d9b5bc93f5) set by arnaudb@cumin1002 for 1 day, 0:00:00 on 1 host(s) and their services with reason: provisionning db2211.codfw.wmnet - T355422

db2111.codfw.wmnet
Mon, Mar 11, 2:59 PM · Patch-For-Review, DBA
ops-monitoring-bot added a comment to T355422: Productionize db2196-db2220.

Icinga downtime and Alertmanager silence (ID=20e9fe3c-a582-486d-bc04-3369d1bb853e) set by arnaudb@cumin1002 for 1 day, 0:00:00 on 1 host(s) and their services with reason: provisionning db2210.codfw.wmnet - T355422

db2210.codfw.wmnet
Mon, Mar 11, 2:55 PM · Patch-For-Review, DBA
ops-monitoring-bot added a comment to T355422: Productionize db2196-db2220.

Icinga downtime and Alertmanager silence (ID=70c6e0c8-6dfb-42ae-b60d-afaf69b1fa28) set by arnaudb@cumin1002 for 1 day, 0:00:00 on 1 host(s) and their services with reason: provisionning db2210.codfw.wmnet - T355422

db2110.codfw.wmnet
Mon, Mar 11, 2:55 PM · Patch-For-Review, DBA
ops-monitoring-bot added a comment to T355422: Productionize db2196-db2220.

Icinga downtime and Alertmanager silence (ID=a4962fb8-99b1-40f0-ba67-de7c273b2ad4) set by arnaudb@cumin1002 for 1 day, 0:00:00 on 1 host(s) and their services with reason: provisionning db2209.codfw.wmnet - T355422

db2209.codfw.wmnet
Mon, Mar 11, 2:49 PM · Patch-For-Review, DBA
ops-monitoring-bot added a comment to T355422: Productionize db2196-db2220.

Icinga downtime and Alertmanager silence (ID=b3c6512b-3b81-4036-ba80-6bf31f385e0d) set by arnaudb@cumin1002 for 1 day, 0:00:00 on 1 host(s) and their services with reason: provisionning db2209.codfw.wmnet - T355422

db2109.codfw.wmnet
Mon, Mar 11, 2:49 PM · Patch-For-Review, DBA
ops-monitoring-bot added a comment to T358642: Upgrade x1 to MariaDB 10.6.

Cookbook cookbooks.sre.hosts.reimage started by arnaudb@cumin1002 for host db1179.eqiad.wmnet with OS bookworm completed:

  • db1179 (WARN)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Add puppet_version metadata to Debian installer
    • Checked BIOS boot parameters are back to normal
    • Host up (new fresh bookworm OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • Removed previous downtime on Alertmanager (old OS)
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202403111356_arnaudb_483161_db1179.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is not optimal, downtime not removed
    • Updated Netbox data from PuppetDB
Mon, Mar 11, 2:17 PM · DBA
ops-monitoring-bot added a comment to T358642: Upgrade x1 to MariaDB 10.6.

Cookbook cookbooks.sre.hosts.reimage was started by arnaudb@cumin1002 for host db1179.eqiad.wmnet with OS bookworm

Mon, Mar 11, 1:42 PM · DBA

Sun, Mar 10

ops-monitoring-bot added a comment to T354561: Decommission restbase10[19-27].

Icinga downtime and Alertmanager silence (ID=9796a311-df1d-4f2e-bb25-b53d9c7867e8) set by eevans@cumin1002 for 30 days, 0:00:00 on 1 host(s) and their services with reason: Decommissioning — T354561

restbase1019.eqiad.wmnet
Sun, Mar 10, 2:19 PM · Cassandra

Sat, Mar 9

ops-monitoring-bot added a comment to T354560: Provision new RESTBase cluster nodes: restbase10[34-42].

Icinga downtime and Alertmanager silence (ID=3ddd9fec-9ff7-4c5f-a3c7-e776f525c299) set by eevans@cumin1002 for 30 days, 0:00:00 on 1 host(s) and their services with reason: Bootstrapping — T354560

restbase1042.eqiad.wmnet
Sat, Mar 9, 5:26 PM · Cassandra
ops-monitoring-bot created T359742: Degraded RAID on elastic2037.
Sat, Mar 9, 12:42 PM · SRE, ops-codfw
ops-monitoring-bot created T359702: Degraded RAID on dumpsdata1007.
Sat, Mar 9, 3:46 AM · Data-Engineering, SRE, ops-eqiad

Fri, Mar 8

ops-monitoring-bot added a comment to T354560: Provision new RESTBase cluster nodes: restbase10[34-42].

Icinga downtime and Alertmanager silence (ID=e000d69c-989d-4668-aa8b-6a2d62f0f12e) set by eevans@cumin1002 for 30 days, 0:00:00 on 1 host(s) and their services with reason: Bootstrapping — T354560

restbase1041.eqiad.wmnet
Fri, Mar 8, 2:12 PM · Cassandra
ops-monitoring-bot added a comment to T358761: Deploy OVS test setup in codfw1dev.

Cookbook cookbooks.sre.hosts.reimage started by taavi@cumin1002 for host cloudvirt2001-dev.codfw.wmnet with OS bookworm completed:

  • cloudvirt2001-dev (PASS)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Add puppet_version metadata to Debian installer
    • Checked BIOS boot parameters are back to normal
    • Host up (new fresh bookworm OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • Removed previous downtime on Alertmanager (old OS)
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202403080940_taavi_945517_cloudvirt2001-dev.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB
Fri, Mar 8, 10:08 AM · cloud-services-team (FY2023/2024-Q3-Q4), User-aborrero, Cloud-VPS
ops-monitoring-bot added a comment to T358761: Deploy OVS test setup in codfw1dev.

Cookbook cookbooks.sre.hosts.reimage was started by taavi@cumin1002 for host cloudvirt2001-dev.codfw.wmnet with OS bookworm

Fri, Mar 8, 9:17 AM · cloud-services-team (FY2023/2024-Q3-Q4), User-aborrero, Cloud-VPS

Thu, Mar 7

ops-monitoring-bot added a comment to T359597: db2124 depooled with index corruption.

Icinga downtime and Alertmanager silence (ID=17885a36-8547-4a13-afea-8d73c87e272d) set by rzl@cumin2002 for 5 days, 0:00:00 on 1 host(s) and their services with reason: index corruption

db2124.codfw.wmnet
Thu, Mar 7, 10:17 PM · DBA
ops-monitoring-bot added a comment to T355355: Q3:rack/setup/install dbprov200[56].

Cookbook cookbooks.sre.hosts.reimage was started by jhancock@cumin2002 for host dbprov2006.codfw.wmnet with OS bullseye

Thu, Mar 7, 4:38 PM · SRE, Data-Persistence, ops-codfw, DC-Ops
ops-monitoring-bot added a comment to T355355: Q3:rack/setup/install dbprov200[56].

Cookbook cookbooks.sre.hosts.reimage was started by jhancock@cumin2002 for host dbprov2005.codfw.wmnet with OS bullseye

Thu, Mar 7, 4:38 PM · SRE, Data-Persistence, ops-codfw, DC-Ops
ops-monitoring-bot added a comment to T357159: Site: 2 VM %request for etherpad.

cookbooks.sre.hosts.decommission executed by dzahn@cumin2002 for hosts: etherpad2001.codfw.wmnet

  • etherpad2001.codfw.wmnet (WARN)
    • Missing DNSName in Netbox for etherpad2001, unable to verify it.
    • Missing DNS record for etherpad2001.codfw.wmnet, the steps requiring DNS will fail.
    • Host not found on Icinga, unable to downtime it
    • Found Ganeti VM
    • VM shutdown
    • Started forced sync of VMs in Ganeti cluster codfw to Netbox
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB
    • VM removed
    • Started forced sync of VMs in Ganeti cluster codfw to Netbox
Thu, Mar 7, 4:04 PM · collaboration-services, vm-requests, Infrastructure-Foundations, SRE
ops-monitoring-bot added a comment to T354560: Provision new RESTBase cluster nodes: restbase10[34-42].

Icinga downtime and Alertmanager silence (ID=89efcc49-7e96-437c-b6ba-7c599f843789) set by eevans@cumin1002 for 30 days, 0:00:00 on 1 host(s) and their services with reason: Bootstrapping — T354560

restbase1040.eqiad.wmnet
Thu, Mar 7, 2:32 PM · Cassandra
ops-monitoring-bot added a comment to T358642: Upgrade x1 to MariaDB 10.6.

Cookbook cookbooks.sre.hosts.reimage started by arnaudb@cumin1002 for host db1220.eqiad.wmnet with OS bookworm completed:

  • db1220 (WARN)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Add puppet_version metadata to Debian installer
    • Checked BIOS boot parameters are back to normal
    • Host up (new fresh bookworm OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • Removed previous downtime on Alertmanager (old OS)
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202403071028_arnaudb_771962_db1220.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is not optimal, downtime not removed
    • Updated Netbox data from PuppetDB
Thu, Mar 7, 10:49 AM · DBA
ops-monitoring-bot added a comment to T358642: Upgrade x1 to MariaDB 10.6.

Cookbook cookbooks.sre.hosts.reimage was started by arnaudb@cumin1002 for host db1220.eqiad.wmnet with OS bookworm

Thu, Mar 7, 10:14 AM · DBA

Wed, Mar 6

ops-monitoring-bot added a comment to T333615: Upgrade alert* hosts to Bookworm.

Cookbook cookbooks.sre.hosts.reimage started by denisse@cumin2002 for host alert1001.wikimedia.org with OS bookworm completed:

  • alert1001 (WARN)
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Add puppet_version metadata to Debian installer
    • Checked BIOS boot parameters are back to normal
    • Host up (new fresh bookworm OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Unable to downtime the new host on Icinga/Alertmanager, the sre.hosts.downtime cookbook returned 99
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202403061652_denisse_187623_alert1001.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is not optimal, downtime not removed
    • Updated Netbox data from PuppetDB
Wed, Mar 6, 5:15 PM · SRE, SRE Observability (FY2023/2024-Q3)
ops-monitoring-bot added a comment to T333615: Upgrade alert* hosts to Bookworm.

Cookbook cookbooks.sre.hosts.reimage was started by denisse@cumin2002 for host alert1001.wikimedia.org with OS bookworm

Wed, Mar 6, 4:36 PM · SRE, SRE Observability (FY2023/2024-Q3)
ops-monitoring-bot added a comment to T351074: Move servers from the appserver/api cluster to kubernetes.

Cookbook cookbooks.sre.hosts.reimage started by cgoubert@cumin2002 for host mw1451.eqiad.wmnet with OS bullseye completed:

  • mw1451 (PASS)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Add puppet_version metadata to Debian installer
    • Checked BIOS boot parameters are back to normal
    • Host up (new fresh bullseye OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • Removed previous downtime on Alertmanager (old OS)
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202403061542_cgoubert_118654_mw1451.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB
Wed, Mar 6, 4:00 PM · Patch-For-Review, serviceops, MW-on-K8s
ops-monitoring-bot added a comment to T351074: Move servers from the appserver/api cluster to kubernetes.

Cookbook cookbooks.sre.hosts.reimage started by cgoubert@cumin2002 for host mw1455.eqiad.wmnet with OS bullseye completed:

  • mw1455 (PASS)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Add puppet_version metadata to Debian installer
    • Checked BIOS boot parameters are back to normal
    • Host up (new fresh bullseye OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • Removed previous downtime on Alertmanager (old OS)
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202403061539_cgoubert_118730_mw1455.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB
Wed, Mar 6, 3:57 PM · Patch-For-Review, serviceops, MW-on-K8s
ops-monitoring-bot added a comment to T351074: Move servers from the appserver/api cluster to kubernetes.

Cookbook cookbooks.sre.hosts.reimage started by cgoubert@cumin2002 for host mw1442.eqiad.wmnet with OS bullseye completed:

  • mw1442 (PASS)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Add puppet_version metadata to Debian installer
    • Checked BIOS boot parameters are back to normal
    • Host up (new fresh bullseye OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • Removed previous downtime on Alertmanager (old OS)
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202403061536_cgoubert_118640_mw1442.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB
Wed, Mar 6, 3:56 PM · Patch-For-Review, serviceops, MW-on-K8s
ops-monitoring-bot added a comment to T351074: Move servers from the appserver/api cluster to kubernetes.

Cookbook cookbooks.sre.hosts.reimage started by cgoubert@cumin2002 for host mw1452.eqiad.wmnet with OS bullseye completed:

  • mw1452 (PASS)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Add puppet_version metadata to Debian installer
    • Checked BIOS boot parameters are back to normal
    • Host up (new fresh bullseye OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • Removed previous downtime on Alertmanager (old OS)
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202403061534_cgoubert_118679_mw1452.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB
Wed, Mar 6, 3:51 PM · Patch-For-Review, serviceops, MW-on-K8s
ops-monitoring-bot added a comment to T351074: Move servers from the appserver/api cluster to kubernetes.

Cookbook cookbooks.sre.hosts.reimage started by cgoubert@cumin2002 for host mw1454.eqiad.wmnet with OS bullseye completed:

  • mw1454 (PASS)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Add puppet_version metadata to Debian installer
    • Checked BIOS boot parameters are back to normal
    • Host up (new fresh bullseye OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • Removed previous downtime on Alertmanager (old OS)
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202403061531_cgoubert_118699_mw1454.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB
Wed, Mar 6, 3:50 PM · Patch-For-Review, serviceops, MW-on-K8s
ops-monitoring-bot added a comment to T351074: Move servers from the appserver/api cluster to kubernetes.

Cookbook cookbooks.sre.hosts.reimage started by cgoubert@cumin2002 for host mw1441.eqiad.wmnet with OS bullseye completed:

  • mw1441 (PASS)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Add puppet_version metadata to Debian installer
    • Checked BIOS boot parameters are back to normal
    • Host up (new fresh bullseye OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • Removed previous downtime on Alertmanager (old OS)
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202403061529_cgoubert_118620_mw1441.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB
Wed, Mar 6, 3:48 PM · Patch-For-Review, serviceops, MW-on-K8s
ops-monitoring-bot added a comment to T354560: Provision new RESTBase cluster nodes: restbase10[34-42].

Icinga downtime and Alertmanager silence (ID=a0cdf9c7-04e3-464c-8811-cbde4504c509) set by eevans@cumin1002 for 30 days, 0:00:00 on 1 host(s) and their services with reason: Bootstrapping — T354560

restbase1039.eqiad.wmnet
Wed, Mar 6, 3:44 PM · Cassandra
ops-monitoring-bot added a comment to T355422: Productionize db2196-db2220.

Icinga downtime and Alertmanager silence (ID=9a44a351-0c9d-4104-bdfc-0dd8973df120) set by arnaudb@cumin1002 for 1 day, 0:00:00 on 1 host(s) and their services with reason: provisionning db2131.codfw.wmnet - T355422

db2131.codfw.wmnet
Wed, Mar 6, 3:24 PM · Patch-For-Review, DBA
ops-monitoring-bot added a comment to T355422: Productionize db2196-db2220.

Icinga downtime and Alertmanager silence (ID=82ddfaab-b71b-46f0-9d9c-ea141a03f956) set by arnaudb@cumin1002 for 1 day, 0:00:00 on 1 host(s) and their services with reason: provisionning db2131.codfw.wmnet - T355422

db2196.codfw.wmnet
Wed, Mar 6, 3:24 PM · Patch-For-Review, DBA
ops-monitoring-bot added a comment to T358642: Upgrade x1 to MariaDB 10.6.

Cookbook cookbooks.sre.hosts.reimage started by arnaudb@cumin1002 for host db2131.codfw.wmnet with OS bookworm completed:

  • db2131 (WARN)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Add puppet_version metadata to Debian installer
    • Checked BIOS boot parameters are back to normal
    • Host up (new fresh bookworm OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • Removed previous downtime on Alertmanager (old OS)
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202403061454_arnaudb_625899_db2131.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is not optimal, downtime not removed
    • Updated Netbox data from PuppetDB
Wed, Mar 6, 3:17 PM · DBA
ops-monitoring-bot added a comment to T351074: Move servers from the appserver/api cluster to kubernetes.

Cookbook cookbooks.sre.hosts.reimage was started by cgoubert@cumin2002 for host mw1455.eqiad.wmnet with OS bullseye

Wed, Mar 6, 3:13 PM · Patch-For-Review, serviceops, MW-on-K8s
ops-monitoring-bot added a comment to T351074: Move servers from the appserver/api cluster to kubernetes.

Cookbook cookbooks.sre.hosts.reimage was started by cgoubert@cumin2002 for host mw1454.eqiad.wmnet with OS bullseye

Wed, Mar 6, 3:13 PM · Patch-For-Review, serviceops, MW-on-K8s
ops-monitoring-bot added a comment to T351074: Move servers from the appserver/api cluster to kubernetes.

Cookbook cookbooks.sre.hosts.reimage was started by cgoubert@cumin2002 for host mw1452.eqiad.wmnet with OS bullseye

Wed, Mar 6, 3:13 PM · Patch-For-Review, serviceops, MW-on-K8s
ops-monitoring-bot added a comment to T351074: Move servers from the appserver/api cluster to kubernetes.

Cookbook cookbooks.sre.hosts.reimage was started by cgoubert@cumin2002 for host mw1451.eqiad.wmnet with OS bullseye

Wed, Mar 6, 3:13 PM · Patch-For-Review, serviceops, MW-on-K8s
ops-monitoring-bot added a comment to T351074: Move servers from the appserver/api cluster to kubernetes.

Cookbook cookbooks.sre.hosts.reimage was started by cgoubert@cumin2002 for host mw1442.eqiad.wmnet with OS bullseye

Wed, Mar 6, 3:13 PM · Patch-For-Review, serviceops, MW-on-K8s
ops-monitoring-bot added a comment to T351074: Move servers from the appserver/api cluster to kubernetes.

Cookbook cookbooks.sre.hosts.reimage was started by cgoubert@cumin2002 for host mw1441.eqiad.wmnet with OS bullseye

Wed, Mar 6, 3:13 PM · Patch-For-Review, serviceops, MW-on-K8s
ops-monitoring-bot added a comment to T358642: Upgrade x1 to MariaDB 10.6.

Cookbook cookbooks.sre.hosts.reimage was started by arnaudb@cumin1002 for host db2131.codfw.wmnet with OS bookworm

Wed, Mar 6, 2:35 PM · DBA
ops-monitoring-bot added a comment to T358752: Reimage parse* hosts as kubernetes nodes.

Cookbook cookbooks.sre.hosts.reimage started by akosiaris@cumin1002 for host parse2004.codfw.wmnet with OS bullseye completed:

  • parse2004 (PASS)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Add puppet_version metadata to Debian installer
    • Checked BIOS boot parameters are back to normal
    • Host up (new fresh bullseye OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • Removed previous downtime on Alertmanager (old OS)
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202403061311_akosiaris_600477_parse2004.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB
    • Cleared switch DHCP cache and MAC table for the host IP and MAC (EVPN Switch)
Wed, Mar 6, 1:30 PM · Content-Transform-Team, Release-Engineering-Team (Seen), SRE, Traffic, serviceops, MW-on-K8s
ops-monitoring-bot added a comment to T358752: Reimage parse* hosts as kubernetes nodes.

Cookbook cookbooks.sre.hosts.reimage started by akosiaris@cumin1002 for host parse2007.codfw.wmnet with OS bullseye completed:

  • parse2007 (PASS)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Add puppet_version metadata to Debian installer
    • Checked BIOS boot parameters are back to normal
    • Host up (new fresh bullseye OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • Removed previous downtime on Alertmanager (old OS)
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202403061308_akosiaris_600711_parse2007.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB
    • Cleared switch DHCP cache and MAC table for the host IP and MAC (EVPN Switch)
Wed, Mar 6, 1:27 PM · Content-Transform-Team, Release-Engineering-Team (Seen), SRE, Traffic, serviceops, MW-on-K8s
ops-monitoring-bot added a comment to T358752: Reimage parse* hosts as kubernetes nodes.

Cookbook cookbooks.sre.hosts.reimage started by akosiaris@cumin1002 for host parse2006.codfw.wmnet with OS bullseye completed:

  • parse2006 (PASS)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Add puppet_version metadata to Debian installer
    • Checked BIOS boot parameters are back to normal
    • Host up (new fresh bullseye OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • Removed previous downtime on Alertmanager (old OS)
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202403061306_akosiaris_600632_parse2006.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB
    • Cleared switch DHCP cache and MAC table for the host IP and MAC (EVPN Switch)
Wed, Mar 6, 1:24 PM · Content-Transform-Team, Release-Engineering-Team (Seen), SRE, Traffic, serviceops, MW-on-K8s
ops-monitoring-bot added a comment to T358752: Reimage parse* hosts as kubernetes nodes.

Cookbook cookbooks.sre.hosts.reimage started by akosiaris@cumin1002 for host parse2003.codfw.wmnet with OS bullseye completed:

  • parse2003 (PASS)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Add puppet_version metadata to Debian installer
    • Checked BIOS boot parameters are back to normal
    • Host up (new fresh bullseye OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • Removed previous downtime on Alertmanager (old OS)
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202403061303_akosiaris_600397_parse2003.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB
    • Cleared switch DHCP cache and MAC table for the host IP and MAC (EVPN Switch)
Wed, Mar 6, 1:24 PM · Content-Transform-Team, Release-Engineering-Team (Seen), SRE, Traffic, serviceops, MW-on-K8s
ops-monitoring-bot added a comment to T358752: Reimage parse* hosts as kubernetes nodes.

Cookbook cookbooks.sre.hosts.reimage started by akosiaris@cumin1002 for host parse2005.codfw.wmnet with OS bullseye completed:

  • parse2005 (PASS)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Add puppet_version metadata to Debian installer
    • Checked BIOS boot parameters are back to normal
    • Host up (new fresh bullseye OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • Removed previous downtime on Alertmanager (old OS)
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202403061301_akosiaris_600564_parse2005.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB
    • Cleared switch DHCP cache and MAC table for the host IP and MAC (EVPN Switch)
Wed, Mar 6, 1:20 PM · Content-Transform-Team, Release-Engineering-Team (Seen), SRE, Traffic, serviceops, MW-on-K8s
ops-monitoring-bot added a comment to T358752: Reimage parse* hosts as kubernetes nodes.

Cookbook cookbooks.sre.hosts.reimage started by akosiaris@cumin1002 for host parse2002.codfw.wmnet with OS bullseye completed:

  • parse2002 (PASS)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Add puppet_version metadata to Debian installer
    • Checked BIOS boot parameters are back to normal
    • Host up (new fresh bullseye OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • Removed previous downtime on Alertmanager (old OS)
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202403061259_akosiaris_600232_parse2002.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB
    • Cleared switch DHCP cache and MAC table for the host IP and MAC (EVPN Switch)
Wed, Mar 6, 1:17 PM · Content-Transform-Team, Release-Engineering-Team (Seen), SRE, Traffic, serviceops, MW-on-K8s
ops-monitoring-bot added a comment to T351074: Move servers from the appserver/api cluster to kubernetes.

Cookbook cookbooks.sre.hosts.reimage started by cgoubert@cumin2002 for host mw2374.codfw.wmnet with OS bullseye completed:

  • mw2374 (PASS)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Add puppet_version metadata to Debian installer
    • Checked BIOS boot parameters are back to normal
    • Host up (new fresh bullseye OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • Removed previous downtime on Alertmanager (old OS)
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202403061233_cgoubert_4170510_mw2374.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB
Wed, Mar 6, 12:52 PM · Patch-For-Review, serviceops, MW-on-K8s
ops-monitoring-bot added a comment to T351074: Move servers from the appserver/api cluster to kubernetes.

Cookbook cookbooks.sre.hosts.reimage started by cgoubert@cumin2002 for host mw2373.codfw.wmnet with OS bullseye completed:

  • mw2373 (PASS)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Add puppet_version metadata to Debian installer
    • Checked BIOS boot parameters are back to normal
    • Host up (new fresh bullseye OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • Removed previous downtime on Alertmanager (old OS)
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202403061230_cgoubert_4170437_mw2373.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB
Wed, Mar 6, 12:49 PM · Patch-For-Review, serviceops, MW-on-K8s
ops-monitoring-bot added a comment to T351074: Move servers from the appserver/api cluster to kubernetes.

Cookbook cookbooks.sre.hosts.reimage started by cgoubert@cumin2002 for host mw2376.codfw.wmnet with OS bullseye completed:

  • mw2376 (PASS)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Add puppet_version metadata to Debian installer
    • Checked BIOS boot parameters are back to normal
    • Host up (new fresh bullseye OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • Removed previous downtime on Alertmanager (old OS)
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202403061227_cgoubert_4170644_mw2376.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB
Wed, Mar 6, 12:46 PM · Patch-For-Review, serviceops, MW-on-K8s
ops-monitoring-bot added a comment to T358752: Reimage parse* hosts as kubernetes nodes.

Cookbook cookbooks.sre.hosts.reimage was started by akosiaris@cumin1002 for host parse2007.codfw.wmnet with OS bullseye

Wed, Mar 6, 12:43 PM · Content-Transform-Team, Release-Engineering-Team (Seen), SRE, Traffic, serviceops, MW-on-K8s
ops-monitoring-bot added a comment to T351074: Move servers from the appserver/api cluster to kubernetes.

Cookbook cookbooks.sre.hosts.reimage started by cgoubert@cumin2002 for host mw2371.codfw.wmnet with OS bullseye completed:

  • mw2371 (WARN)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Add puppet_version metadata to Debian installer
    • Checked BIOS boot parameters are back to normal
    • Host up (new fresh bullseye OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • Removed previous downtime on Alertmanager (old OS)
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202403061224_cgoubert_4170137_mw2371.out
    • Unable to run puppet on config-master2001.codfw.wmnet,config-master1001.eqiad.wmnet to update configmaster.wikimedia.org with the new host SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB
Wed, Mar 6, 12:43 PM · Patch-For-Review, serviceops, MW-on-K8s
ops-monitoring-bot added a comment to T358752: Reimage parse* hosts as kubernetes nodes.

Cookbook cookbooks.sre.hosts.reimage was started by akosiaris@cumin1002 for host parse2006.codfw.wmnet with OS bullseye

Wed, Mar 6, 12:43 PM · Content-Transform-Team, Release-Engineering-Team (Seen), SRE, Traffic, serviceops, MW-on-K8s
ops-monitoring-bot added a comment to T358752: Reimage parse* hosts as kubernetes nodes.

Cookbook cookbooks.sre.hosts.reimage was started by akosiaris@cumin1002 for host parse2005.codfw.wmnet with OS bullseye

Wed, Mar 6, 12:42 PM · Content-Transform-Team, Release-Engineering-Team (Seen), SRE, Traffic, serviceops, MW-on-K8s
ops-monitoring-bot added a comment to T358752: Reimage parse* hosts as kubernetes nodes.

Cookbook cookbooks.sre.hosts.reimage was started by akosiaris@cumin1002 for host parse2004.codfw.wmnet with OS bullseye

Wed, Mar 6, 12:42 PM · Content-Transform-Team, Release-Engineering-Team (Seen), SRE, Traffic, serviceops, MW-on-K8s
ops-monitoring-bot added a comment to T358752: Reimage parse* hosts as kubernetes nodes.

Cookbook cookbooks.sre.hosts.reimage was started by akosiaris@cumin1002 for host parse2003.codfw.wmnet with OS bullseye

Wed, Mar 6, 12:41 PM · Content-Transform-Team, Release-Engineering-Team (Seen), SRE, Traffic, serviceops, MW-on-K8s
ops-monitoring-bot added a comment to T351074: Move servers from the appserver/api cluster to kubernetes.

Cookbook cookbooks.sre.hosts.reimage started by cgoubert@cumin2002 for host mw2372.codfw.wmnet with OS bullseye completed:

  • mw2372 (PASS)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Add puppet_version metadata to Debian installer
    • Checked BIOS boot parameters are back to normal
    • Host up (new fresh bullseye OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • Removed previous downtime on Alertmanager (old OS)
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202403061222_cgoubert_4170325_mw2372.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB
Wed, Mar 6, 12:41 PM · Patch-For-Review, serviceops, MW-on-K8s
ops-monitoring-bot added a comment to T358752: Reimage parse* hosts as kubernetes nodes.

Cookbook cookbooks.sre.hosts.reimage was started by akosiaris@cumin1002 for host parse2002.codfw.wmnet with OS bullseye

Wed, Mar 6, 12:40 PM · Content-Transform-Team, Release-Engineering-Team (Seen), SRE, Traffic, serviceops, MW-on-K8s
ops-monitoring-bot added a comment to T351074: Move servers from the appserver/api cluster to kubernetes.

Cookbook cookbooks.sre.hosts.reimage started by cgoubert@cumin2002 for host mw2375.codfw.wmnet with OS bullseye completed:

  • mw2375 (PASS)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Add puppet_version metadata to Debian installer
    • Checked BIOS boot parameters are back to normal
    • Host up (new fresh bullseye OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • Removed previous downtime on Alertmanager (old OS)
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202403061220_cgoubert_4170590_mw2375.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB
Wed, Mar 6, 12:39 PM · Patch-For-Review, serviceops, MW-on-K8s
ops-monitoring-bot added a comment to T351074: Move servers from the appserver/api cluster to kubernetes.

Cookbook cookbooks.sre.hosts.reimage was started by cgoubert@cumin2002 for host mw2376.codfw.wmnet with OS bullseye

Wed, Mar 6, 12:03 PM · Patch-For-Review, serviceops, MW-on-K8s
ops-monitoring-bot added a comment to T351074: Move servers from the appserver/api cluster to kubernetes.

Cookbook cookbooks.sre.hosts.reimage was started by cgoubert@cumin2002 for host mw2375.codfw.wmnet with OS bullseye

Wed, Mar 6, 12:03 PM · Patch-For-Review, serviceops, MW-on-K8s
ops-monitoring-bot added a comment to T351074: Move servers from the appserver/api cluster to kubernetes.

Cookbook cookbooks.sre.hosts.reimage was started by cgoubert@cumin2002 for host mw2374.codfw.wmnet with OS bullseye

Wed, Mar 6, 12:02 PM · Patch-For-Review, serviceops, MW-on-K8s
ops-monitoring-bot added a comment to T351074: Move servers from the appserver/api cluster to kubernetes.

Cookbook cookbooks.sre.hosts.reimage was started by cgoubert@cumin2002 for host mw2373.codfw.wmnet with OS bullseye

Wed, Mar 6, 12:02 PM · Patch-For-Review, serviceops, MW-on-K8s
ops-monitoring-bot added a comment to T351074: Move servers from the appserver/api cluster to kubernetes.

Cookbook cookbooks.sre.hosts.reimage was started by cgoubert@cumin2002 for host mw2372.codfw.wmnet with OS bullseye

Wed, Mar 6, 12:02 PM · Patch-For-Review, serviceops, MW-on-K8s
ops-monitoring-bot added a comment to T351074: Move servers from the appserver/api cluster to kubernetes.

Cookbook cookbooks.sre.hosts.reimage was started by cgoubert@cumin2002 for host mw2371.codfw.wmnet with OS bullseye

Wed, Mar 6, 12:02 PM · Patch-For-Review, serviceops, MW-on-K8s
ops-monitoring-bot added a comment to T358752: Reimage parse* hosts as kubernetes nodes.

Cookbook cookbooks.sre.hosts.reimage started by akosiaris@cumin1002 for host parse2011.codfw.wmnet with OS bullseye completed:

  • parse2011 (PASS)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Add puppet_version metadata to Debian installer
    • Checked BIOS boot parameters are back to normal
    • Host up (new fresh bullseye OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • Removed previous downtime on Alertmanager (old OS)
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202403060920_akosiaris_554491_parse2011.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB
Wed, Mar 6, 9:39 AM · Content-Transform-Team, Release-Engineering-Team (Seen), SRE, Traffic, serviceops, MW-on-K8s
ops-monitoring-bot added a comment to T358752: Reimage parse* hosts as kubernetes nodes.

Cookbook cookbooks.sre.hosts.reimage started by akosiaris@cumin1002 for host parse2010.codfw.wmnet with OS bullseye completed:

  • parse2010 (PASS)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Add puppet_version metadata to Debian installer
    • Checked BIOS boot parameters are back to normal
    • Host up (new fresh bullseye OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • Removed previous downtime on Alertmanager (old OS)
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202403060916_akosiaris_554388_parse2010.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB
    • Cleared switch DHCP cache and MAC table for the host IP and MAC (EVPN Switch)
Wed, Mar 6, 9:35 AM · Content-Transform-Team, Release-Engineering-Team (Seen), SRE, Traffic, serviceops, MW-on-K8s
ops-monitoring-bot added a comment to T358752: Reimage parse* hosts as kubernetes nodes.

Cookbook cookbooks.sre.hosts.reimage started by akosiaris@cumin1002 for host parse2014.codfw.wmnet with OS bullseye completed:

  • parse2014 (PASS)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Add puppet_version metadata to Debian installer
    • Checked BIOS boot parameters are back to normal
    • Host up (new fresh bullseye OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • Removed previous downtime on Alertmanager (old OS)
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202403060913_akosiaris_554787_parse2014.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB
Wed, Mar 6, 9:32 AM · Content-Transform-Team, Release-Engineering-Team (Seen), SRE, Traffic, serviceops, MW-on-K8s
ops-monitoring-bot added a comment to T358752: Reimage parse* hosts as kubernetes nodes.

Cookbook cookbooks.sre.hosts.reimage started by akosiaris@cumin1002 for host parse2012.codfw.wmnet with OS bullseye completed:

  • parse2012 (PASS)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Add puppet_version metadata to Debian installer
    • Checked BIOS boot parameters are back to normal
    • Host up (new fresh bullseye OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • Removed previous downtime on Alertmanager (old OS)
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202403060911_akosiaris_554559_parse2012.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB
Wed, Mar 6, 9:30 AM · Content-Transform-Team, Release-Engineering-Team (Seen), SRE, Traffic, serviceops, MW-on-K8s
ops-monitoring-bot added a comment to T358752: Reimage parse* hosts as kubernetes nodes.

Cookbook cookbooks.sre.hosts.reimage started by akosiaris@cumin1002 for host parse2015.codfw.wmnet with OS bullseye completed:

  • parse2015 (PASS)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Add puppet_version metadata to Debian installer
    • Checked BIOS boot parameters are back to normal
    • Host up (new fresh bullseye OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • Removed previous downtime on Alertmanager (old OS)
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202403060908_akosiaris_554908_parse2015.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB
Wed, Mar 6, 9:27 AM · Content-Transform-Team, Release-Engineering-Team (Seen), SRE, Traffic, serviceops, MW-on-K8s
ops-monitoring-bot added a comment to T358752: Reimage parse* hosts as kubernetes nodes.

Cookbook cookbooks.sre.hosts.reimage started by akosiaris@cumin1002 for host parse2013.codfw.wmnet with OS bullseye completed:

  • parse2013 (PASS)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Add puppet_version metadata to Debian installer
    • Checked BIOS boot parameters are back to normal
    • Host up (new fresh bullseye OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • Removed previous downtime on Alertmanager (old OS)
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202403060906_akosiaris_554662_parse2013.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB
Wed, Mar 6, 9:25 AM · Content-Transform-Team, Release-Engineering-Team (Seen), SRE, Traffic, serviceops, MW-on-K8s
ops-monitoring-bot added a comment to T358752: Reimage parse* hosts as kubernetes nodes.

Cookbook cookbooks.sre.hosts.reimage started by akosiaris@cumin1002 for host parse2008.codfw.wmnet with OS bullseye completed:

  • parse2008 (PASS)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Add puppet_version metadata to Debian installer
    • Checked BIOS boot parameters are back to normal
    • Host up (new fresh bullseye OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • Removed previous downtime on Alertmanager (old OS)
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202403060903_akosiaris_554266_parse2008.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB
    • Cleared switch DHCP cache and MAC table for the host IP and MAC (EVPN Switch)
Wed, Mar 6, 9:23 AM · Content-Transform-Team, Release-Engineering-Team (Seen), SRE, Traffic, serviceops, MW-on-K8s
ops-monitoring-bot added a comment to T358752: Reimage parse* hosts as kubernetes nodes.

Cookbook cookbooks.sre.hosts.reimage started by akosiaris@cumin1002 for host parse2009.codfw.wmnet with OS bullseye completed:

  • parse2009 (PASS)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Add puppet_version metadata to Debian installer
    • Checked BIOS boot parameters are back to normal
    • Host up (new fresh bullseye OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • Removed previous downtime on Alertmanager (old OS)
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202403060901_akosiaris_554308_parse2009.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB
    • Cleared switch DHCP cache and MAC table for the host IP and MAC (EVPN Switch)
Wed, Mar 6, 9:21 AM · Content-Transform-Team, Release-Engineering-Team (Seen), SRE, Traffic, serviceops, MW-on-K8s
ops-monitoring-bot added a comment to T355422: Productionize db2196-db2220.

Icinga downtime and Alertmanager silence (ID=93d8653e-b391-4cf0-8df1-ab3070719ceb) set by arnaudb@cumin1002 for 1 day, 0:00:00 on 1 host(s) and their services with reason: provisioning db2208.codfw.wmnet - T355422

db2208.codfw.wmnet
Wed, Mar 6, 9:04 AM · Patch-For-Review, DBA
ops-monitoring-bot added a comment to T355422: Productionize db2196-db2220.

Icinga downtime and Alertmanager silence (ID=d0cfe651-f2a4-4c58-8fa9-ed1ebe378f39) set by arnaudb@cumin1002 for 1 day, 0:00:00 on 1 host(s) and their services with reason: provisioning db2208.codfw.wmnet - T355422

db2108.codfw.wmnet
Wed, Mar 6, 9:04 AM · Patch-For-Review, DBA
ops-monitoring-bot added a comment to T355422: Productionize db2196-db2220.

Icinga downtime and Alertmanager silence (ID=ac5e8712-4a47-46cb-a2d2-acf644e54e9b) set by arnaudb@cumin1002 for 1 day, 0:00:00 on 1 host(s) and their services with reason: provisioning db2206.codfw.wmnet - T355422

db2206.codfw.wmnet
Wed, Mar 6, 8:58 AM · Patch-For-Review, DBA
ops-monitoring-bot added a comment to T355422: Productionize db2196-db2220.

Icinga downtime and Alertmanager silence (ID=86c64dcb-907e-4772-8b0e-a13ac85551b4) set by arnaudb@cumin1002 for 1 day, 0:00:00 on 1 host(s) and their services with reason: provisioning db2206.codfw.wmnet - T355422

db2106.codfw.wmnet
Wed, Mar 6, 8:58 AM · Patch-For-Review, DBA
ops-monitoring-bot added a comment to T355422: Productionize db2196-db2220.

Icinga downtime and Alertmanager silence (ID=c6f29093-cf3c-4716-b09a-5e1ce38612a0) set by arnaudb@cumin1002 for 1 day, 0:00:00 on 1 host(s) and their services with reason: provisioning db2205.codfw.wmnet - T355422

db2205.codfw.wmnet
Wed, Mar 6, 8:52 AM · Patch-For-Review, DBA
ops-monitoring-bot added a comment to T355422: Productionize db2196-db2220.

Icinga downtime and Alertmanager silence (ID=286eab79-0f53-4363-be23-f09f9cbce3b1) set by arnaudb@cumin1002 for 1 day, 0:00:00 on 1 host(s) and their services with reason: provisioning db2205.codfw.wmnet - T355422

db2105.codfw.wmnet
Wed, Mar 6, 8:52 AM · Patch-For-Review, DBA
ops-monitoring-bot added a comment to T358752: Reimage parse* hosts as kubernetes nodes.

Cookbook cookbooks.sre.hosts.reimage was started by akosiaris@cumin1002 for host parse2015.codfw.wmnet with OS bullseye

Wed, Mar 6, 8:47 AM · Content-Transform-Team, Release-Engineering-Team (Seen), SRE, Traffic, serviceops, MW-on-K8s
ops-monitoring-bot added a comment to T358752: Reimage parse* hosts as kubernetes nodes.

Cookbook cookbooks.sre.hosts.reimage was started by akosiaris@cumin1002 for host parse2014.codfw.wmnet with OS bullseye

Wed, Mar 6, 8:46 AM · Content-Transform-Team, Release-Engineering-Team (Seen), SRE, Traffic, serviceops, MW-on-K8s
ops-monitoring-bot added a comment to T358752: Reimage parse* hosts as kubernetes nodes.

Cookbook cookbooks.sre.hosts.reimage was started by akosiaris@cumin1002 for host parse2013.codfw.wmnet with OS bullseye

Wed, Mar 6, 8:46 AM · Content-Transform-Team, Release-Engineering-Team (Seen), SRE, Traffic, serviceops, MW-on-K8s
ops-monitoring-bot added a comment to T358752: Reimage parse* hosts as kubernetes nodes.

Cookbook cookbooks.sre.hosts.reimage was started by akosiaris@cumin1002 for host parse2012.codfw.wmnet with OS bullseye

Wed, Mar 6, 8:45 AM · Content-Transform-Team, Release-Engineering-Team (Seen), SRE, Traffic, serviceops, MW-on-K8s
ops-monitoring-bot added a comment to T358752: Reimage parse* hosts as kubernetes nodes.

Cookbook cookbooks.sre.hosts.reimage was started by akosiaris@cumin1002 for host parse2011.codfw.wmnet with OS bullseye

Wed, Mar 6, 8:44 AM · Content-Transform-Team, Release-Engineering-Team (Seen), SRE, Traffic, serviceops, MW-on-K8s
ops-monitoring-bot added a comment to T358752: Reimage parse* hosts as kubernetes nodes.

Cookbook cookbooks.sre.hosts.reimage was started by akosiaris@cumin1002 for host parse2010.codfw.wmnet with OS bullseye

Wed, Mar 6, 8:43 AM · Content-Transform-Team, Release-Engineering-Team (Seen), SRE, Traffic, serviceops, MW-on-K8s
ops-monitoring-bot added a comment to T358752: Reimage parse* hosts as kubernetes nodes.

Cookbook cookbooks.sre.hosts.reimage was started by akosiaris@cumin1002 for host parse2009.codfw.wmnet with OS bullseye

Wed, Mar 6, 8:43 AM · Content-Transform-Team, Release-Engineering-Team (Seen), SRE, Traffic, serviceops, MW-on-K8s