Move ganeti2033 and ganeti2034 to new codfw rows A/B switches
Closed, Resolved, Public

Description

ganeti2033 and ganeti2034 will be used for the next step of the routed Ganeti setup.

For that, it is preferable to have them connected to the new lsw switches.

As they are currently unused, the next step is to follow this procedure: https://wikitech.wikimedia.org/wiki/Server_Lifecycle#Move_existing_server_between_rows/racks,_changing_IPs

Event Timeline

cookbooks.sre.hosts.decommission executed by pt1979@cumin2002 for hosts: ganeti2033.codfw.wmnet

  • ganeti2033.codfw.wmnet (PASS)
    • Downtimed host on Icinga/Alertmanager
    • Found physical host
    • Downtimed management interface on Alertmanager
    • Wiped all swraid, partition-table and filesystem signatures
    • Powered off
    • [Netbox] Set status to Decommissioning, deleted all non-mgmt IPs, updated switch interfaces (disabled, removed vlans, etc)
    • Configured the linked switch interface(s)
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB

cookbooks.sre.hosts.decommission executed by pt1979@cumin2002 for hosts: ganeti2034.codfw.wmnet

  • ganeti2034.codfw.wmnet (PASS)
    • Downtimed host on Icinga/Alertmanager
    • Found physical host
    • Downtimed management interface on Alertmanager
    • Wiped all swraid, partition-table and filesystem signatures
    • Powered off
    • [Netbox] Set status to Decommissioning, deleted all non-mgmt IPs, updated switch interfaces (disabled, removed vlans, etc)
    • Configured the linked switch interface(s)
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB

ganeti2033 on xe-0/0/8 on lsw1-b7-codfw
ganeti2034 on xe-0/0/12 on lsw1-a4-codfw
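
For anyone who wants to cross-check the two port assignments above against another inventory, here is a minimal Python sketch that splits each line into host, interface and switch. It assumes the switch name follows an `lsw<N>-<row><rack>-<site>` pattern, which matches the two names here but is not confirmed anywhere in this task; `PortAssignment` and `parse_assignment` are hypothetical names used only for illustration.

```python
import re
from typing import NamedTuple


class PortAssignment(NamedTuple):
    host: str
    interface: str
    switch: str
    row: str
    rack: int
    site: str


# Assumed line format: "<host> on <interface> on lsw<N>-<row><rack>-<site>".
# The row/rack/site split is inferred from the two names in this task only.
LINE_RE = re.compile(
    r"^(?P<host>\S+) on (?P<interface>\S+) on "
    r"(?P<switch>lsw\d+-(?P<row>[a-z])(?P<rack>\d+)-(?P<site>[a-z0-9]+))$"
)


def parse_assignment(line: str) -> PortAssignment:
    """Parse one '<host> on <interface> on <switch>' line."""
    match = LINE_RE.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognised assignment line: {line!r}")
    return PortAssignment(
        host=match["host"],
        interface=match["interface"],
        switch=match["switch"],
        row=match["row"].upper(),
        rack=int(match["rack"]),
        site=match["site"],
    )


assignments = [
    parse_assignment("ganeti2033 on xe-0/0/8 on lsw1-b7-codfw"),
    parse_assignment("ganeti2034 on xe-0/0/12 on lsw1-a4-codfw"),
]
for a in assignments:
    print(f"{a.host}: row {a.row}, rack {a.rack}, port {a.interface}")
```

Under that naming assumption, ganeti2033 lands in row B and ganeti2034 in row A, consistent with the task title.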

Cookbook cookbooks.sre.hosts.reimage was started by ayounsi@cumin1002 for host ganeti2033.codfw.wmnet with OS bookworm

Cookbook cookbooks.sre.hosts.reimage was started by ayounsi@cumin1002 for host ganeti2033.codfw.wmnet with OS bookworm

Cookbook cookbooks.sre.hosts.reimage was started by ayounsi@cumin1002 for host ganeti2034.codfw.wmnet with OS bookworm

Cookbook cookbooks.sre.hosts.reimage started by ayounsi@cumin1002 for host ganeti2033.codfw.wmnet with OS bookworm completed:

  • ganeti2033 (PASS)
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Add puppet_version metadata to Debian installer
    • Checked BIOS boot parameters are back to normal
    • Host up (new fresh bookworm OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202401091337_ayounsi_873076_ganeti2033.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB
    • Updated Netbox status planned -> active
    • The sre.puppet.sync-netbox-hiera cookbook was run successfully
    • Cleared switch DHCP cache and MAC table for the host IP and MAC (EVPN Switch)

Cookbook cookbooks.sre.hosts.reimage was started by ayounsi@cumin1002 for host ganeti2034.codfw.wmnet with OS bookworm

Cookbook cookbooks.sre.hosts.reimage started by ayounsi@cumin1002 for host ganeti2034.codfw.wmnet with OS bookworm completed:

  • ganeti2034 (PASS)
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Add puppet_version metadata to Debian installer
    • Checked BIOS boot parameters are back to normal
    • Host up (new fresh bookworm OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202401091424_ayounsi_882549_ganeti2034.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB
    • Updated Netbox status planned -> active
    • The sre.puppet.sync-netbox-hiera cookbook was run successfully
    • Cleared switch DHCP cache and MAC table for the host IP and MAC (EVPN Switch)
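
The two reimage runs above each record a log file under /var/log/spicerack/sre/hosts/reimage/. As a rough aid for locating or sorting those logs, the sketch below splits such a file name into its parts. The field meanings are assumptions read off the two names in this task: a `YYYYMMDDHHMM` timestamp, the user, a numeric run identifier (possibly a PID, which this task does not confirm), and the host. `ReimageLog` and `parse_reimage_log` are hypothetical names.

```python
from datetime import datetime
from pathlib import PurePosixPath
from typing import NamedTuple


class ReimageLog(NamedTuple):
    started: datetime  # naive datetime; the timezone is not stated in the task
    user: str
    run_id: str        # assumed to be a per-run identifier, meaning unconfirmed
    host: str


def parse_reimage_log(path: str) -> ReimageLog:
    """Split '<timestamp>_<user>_<id>_<host>.out' into its four fields."""
    stem = PurePosixPath(path).stem
    # maxsplit=3 keeps any further underscores inside the host field;
    # a user name containing an underscore would break this simple split.
    stamp, user, run_id, host = stem.split("_", 3)
    return ReimageLog(datetime.strptime(stamp, "%Y%m%d%H%M"), user, run_id, host)


logs = [
    parse_reimage_log(
        "/var/log/spicerack/sre/hosts/reimage/202401091337_ayounsi_873076_ganeti2033.out"
    ),
    parse_reimage_log(
        "/var/log/spicerack/sre/hosts/reimage/202401091424_ayounsi_882549_ganeti2034.out"
    ),
]
for log in sorted(logs, key=lambda entry: entry.started):
    print(log.host, log.started.isoformat(), log.user)
```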