cloudcontrol1007: move to new network setup
Closed, Resolved, Public

Description

The cloudcontrol1007 server should move to a new network setup.

We should:

  • drop the wikimedia.org domain in favor of eqiad.wmnet
  • drop the connection to asw
  • add a private.eqiad.wikimedia.cloud address

This follows the procedure at https://wikitech.wikimedia.org/wiki/Server_Lifecycle#Rename_while_reimaging
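
As a rough illustration of the rename described above, the names involved can be derived from the current FQDN. This is only a sketch: the helper `plan_rename` is hypothetical and not part of any Wikimedia tooling, and the form of the cloud-private name (hostname prepended to private.eqiad.wikimedia.cloud) is an assumption based on the list above.

```python
# Hypothetical helper, for illustration only; not part of the cookbooks.
OLD_DOMAIN = "wikimedia.org"
NEW_DOMAIN = "eqiad.wmnet"
# Assumption: the cloud-private name is the hostname under this domain.
CLOUD_PRIVATE_DOMAIN = "private.eqiad.wikimedia.cloud"


def plan_rename(old_fqdn: str) -> dict:
    """Return the names a host would use before and after the move."""
    hostname, _, domain = old_fqdn.partition(".")
    if domain != OLD_DOMAIN:
        raise ValueError(f"{old_fqdn} is not in the {OLD_DOMAIN} domain")
    return {
        "hostname": hostname,
        "old_fqdn": old_fqdn,
        "new_fqdn": f"{hostname}.{NEW_DOMAIN}",
        "cloud_private_fqdn": f"{hostname}.{CLOUD_PRIVATE_DOMAIN}",
    }


if __name__ == "__main__":
    for key, value in plan_rename("cloudcontrol1007.wikimedia.org").items():
        print(f"{key}: {value}")
```

The actual rename is performed by the decommission and reimage cookbooks recorded in the timeline; the sketch only makes the before and after names explicit.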

Event Timeline

taavi updated the task description.

Change 959677 had a related patch set uploaded (by Majavah; author: Majavah):

[operations/puppet@production] Remove most references to cloudcontrol1007

https://gerrit.wikimedia.org/r/959677

Change 959677 merged by Majavah:

[operations/puppet@production] Remove most references to cloudcontrol1007

https://gerrit.wikimedia.org/r/959677

cookbooks.sre.hosts.decommission executed by taavi@cumin1001 for hosts: cloudcontrol1007.wikimedia.org

  • cloudcontrol1007.wikimedia.org (PASS)
    • Downtimed host on Icinga/Alertmanager
    • Found physical host
    • Downtimed management interface on Alertmanager
    • Wiped all swraid, partition-table and filesystem signatures
    • Powered off
    • [Netbox] Set status to Decommissioning, deleted all non-mgmt IPs, updated switch interfaces (disabled, removed vlans, etc)
    • Configured the linked switch interface(s)
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB
taavi subscribed.

hi @Jclark-ctr! This server has been powered off and can be moved at any time to E4. thanks!

@taavi Relocated to rack E4. Updated Netbox with the location. Switch port is #9.

Change 960642 had a related patch set uploaded (by Majavah; author: Majavah):

[operations/puppet@production] site: re-assign role for cloudcontrol1007

https://gerrit.wikimedia.org/r/960642

Change 960642 merged by Majavah:

[operations/puppet@production] site: re-assign role for cloudcontrol1007

https://gerrit.wikimedia.org/r/960642

Cookbook cookbooks.sre.hosts.reimage was started by taavi@cumin1001 for host cloudcontrol1007.eqiad.wmnet with OS bullseye

Cookbook cookbooks.sre.hosts.reimage started by taavi@cumin1001 for host cloudcontrol1007.eqiad.wmnet with OS bullseye completed:

  • cloudcontrol1007 (PASS)
    • Removed from Puppet and PuppetDB if present
    • Deleted any existing Puppet certificate
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Checked BIOS boot parameters are back to normal
    • Host up (new fresh bullseye OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202309260708_taavi_1741001_cloudcontrol1007.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB
    • Updated Netbox status planned -> active
    • The sre.puppet.sync-netbox-hiera cookbook was run successfully
    • Cleared switch DHCP cache and MAC table for the host IP and MAC (row E/F)