rename cloudgw2001-dev into cloudlb2001-dev
Closed, Resolved (Public)


The cloudgw2001-dev server is going to be part of a PoC for the cloudlb project (see T324992: cloudlb: create PoC on codfw).

To avoid naming confusion, it would be good to rename the server to cloudlb2001-dev.


Event Timeline

aborrero triaged this task as Medium priority. (Jan 25 2023, 2:02 PM)
aborrero created this task.
aborrero added a subscriber: Papaul.

cookbooks.sre.hosts.decommission executed by aborrero@cumin2002 for hosts: cloudgw2001-dev.codfw.wmnet

  • cloudgw2001-dev.codfw.wmnet (WARN)
    • Downtimed host on Icinga/Alertmanager
    • Found physical host
    • Management interface not found on Icinga, unable to downtime it
    • Wiped all swraid, partition-table and filesystem signatures
    • Powered off
    • [Netbox] Set status to Decommissioning, deleted all non-mgmt IPs, updated switch interfaces (disabled, removed vlans, etc)
    • Configured the linked switch interface(s)
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB

Change 884027 had a related patch set uploaded (by Arturo Borrero Gonzalez; author: Arturo Borrero Gonzalez):

[operations/puppet@production] cloudgw2001-dev: rename server to cloudlb2001-dev

Change 884027 merged by Arturo Borrero Gonzalez:

[operations/puppet@production] cloudgw2001-dev: rename server to cloudlb2001-dev

I can't run the reimage script because the server lacks a primary IPv4 address:

aborrero@cumin2002:~$ sudo cookbook sre.hosts.reimage --os bullseye --new -t T327908 cloudlb2001-dev
==> ATTENTION: destructive action for host: cloudlb2001-dev
Are you sure to proceed?
Type "go" to proceed or "abort" to interrupt the execution
> go
Exception raised while initializing the Cookbook sre.hosts.reimage:
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/spicerack/", line 219, in run
    runner = self.instance.get_runner(args)
  File "/srv/deployment/spicerack/cookbooks/sre/hosts/", line 88, in get_runner
    return ReimageRunner(args, self.spicerack)
  File "/srv/deployment/spicerack/cookbooks/sre/hosts/", line 107, in __init__
    self.fqdn = self.netbox_server.fqdn
  File "/usr/lib/python3/dist-packages/spicerack/", line 349, in fqdn
    raise NetboxError(f"Server {} does not have any primary IP with a DNS name set.")
spicerack.netbox.NetboxError: Server cloudlb2001-dev does not have any primary IP with a DNS name set.
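
The error above says the server has no primary IP with a DNS name set. A minimal sketch of that kind of check follows, for illustration only; it is not the spicerack code. The simplified `device` dict layout, the `fqdn_from_device` helper, and the `MissingPrimaryIPError` class are all hypothetical names.

```python
class MissingPrimaryIPError(Exception):
    """Hypothetical error, standing in for the NetboxError in the traceback."""


def fqdn_from_device(device: dict) -> str:
    """Return the DNS name of the first primary IP that has one.

    `device` is an assumed, simplified record: a dict with a "name" key and
    optional "primary_ip4" / "primary_ip6" dicts, each with a "dns_name" key.
    """
    for key in ("primary_ip4", "primary_ip6"):
        ip = device.get(key)
        # Skip a missing primary IP, or one whose DNS name is empty.
        if ip and ip.get("dns_name"):
            return ip["dns_name"]
    raise MissingPrimaryIPError(
        f"Server {device['name']} does not have any primary IP with a DNS name set."
    )
```

With a record that has no primary IP at all, which is the state the renamed server appears to be in here, the helper raises instead of returning a name.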

Trying to fix that by using the script. To do that, I first deleted all interface information; otherwise the script would fail.

The ProvisionServerNetwork script changed the mgmt IP:

-cloudlb2001-dev                          1H IN A
+cloudlb2001-dev                          1H IN A

So I am changing that back by hand to keep the same IP address.
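
A changed management address like this can be spotted by comparing the A records before and after the script runs. The sketch below assumes zone lines of the form `<name> <ttl> IN A <address>`; the helper names are hypothetical, and the addresses in the usage note are placeholders from the 192.0.2.0/24 documentation range, not the real ones.

```python
def parse_a_records(lines):
    """Map record name -> address for lines shaped like '<name> <ttl> IN A <address>'."""
    records = {}
    for line in lines:
        fields = line.split()
        # Ignore anything that is not a five-field A record line.
        if len(fields) == 5 and fields[2] == "IN" and fields[3] == "A":
            records[fields[0]] = fields[4]
    return records


def changed_a_records(old_lines, new_lines):
    """Return {name: (old_address, new_address)} for names whose address differs."""
    old = parse_a_records(old_lines)
    new = parse_a_records(new_lines)
    return {
        name: (old[name], new[name])
        for name in old.keys() & new.keys()
        if old[name] != new[name]
    }
```

For example, `changed_a_records(["cloudlb2001-dev 1H IN A 192.0.2.10"], ["cloudlb2001-dev 1H IN A 192.0.2.20"])` reports the record as changed, which is the cue to restore the old address by hand.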

Cookbook cookbooks.sre.hosts.reimage was started by aborrero@cumin2002 for host cloudlb2001-dev.codfw.wmnet with OS bullseye

Cookbook cookbooks.sre.hosts.reimage started by aborrero@cumin2002 for host cloudlb2001-dev.codfw.wmnet with OS bullseye executed with errors:

  • cloudlb2001-dev (FAIL)
    • Removed from Puppet and PuppetDB if present
    • Deleted any existing Puppet certificate
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • The reimage failed, see the cookbook logs for the details

Cookbook cookbooks.sre.hosts.reimage was started by aborrero@cumin2002 for host cloudlb2001-dev.codfw.wmnet with OS bullseye

Cookbook cookbooks.sre.hosts.reimage started by aborrero@cumin2002 for host cloudlb2001-dev.codfw.wmnet with OS bullseye executed with errors:

  • cloudlb2001-dev (FAIL)
    • Removed from Puppet and PuppetDB if present
    • Deleted any existing Puppet certificate
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • The reimage failed, see the cookbook logs for the details

Cookbook cookbooks.sre.hosts.reimage was started by aborrero@cumin2002 for host cloudlb2001-dev.codfw.wmnet with OS bullseye

Cookbook cookbooks.sre.hosts.reimage started by aborrero@cumin2002 for host cloudlb2001-dev.codfw.wmnet with OS bullseye completed:

  • cloudlb2001-dev (PASS)
    • Removed from Puppet and PuppetDB if present
    • Deleted any existing Puppet certificate
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Host up (new fresh bullseye OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202301271125_aborrero_321797_cloudlb2001-dev.out
    • Checked BIOS boot parameters are back to normal
    • updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB
    • Updated Netbox status planned -> active