Rename cloudcontrol200[789]-dev.codfw to cloudrabbit200[123]-dev.codfw
Closed, Resolved (Public)

Description

This hardware is currently idle; let's repurpose it as RabbitMQ servers.
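
For readers unfamiliar with the bracket shorthand in the title, here is a minimal Python sketch of how `cloudcontrol200[789]-dev` pairs up with `cloudrabbit200[123]-dev`. The helper names `expand_hosts` and `rename_plan` are hypothetical and exist only for illustration; the sketch handles a single bracket group and is not the tooling used on this task. The resulting pairs match the rename cookbook entries in the timeline.

```python
import re


def expand_hosts(pattern: str) -> list[str]:
    """Expand one bracket group, e.g. 'host200[789]-dev' or 'host200[1-3]-dev'."""
    match = re.search(r"\[([^\]]+)\]", pattern)
    if match is None:
        return [pattern]  # no bracket group: a single host name
    body = match.group(1)
    if re.fullmatch(r"\d-\d", body):
        # Range form such as [1-3]
        digits = [str(d) for d in range(int(body[0]), int(body[2]) + 1)]
    else:
        # Enumerated form such as [789]
        digits = list(body)
    prefix, suffix = pattern[: match.start()], pattern[match.end():]
    return [f"{prefix}{d}{suffix}" for d in digits]


def rename_plan(old_pattern: str, new_pattern: str) -> dict[str, str]:
    """Pair old and new host names in order (hypothetical helper)."""
    old, new = expand_hosts(old_pattern), expand_hosts(new_pattern)
    if len(old) != len(new):
        raise ValueError("old and new patterns expand to different host counts")
    return dict(zip(old, new))


plan = rename_plan("cloudcontrol200[789]-dev", "cloudrabbit200[123]-dev")
for old_name, new_name in plan.items():
    print(f"{old_name} -> {new_name}")
```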

Event Timeline

Change #1138484 had a related patch set uploaded (by Andrew Bogott; author: Andrew Bogott):

[operations/puppet@production] site.pp: add entries for new codfw1dev cloudrabbit servers

https://gerrit.wikimedia.org/r/1138484

Change #1138484 merged by Andrew Bogott:

[operations/puppet@production] site.pp: add entries for new codfw1dev cloudrabbit servers

https://gerrit.wikimedia.org/r/1138484

Change #1141564 had a related patch set uploaded (by Andrew Bogott; author: Andrew Bogott):

[operations/puppet@production] cloudrabbit200x-dev: fix fqdn

https://gerrit.wikimedia.org/r/1141564

Change #1141564 merged by Andrew Bogott:

[operations/puppet@production] cloudrabbit200x-dev: fix fqdn

https://gerrit.wikimedia.org/r/1141564

Change #1141566 had a related patch set uploaded (by Andrew Bogott; author: Andrew Bogott):

[operations/puppet@production] Add cloudrabbit200[1-3]-dev to preseed

https://gerrit.wikimedia.org/r/1141566

Change #1141566 merged by Andrew Bogott:

[operations/puppet@production] Add cloudrabbit200[1-3]-dev to preseed

https://gerrit.wikimedia.org/r/1141566

Cookbook cookbooks.sre.hosts.rename started by andrew@cumin1002 from cloudcontrol2007-dev to cloudrabbit2001-dev completed:

  • cloudcontrol2007-dev (PASS)
    • ✔️ Downtimed host on Icinga/Alertmanager
    • ✔️ Disabled puppet and its timer
    • ✔️ Disabled debmonitor-client timer
    • ✔️ Netbox updated
    • ✔️ BMC Hostname updated
    • ✔️ DNS updated
    • ✔️ Switch description updated
    • ✔️ Removed from DebMonitor
    • ✔️ Removed from Puppet master and PuppetDB
    • Rename completed 👍 - now please run the re-image cookbook on the new name with --new

Cookbook cookbooks.sre.hosts.reimage was started by andrew@cumin1002 for host cloudrabbit2001-dev.codfw.wmnet with OS bookworm

Cookbook cookbooks.sre.hosts.rename started by andrew@cumin1002 from cloudcontrol2008-dev to cloudrabbit2002-dev completed:

  • cloudcontrol2008-dev (PASS)
    • ✔️ Downtimed host on Icinga/Alertmanager
    • ✔️ Disabled puppet and its timer
    • ✔️ Disabled debmonitor-client timer
    • ✔️ Netbox updated
    • ✔️ BMC Hostname updated
    • ✔️ DNS updated
    • ✔️ Switch description updated
    • ✔️ Removed from DebMonitor
    • ✔️ Removed from Puppet master and PuppetDB
    • Rename completed 👍 - now please run the re-image cookbook on the new name with --new

Cookbook cookbooks.sre.hosts.rename started by andrew@cumin1002 from cloudcontrol2009-dev to cloudrabbit2003-dev completed:

  • cloudcontrol2009-dev (PASS)
    • ✔️ Downtimed host on Icinga/Alertmanager
    • ✔️ Disabled puppet and its timer
    • ✔️ Disabled debmonitor-client timer
    • ✔️ Netbox updated
    • ✔️ BMC Hostname updated
    • ✔️ DNS updated
    • ✔️ Switch description updated
    • ✔️ Removed from DebMonitor
    • ✔️ Removed from Puppet master and PuppetDB
    • Rename completed 👍 - now please run the re-image cookbook on the new name with --new

Cookbook cookbooks.sre.hosts.reimage was started by andrew@cumin1002 for host cloudrabbit2002-dev.codfw.wmnet with OS bookworm

Cookbook cookbooks.sre.hosts.reimage was started by andrew@cumin1002 for host cloudrabbit2003-dev.codfw.wmnet with OS bookworm

Change #1141568 had a related patch set uploaded (by Andrew Bogott; author: Andrew Bogott):

[operations/puppet@production] Remove refs to cloudcontrol200[789]

https://gerrit.wikimedia.org/r/1141568

Cookbook cookbooks.sre.hosts.reimage started by andrew@cumin1002 for host cloudrabbit2001-dev.codfw.wmnet with OS bookworm completed:

  • cloudrabbit2001-dev (PASS)
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Add puppet_version metadata (7) to Debian installer
    • Checked BIOS boot parameters are back to normal
    • Host up (new fresh bookworm OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202505050408_andrew_2148586_cloudrabbit2001-dev.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB

Cookbook cookbooks.sre.hosts.reimage started by andrew@cumin1002 for host cloudrabbit2003-dev.codfw.wmnet with OS bookworm completed:

  • cloudrabbit2003-dev (PASS)
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Add puppet_version metadata (7) to Debian installer
    • Checked BIOS boot parameters are back to normal
    • Host up (new fresh bookworm OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202505050415_andrew_2149289_cloudrabbit2003-dev.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB

Cookbook cookbooks.sre.hosts.reimage started by andrew@cumin1002 for host cloudrabbit2002-dev.codfw.wmnet with OS bookworm completed:

  • cloudrabbit2002-dev (PASS)
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Add puppet_version metadata (7) to Debian installer
    • Checked BIOS boot parameters are back to normal
    • Host up (new fresh bookworm OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202505050419_andrew_2149282_cloudrabbit2002-dev.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB
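
The reimage log paths above appear to follow a `<timestamp>_<user>_<number>_<host>.out` pattern. The Python sketch below splits one such path into its parts. It rests on assumptions taken only from the paths in this timeline: that the first field is a `YYYYMMDDHHMM` timestamp (timezone unknown) and that the third field is some numeric run identifier whose exact meaning is not stated here. The names `ReimageLog` and `parse_reimage_log` are hypothetical.

```python
from datetime import datetime
from pathlib import PurePosixPath
from typing import NamedTuple


class ReimageLog(NamedTuple):
    started: datetime  # assumed YYYYMMDDHHMM, timezone not stated in the log
    user: str
    run_id: str        # numeric field of unknown meaning; kept as a string
    host: str


def parse_reimage_log(path: str) -> ReimageLog:
    """Split a reimage log path into its underscore-separated fields."""
    stem = PurePosixPath(path).stem  # file name without the '.out' suffix
    stamp, user, run_id, host = stem.split("_", 3)
    return ReimageLog(datetime.strptime(stamp, "%Y%m%d%H%M"), user, run_id, host)


log = parse_reimage_log(
    "/var/log/spicerack/sre/hosts/reimage/"
    "202505050408_andrew_2148586_cloudrabbit2001-dev.out"
)
print(log.host, log.started.isoformat())
```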

Change #1141893 had a related patch set uploaded (by Andrew Bogott; author: Andrew Bogott):

[operations/puppet@production] Make cloudrabbit200[123] into rabbitmq nodes

https://gerrit.wikimedia.org/r/1141893

Change #1141893 merged by Andrew Bogott:

[operations/puppet@production] Make cloudrabbit200[123] into rabbitmq nodes

https://gerrit.wikimedia.org/r/1141893

Change #1141896 had a related patch set uploaded (by Andrew Bogott; author: Andrew Bogott):

[operations/puppet@production] Make cloudrabbit200[123] into rabbitmq nodes

https://gerrit.wikimedia.org/r/1141896

Change #1141568 merged by Andrew Bogott:

[operations/puppet@production] Remove refs to cloudcontrol200[789]

https://gerrit.wikimedia.org/r/1141568

Change #1141896 merged by Andrew Bogott:

[operations/puppet@production] Make cloudrabbit200[123] into rabbitmq nodes

https://gerrit.wikimedia.org/r/1141896

Cookbook cookbooks.sre.hosts.reimage was started by andrew@cumin1002 for host cloudrabbit2003-dev.codfw.wmnet with OS bookworm

Cookbook cookbooks.sre.hosts.reimage was started by andrew@cumin1002 for host cloudrabbit2002-dev.codfw.wmnet with OS bookworm

Cookbook cookbooks.sre.hosts.reimage was started by andrew@cumin1002 for host cloudrabbit2001-dev.codfw.wmnet with OS bookworm

Change #1142684 had a related patch set uploaded (by Andrew Bogott; author: Andrew Bogott):

[operations/dns@master] wikimediacloud.org: move codfw1dev rabbitmq cnames

https://gerrit.wikimedia.org/r/1142684

Change #1142687 had a related patch set uploaded (by Andrew Bogott; author: Andrew Bogott):

[operations/puppet@production] codfw1dev rabbit config: remove a comment that is no longer true

https://gerrit.wikimedia.org/r/1142687

Change #1142688 had a related patch set uploaded (by Andrew Bogott; author: Andrew Bogott):

[operations/puppet@production] codfw1dev: remove rabbitmq from cloudcontrol nodes

https://gerrit.wikimedia.org/r/1142688

Change #1142697 had a related patch set uploaded (by Andrew Bogott; author: Andrew Bogott):

[operations/puppet@production] rabbitmq: add hiera role config for codfw1dev

https://gerrit.wikimedia.org/r/1142697

Change #1142697 merged by Andrew Bogott:

[operations/puppet@production] rabbitmq: add hiera role config for codfw1dev

https://gerrit.wikimedia.org/r/1142697

Cookbook cookbooks.sre.hosts.reimage started by andrew@cumin1002 for host cloudrabbit2002-dev.codfw.wmnet with OS bookworm executed with errors:

  • cloudrabbit2002-dev (FAIL)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Add puppet_version metadata (7) to Debian installer
    • Checked BIOS boot parameters are back to normal
    • Host up (new fresh bookworm OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • Removed previous downtime on Alertmanager (old OS)
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202505062120_andrew_3308533_cloudrabbit2002-dev.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • The reimage failed, see the cookbook logs for the details. You can also try typing "sudo install-console cloudrabbit2002-dev.codfw.wmnet" to get a root shell, but depending on the failure this may not work.

Cookbook cookbooks.sre.hosts.reimage started by andrew@cumin1002 for host cloudrabbit2003-dev.codfw.wmnet with OS bookworm executed with errors:

  • cloudrabbit2003-dev (FAIL)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Add puppet_version metadata (7) to Debian installer
    • Checked BIOS boot parameters are back to normal
    • Host up (new fresh bookworm OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • Removed previous downtime on Alertmanager (old OS)
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202505062105_andrew_3306350_cloudrabbit2003-dev.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • The reimage failed, see the cookbook logs for the details. You can also try typing "sudo install-console cloudrabbit2003-dev.codfw.wmnet" to get a root shell, but depending on the failure this may not work.

Cookbook cookbooks.sre.hosts.reimage started by andrew@cumin1002 for host cloudrabbit2001-dev.codfw.wmnet with OS bookworm executed with errors:

  • cloudrabbit2001-dev (FAIL)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Add puppet_version metadata (7) to Debian installer
    • Checked BIOS boot parameters are back to normal
    • Host up (new fresh bookworm OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • Removed previous downtime on Alertmanager (old OS)
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202505062108_andrew_3309326_cloudrabbit2001-dev.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • The reimage failed, see the cookbook logs for the details. You can also try typing "sudo install-console cloudrabbit2001-dev.codfw.wmnet" to get a root shell, but depending on the failure this may not work.

Cookbook cookbooks.sre.hosts.reimage was started by andrew@cumin1002 for host cloudrabbit2001-dev.codfw.wmnet with OS bookworm

Cookbook cookbooks.sre.hosts.reimage was started by andrew@cumin1002 for host cloudrabbit2002-dev.codfw.wmnet with OS bookworm

Cookbook cookbooks.sre.hosts.reimage was started by andrew@cumin1002 for host cloudrabbit2003-dev.codfw.wmnet with OS bookworm

Cookbook cookbooks.sre.hosts.reimage started by andrew@cumin1002 for host cloudrabbit2003-dev.codfw.wmnet with OS bookworm completed:

  • cloudrabbit2003-dev (PASS)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Add puppet_version metadata (7) to Debian installer
    • Checked BIOS boot parameters are back to normal
    • Host up (new fresh bookworm OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • Removed previous downtime on Alertmanager (old OS)
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202505062202_andrew_3371659_cloudrabbit2003-dev.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB

Cookbook cookbooks.sre.hosts.reimage started by andrew@cumin1002 for host cloudrabbit2001-dev.codfw.wmnet with OS bookworm completed:

  • cloudrabbit2001-dev (PASS)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Add puppet_version metadata (7) to Debian installer
    • Checked BIOS boot parameters are back to normal
    • Host up (new fresh bookworm OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • Removed previous downtime on Alertmanager (old OS)
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202505062205_andrew_3371708_cloudrabbit2001-dev.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB

Cookbook cookbooks.sre.hosts.reimage started by andrew@cumin1002 for host cloudrabbit2002-dev.codfw.wmnet with OS bookworm completed:

  • cloudrabbit2002-dev (PASS)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Add puppet_version metadata (7) to Debian installer
    • Checked BIOS boot parameters are back to normal
    • Host up (new fresh bookworm OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • Removed previous downtime on Alertmanager (old OS)
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202505062210_andrew_3371677_cloudrabbit2002-dev.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB

Change #1142684 merged by Andrew Bogott:

[operations/dns@master] wikimediacloud.org: move codfw1dev rabbitmq cnames

https://gerrit.wikimedia.org/r/1142684

Change #1142731 had a related patch set uploaded (by Andrew Bogott; author: Andrew Bogott):

[operations/dns@master] wikimediacloud.org: move codfw1dev rabbitmq cnames

https://gerrit.wikimedia.org/r/1142731

Change #1142731 merged by Andrew Bogott:

[operations/dns@master] wikimediacloud.org: move codfw1dev rabbitmq cnames again

https://gerrit.wikimedia.org/r/1142731

Change #1142687 merged by Andrew Bogott:

[operations/puppet@production] codfw1dev rabbit config: remove a comment that is no longer true

https://gerrit.wikimedia.org/r/1142687

Change #1142688 merged by Andrew Bogott:

[operations/puppet@production] codfw1dev: remove rabbitmq from cloudcontrol nodes

https://gerrit.wikimedia.org/r/1142688

Change #1142740 had a related patch set uploaded (by Andrew Bogott; author: Andrew Bogott):

[operations/puppet@production] codfw1dev rabbitmq: remove contactgroups: wmcs-team-email from role hiera

https://gerrit.wikimedia.org/r/1142740

Change #1142740 merged by Andrew Bogott:

[operations/puppet@production] codfw1dev rabbitmq: remove contactgroups: wmcs-team-email from role hiera

https://gerrit.wikimedia.org/r/1142740

Andrew claimed this task.

Cookbook cookbooks.sre.hosts.reimage was started by andrew@cumin1002 for host cloudvirt1072.eqiad.wmnet with OS bookworm

Cookbook cookbooks.sre.hosts.reimage started by andrew@cumin1002 for host cloudvirt1072.eqiad.wmnet with OS bookworm executed with errors:

  • cloudvirt1072 (FAIL)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Add puppet_version metadata (7) to Debian installer
    • Checked BIOS boot parameters are back to normal
    • The reimage failed, see the cookbook logs for the details. You can also try typing "sudo install-console cloudvirt1072.eqiad.wmnet" to get a root shell, but depending on the failure this may not work.