
Use cloudbackup100[12]-dev for cinder backup test/dev
Closed, Resolved, Public

Description

According to the parent task, we should not have anything running cinder-backups, but these Ganeti VMs are. It seems like they were just forgotten and need to be decom'd.
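
For readers unfamiliar with the bracket shorthand in the task title, here is a minimal sketch of how `cloudbackup100[12]-dev` expands into individual host names. It assumes the brackets hold a set of single-character alternatives (as in shell globs); `expand_hosts` is a hypothetical helper written for illustration, not part of any tooling mentioned in this task.

```python
import re
from itertools import product


def expand_hosts(pattern: str) -> list[str]:
    """Expand bracket sets of single characters, e.g. 'host100[12]-dev'."""
    # re.split with a capture group keeps the bracket contents at odd indexes:
    # 'a[12]b' -> ['a', '12', 'b']
    parts = re.split(r"\[([^\]]+)\]", pattern)
    # Literal parts offer one choice; bracket parts offer one choice per character.
    choices = [list(part) if i % 2 else [part] for i, part in enumerate(parts)]
    return ["".join(combo) for combo in product(*choices)]


print(expand_hosts("cloudbackup100[12]-dev"))
```

Under that assumption the title refers to two VMs, cloudbackup1001-dev and cloudbackup1002-dev, which matches the hosts reimaged later in the timeline.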

Event Timeline

I think the right thing here is to update these to replicate the behavior of cloudbackup200[12]. I'll have a look at that.

Andrew renamed this task from Maybe decom cloudbackup100[12]-dev to Use cloudbackup100[12]-dev for cinder backup test/dev. Apr 2 2024, 9:20 PM

Change #1016447 had a related patch set uploaded (by Andrew Bogott; author: Andrew Bogott):

[operations/puppet@production] Make cloudbackup200[12]-dev into codfw1dev cinder backup hosts

https://gerrit.wikimedia.org/r/1016447

Change #1016446 had a related patch set uploaded (by Andrew Bogott; author: Andrew Bogott):

[operations/puppet@production] cinder backups: move schedule config from a template into hiera

https://gerrit.wikimedia.org/r/1016446

Change #1016446 merged by Andrew Bogott:

[operations/puppet@production] cinder backups: move schedule config from a template into hiera

https://gerrit.wikimedia.org/r/1016446

Change #1016447 merged by Andrew Bogott:

[operations/puppet@production] Make cloudbackup200[12]-dev into codfw1dev cinder backup hosts

https://gerrit.wikimedia.org/r/1016447

Cookbook cookbooks.sre.hosts.reimage was started by andrew@cumin1002 for host cloudbackup1001-dev.eqiad.wmnet with OS bookworm

Cookbook cookbooks.sre.hosts.reimage started by andrew@cumin1002 for host cloudbackup1001-dev.eqiad.wmnet with OS bookworm completed:

  • cloudbackup1001-dev (PASS)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via gnt-instance
    • Host up (Debian installer)
    • Add puppet_version metadata to Debian installer
    • Set boot media to disk
    • Host up (new fresh bookworm OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • Removed previous downtime on Alertmanager (old OS)
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202404041654_andrew_690147_cloudbackup1001-dev.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB

Cookbook cookbooks.sre.hosts.reimage was started by andrew@cumin1002 for host cloudbackup1002-dev.eqiad.wmnet with OS bookworm

Cookbook cookbooks.sre.hosts.reimage started by andrew@cumin1002 for host cloudbackup1002-dev.eqiad.wmnet with OS bookworm completed:

  • cloudbackup1002-dev (PASS)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via gnt-instance
    • Host up (Debian installer)
    • Add puppet_version metadata to Debian installer
    • Set boot media to disk
    • Host up (new fresh bookworm OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • Removed previous downtime on Alertmanager (old OS)
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202404041804_andrew_701926_cloudbackup1002-dev.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB

Cookbook cookbooks.sre.hosts.reimage was started by andrew@cumin1002 for host cloudbackup1002-dev.eqiad.wmnet with OS bookworm

Cookbook cookbooks.sre.hosts.reimage started by andrew@cumin1002 for host cloudbackup1002-dev.eqiad.wmnet with OS bookworm completed:

  • cloudbackup1002-dev (PASS)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present and deleted any certificates
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via gnt-instance
    • Host up (Debian installer)
    • Add puppet_version metadata to Debian installer
    • Set boot media to disk
    • Host up (new fresh bookworm OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • Removed previous downtime on Alertmanager (old OS)
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202404051519_andrew_863461_cloudbackup1002-dev.out
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB

Change #1017319 had a related patch set uploaded (by Andrew Bogott; author: Andrew Bogott):

[operations/puppet@production] role:cinder_backups: include full env scripts in codfw1dev

https://gerrit.wikimedia.org/r/1017319

Change #1017319 merged by Andrew Bogott:

[operations/puppet@production] role:cinder_backups: include full env scripts in codfw1dev

https://gerrit.wikimedia.org/r/1017319

Change #1017332 had a related patch set uploaded (by Andrew Bogott; author: Andrew Bogott):

[operations/puppet@production] cloudbackup100[12]-dev: include ceph admin creds

https://gerrit.wikimedia.org/r/1017332

Change #1017332 merged by Andrew Bogott:

[operations/puppet@production] cloudbackup100[12]-dev: include ceph admin creds

https://gerrit.wikimedia.org/r/1017332

Change #1017342 had a related patch set uploaded (by Andrew Bogott; author: Andrew Bogott):

[operations/puppet@production] cloudbackup: remove some host-specific enable_v2_messenger overrides

https://gerrit.wikimedia.org/r/1017342

Change #1017342 merged by Andrew Bogott:

[operations/puppet@production] cloudbackup: remove some host-specific enable_v2_messenger overrides

https://gerrit.wikimedia.org/r/1017342

Change #1017347 had a related patch set uploaded (by Andrew Bogott; author: Andrew Bogott):

[operations/puppet@production] cinder_backups: add some real work to codfw1dev backups

https://gerrit.wikimedia.org/r/1017347

Change #1017347 merged by Andrew Bogott:

[operations/puppet@production] cinder_backups: add some real work to codfw1dev backups

https://gerrit.wikimedia.org/r/1017347

Change #1017351 had a related patch set uploaded (by Andrew Bogott; author: Andrew Bogott):

[operations/puppet@production] cinder_backups: apply codfw1dev env files for eqiad backup hosts

https://gerrit.wikimedia.org/r/1017351

Change #1017351 merged by Andrew Bogott:

[operations/puppet@production] cinder_backups: apply codfw1dev env files for eqiad backup hosts

https://gerrit.wikimedia.org/r/1017351

Change #1017354 had a related patch set uploaded (by Andrew Bogott; author: Andrew Bogott):

[operations/puppet@production] designate: move codfw1dev settings to 'common' for cross-site access

https://gerrit.wikimedia.org/r/1017354

Change #1017356 had a related patch set uploaded (by Andrew Bogott; author: Andrew Bogott):

[labs/private@master] openstack: move a bunch of codfw1dev passwords from 'codfw' to 'common'

https://gerrit.wikimedia.org/r/1017356

Change #1017356 merged by Andrew Bogott:

[labs/private@master] openstack: move a bunch of codfw1dev passwords from 'codfw' to 'common'

https://gerrit.wikimedia.org/r/1017356

Change #1017354 merged by Andrew Bogott:

[operations/puppet@production] designate: move codfw1dev settings to 'common' for cross-site access

https://gerrit.wikimedia.org/r/1017354