I have a cloudcontrol node properly puppetized now, so building a new Buster host should be straightforward.
Unfortunately, rabbitmq on Buster refuses to talk to rabbitmq on Stretch. In order to avoid split-brain during the rebuilds, I'm going to follow these steps:
- Build new cloudcontrol hardware with Buster (cloudcontrol1005/cloudcontrol2004-dev)
- sync glance images to the new Buster host
- sync fernet keys to the new Buster host
- designate the new Buster host the active glance server/primary image store
- adjust the openstack_controllers hiera list to reference ONLY the new Buster host
- update the service address CNAME (openstack.eqiad1.wikimediacloud.org) to point to the new Buster host
- on the nova_api database, run 'update cell_mappings set transport_url=' with the new Buster rabbit host
- wait a bit, ensure no traffic (either OpenStack API or Rabbit) is hitting the older hosts
- rebuild older hosts
- sync glance images back to rebuilt hosts
  - on existing host: sudo -u glancesync rsync -ra /srv/glance/images/* glancesync@cloudcontrol2003-dev:/srv/glance/images/
- sync fernet keys back to rebuilt hosts
  - on newly-built host: /usr/bin/rsync -a --delete rsync://cloudcontrol2004-dev.wikimedia.org/keystonefernetkeys/* /etc/keystone/fernet-keys/
- update openstack_controllers to reference all three hosts again
- restart ferm everywhere
- update the rabbitmq pool to include all three hosts
- update transport_url in the database to include all three hosts
Lots of monitoring things will likely freak out during this process, but the actual APIs should remain stable.
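
The two transport_url steps above (first only the new Buster host, later all three hosts) could be sketched roughly as below. This is a minimal sketch, not the exact commands used: the helper names build_transport_url and print_cell_update are hypothetical, and the rabbit user, password, port 5672, the cell name 'cell1' and the hostA/hostB/hostC names are placeholder assumptions. It only prints the SQL for review; it does not touch the database.

```shell
#!/bin/sh
# Hypothetical helper: build an oslo.messaging-style transport URL
# (rabbit://user:pass@host1:port,user:pass@host2:port/) from a host list.
# Usage: build_transport_url USER PASS PORT HOST [HOST...]
build_transport_url() {
    user=$1; pass=$2; port=$3
    shift 3
    url="rabbit://"
    sep=""
    for host in "$@"; do
        url="${url}${sep}${user}:${pass}@${host}:${port}"
        sep=","
    done
    # Trailing slash leaves the virtual host at its default.
    printf '%s/\n' "$url"
}

# Hypothetical helper: print (not execute) the cell_mappings update for review.
# Usage: print_cell_update CELL_NAME USER PASS PORT HOST [HOST...]
print_cell_update() {
    cell=$1
    shift
    printf "UPDATE cell_mappings SET transport_url='%s' WHERE name='%s';\n" \
        "$(build_transport_url "$@")" "$cell"
}

# During the migration: only the new Buster host (placeholder credentials).
print_cell_update cell1 nova secret 5672 cloudcontrol1005

# After the rebuilds: all three hosts again (placeholder host names).
print_cell_update cell1 nova secret 5672 hostA hostB hostC
```

The printed statement would then be run against the nova_api database by hand. A password containing characters such as '@' or '/' would need URL-encoding first, which this sketch does not attempt.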