Rebuild cloudcontrol hosts with Debian buster
Closed, Resolved (Public)

Description

I have a cloudcontrol node properly puppetized now, so building a new Buster host should be straightforward.

Unfortunately, rabbitmq on Buster refuses to talk to rabbitmq on Stretch. In order to avoid split-brain during the rebuilds, I'm going to follow these steps:

  • Build new cloudcontrol hardware with Buster (cloudcontrol1005/cloudcontrol2004-dev)
  • sync glance images to the new Buster host
  • sync fernet keys to the new Buster host
  • designate the new Buster host the active glance server/primary image store
  • adjust the openstack_controllers hiera list to reference ONLY the new Buster host
  • update service address CNAME (openstack.eqiad1.wikimediacloud.org) to point to new Buster host
  • on nova_api database, 'update cell_mappings set transport_url=' for the new buster rabbit host
  • wait a bit, ensure no traffic (either OpenStack API or Rabbit) is hitting the older hosts
  • rebuild older hosts
  • sync glance images back to rebuilt hosts
    1. on existing host: sudo -u glancesync rsync -ra /srv/glance/images/* glancesync@cloudcontrol2003-dev:/srv/glance/images/
  • sync fernet keys back to rebuilt hosts
    1. on newly-built host: /usr/bin/rsync -a --delete rsync://cloudcontrol2004-dev.wikimedia.org/keystonefernetkeys/* /etc/keystone/fernet-keys/
  • update openstack_controllers to reference all three hosts again
  • restart ferm everywhere
  • update the rabbitmq pool to include all three hosts
  • update transport_url in the database to include all three hosts
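
For the two transport_url steps above, here is a minimal sketch of how the value could be assembled, first for the single Buster host and later for all three. It assumes the oslo.messaging URL form (rabbit://user:pass@host:port,.../vhost). The user name, password, port, and the 'cell1' cell name are placeholders and not taken from this task. The WHERE clause is also an addition: the step above only gives 'update cell_mappings set transport_url='.

```python
from urllib.parse import quote


def build_transport_url(hosts, user, password, port=5672, vhost=""):
    """Build an oslo.messaging-style RabbitMQ transport URL for one or more hosts."""
    if not hosts:
        raise ValueError("at least one rabbit host is required")
    # Percent-encode credentials so characters such as '@' or ':' do not break the URL.
    creds = f"{quote(user, safe='')}:{quote(password, safe='')}"
    netlocs = ",".join(f"{creds}@{host}:{port}" for host in hosts)
    return f"rabbit://{netlocs}/{vhost}"


def cell_mapping_update(transport_url, cell_name):
    """Return a parameterized UPDATE for nova_api.cell_mappings and its parameters.

    Restricting the update to one cell by name is an assumption of this sketch.
    """
    sql = "UPDATE cell_mappings SET transport_url = %s WHERE name = %s"
    return sql, (transport_url, cell_name)


# Hypothetical credentials; host names follow the ones mentioned in the task.
single = build_transport_url(["cloudcontrol2004-dev.wikimedia.org"], "nova", "secret")
all_three = build_transport_url(
    [
        "cloudcontrol2001-dev.wikimedia.org",  # assumed FQDN
        "cloudcontrol2003-dev.wikimedia.org",  # assumed FQDN
        "cloudcontrol2004-dev.wikimedia.org",
    ],
    "nova",
    "secret",
)
sql, params = cell_mapping_update(single, "cell1")
```

The statement and parameters are returned rather than executed, so they can be reviewed before being run against the nova_api database with whatever client is in use.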

Lots of monitoring things will likely freak out during this process but actual APIs should remain stable.

Event Timeline

Change 594960 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] Make cloudcontrol2004-dev the one openstack and glance controller

https://gerrit.wikimedia.org/r/594960

Change 594960 merged by Andrew Bogott:
[operations/puppet@production] Make cloudcontrol2004-dev the one openstack and glance controller

https://gerrit.wikimedia.org/r/594960

Andrew updated the task description.

Change 595196 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] Make cloudcontrol1005 a cloudcontrol node

https://gerrit.wikimedia.org/r/595196

Change 595196 merged by Andrew Bogott:
[operations/puppet@production] Make cloudcontrol1005 a cloudcontrol node

https://gerrit.wikimedia.org/r/595196

Change 595197 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] Add initial hiera hosts defs for cloudcontrol1005

https://gerrit.wikimedia.org/r/595197

Change 595197 merged by Andrew Bogott:
[operations/puppet@production] Add initial hiera hosts defs for cloudcontrol1005

https://gerrit.wikimedia.org/r/595197

Mentioned in SAL (#wikimedia-operations) [2020-05-08T18:12:14Z] <andrewbogott> reprepro copy buster-wikimedia stretch-wikimedia prometheus-openstack-exporter for T252121

Change 595207 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] mysql misc: add access for cloudcontrol1005 to m5-master

https://gerrit.wikimedia.org/r/595207

Change 595210 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] misc dbs: temporarily add a hacked openstack_controllers setting

https://gerrit.wikimedia.org/r/595210

Change 595210 merged by Andrew Bogott:
[operations/puppet@production] misc dbs: temporarily add a hacked openstack_controllers setting

https://gerrit.wikimedia.org/r/595210

Change 595221 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] Keystone: update fernet key rotation plan for 3 hosts

https://gerrit.wikimedia.org/r/595221

Change 595221 merged by Andrew Bogott:
[operations/puppet@production] Keystone: update fernet key rotation plan for 3 hosts

https://gerrit.wikimedia.org/r/595221

Change 595227 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] OpenStack: move all openstack API support to cloudcontrol1005

https://gerrit.wikimedia.org/r/595227

Change 595229 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/dns@master] Openstack: move API traffic to cloudcontrol1005

https://gerrit.wikimedia.org/r/595229

Change 595281 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] Change cloudcontrol2001 and 2003 to buster

https://gerrit.wikimedia.org/r/595281

Change 595282 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] Move cloudweb api service to cloudcontrol2004

https://gerrit.wikimedia.org/r/595282

Change 595281 merged by Andrew Bogott:
[operations/puppet@production] Change cloudcontrol2001 and 2003 to buster

https://gerrit.wikimedia.org/r/595281

Change 595282 merged by Andrew Bogott:
[operations/puppet@production] Move cloudweb api service to cloudcontrol2004

https://gerrit.wikimedia.org/r/595282

Mentioned in SAL (#wikimedia-cloud) [2020-05-09T16:53:30Z] <andrewbogott> rebuilding cloudcontrol2001-dev and 2003-dev with buster for T252121

Andrew updated the task description.

Change 595583 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] Add cloudcontrol2001-dev and 2003-dev back to openstack_controllers

https://gerrit.wikimedia.org/r/595583

Change 595583 merged by Andrew Bogott:
[operations/puppet@production] Add cloudcontrol2001-dev and 2003-dev back to openstack_controllers

https://gerrit.wikimedia.org/r/595583

Change 595227 merged by Andrew Bogott:
[operations/puppet@production] OpenStack: move all openstack API support to cloudcontrol1005

https://gerrit.wikimedia.org/r/595227

Change 595229 merged by Andrew Bogott:
[operations/dns@master] Openstack: move API traffic to cloudcontrol1005

https://gerrit.wikimedia.org/r/595229

Change 595949 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] Rebuild cloudcontrol1003/1004 with Buster

https://gerrit.wikimedia.org/r/595949

Change 595949 merged by Andrew Bogott:
[operations/puppet@production] Rebuild cloudcontrol1003/1004 with Buster

https://gerrit.wikimedia.org/r/595949

Change 595970 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] Add cloudcontrol1003 and 1004 back to openstack_controllers list

https://gerrit.wikimedia.org/r/595970

Change 595970 merged by Andrew Bogott:
[operations/puppet@production] Add cloudcontrol1003 and 1004 back to openstack_controllers list

https://gerrit.wikimedia.org/r/595970

Change 595979 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] Glance: make cloudcontrol1003 the primary Glance host again

https://gerrit.wikimedia.org/r/595979

Change 595980 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/dns@master] Revert "Openstack: move API traffic to cloudcontrol1005"

https://gerrit.wikimedia.org/r/595980

Change 595979 merged by Andrew Bogott:
[operations/puppet@production] Glance: make cloudcontrol1003 the primary Glance host again

https://gerrit.wikimedia.org/r/595979

Change 595980 merged by Andrew Bogott:
[operations/dns@master] Revert "Openstack: move API traffic to cloudcontrol1005"

https://gerrit.wikimedia.org/r/595980

Change 595207 merged by Marostegui:
[operations/puppet@production] mysql misc: add access for cloudcontrol1005 to m5-master

https://gerrit.wikimedia.org/r/595207