Currently this project is puppetized on wikitech via https://wikitech.wikimedia.org/wiki/Hiera:Project-proxy -- I'm going to fix that /after/ this fail-over is done.
The API service mentioned below is a uwsgi service called 'invisible_unicorn'.
These steps will not result in downtime:
[x] Create new eqiad1 proxy nodes, proxy-01 and proxy-02
[x] Copy certs over by hand from novaproxy-01
[x] Add proxy-01 and proxy-02 to $all_proxies, let puppet update
[x] Ensure that redis is syncing properly between regions
[x] Update proxy DNS record for a test proxy, ensure that proxy-01 handles it correctly
[x] Update proxy DNS records to point to the eqiad1 proxy (proxy-01)
[x] Test some more
[x] Update hieradata/eqiad/profile/openstack/main/nova/network.yaml with the new active proxy IP
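The redis sync check in the list above could be scripted roughly like this. It is only a sketch, and it assumes the nodes being checked run redis as replicas and that the text output of `redis-cli INFO replication` has already been collected from each one. The function names and the sample output are hypothetical, not taken from the actual proxy setup.

```python
def parse_redis_info(text):
    """Parse 'key:value' lines from redis INFO output into a dict."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and section headers such as '# Replication'.
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition(":")
        if sep:
            info[key] = value
    return info


def replica_in_sync(info):
    """True if the node is a replica with a healthy link and no sync running."""
    return (
        info.get("role") == "slave"  # INFO reports replicas as 'slave'
        and info.get("master_link_status") == "up"
        and info.get("master_sync_in_progress") == "0"
    )


# Hypothetical sample output; the address is a documentation placeholder.
SAMPLE = """\
# Replication
role:slave
master_host:192.0.2.10
master_port:6379
master_link_status:up
master_sync_in_progress:0
"""

if __name__ == "__main__":
    print(replica_in_sync(parse_redis_info(SAMPLE)))
```

A node that reports `master_link_status:down`, or that is still mid-sync, fails the check and should be looked at before any DNS records are moved.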
These steps will result in partial downtime for creating/deleting proxies:
[x] Set $active_proxy to point to proxy-01, let puppet update
[x] Stop puppet and the API on novaproxy-01
[x] Stop the API on proxy-01, restore the database (it's on NFS, available to all nodes), restart the API there
[x] Update proxy endpoints in keystone to point to the new proxy
[x] Test!
Cleanup:
[x] Move project-wide puppet off of wikitech and into horizon
[] Wait 24 hours for DNS caches to update
[] Shut down novaproxy-01 and novaproxy-02
[] Wait another few days before deleting novaproxy-01 and novaproxy-02
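Before shutting down the old nodes, one way to confirm that DNS has caught up is to check that each proxied hostname resolves to the new proxy's address. This is a minimal sketch: the hostnames and IP in the usage section are placeholders, and the `resolve` parameter is a hypothetical hook so the function can be exercised without network access. It only reflects the resolver of the machine it runs on, so other caches may still lag.

```python
import socket


def stale_hostnames(hostnames, expected_ip, resolve=socket.gethostbyname):
    """Return the hostnames that do not (yet) resolve to expected_ip."""
    stale = []
    for name in hostnames:
        try:
            ip = resolve(name)
        except OSError:
            # socket.gaierror is an OSError; unresolvable names count as stale.
            ip = None
        if ip != expected_ip:
            stale.append(name)
    return stale


if __name__ == "__main__":
    # Placeholder values; substitute the real proxied hostnames and the
    # address of the new active proxy.
    names = ["proxy-test.example.org"]
    print(stale_hostnames(names, "203.0.113.5"))
```

An empty list means every hostname checked already points at the new proxy from this vantage point.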