Forking the Renumbering section of T327938: Codfw row A/B top-of-rack switch refresh to its own task:
To allow for renumbering, the reimage cookbook will need development work to support a "--renumber" toggle, which should delete the host's existing IP allocation and add a new one.
Renumbering presents additional challenges for the services running on the hosts, since they will come back online with different IPs. A few things we need to consider (there are likely more):
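As a rough illustration of the "delete the existing allocation and add a new one" step, here is a minimal sketch. It is not the cookbook's actual code: the in-memory `allocations` dict stands in for whatever IPAM is the source of truth, and `renumber`, `parse_args` and the `new_subnet` parameter are hypothetical names. Picking the first free host address is a simplification (in practice some addresses, such as the gateway, would be reserved).

```python
import argparse
import ipaddress


def renumber(allocations: dict[str, str], host: str, new_subnet: str) -> tuple[str, str]:
    """Release the host's current IP and allocate the first free one in new_subnet.

    `allocations` maps hostname -> IP and is mutated in place (hypothetical
    stand-in for the real IPAM). Returns (old_ip, new_ip).
    """
    old_ip = allocations.pop(host)  # delete the existing allocation
    in_use = set(allocations.values())
    for candidate in ipaddress.ip_network(new_subnet).hosts():
        if str(candidate) not in in_use:
            allocations[host] = str(candidate)  # add the new allocation
            return old_ip, str(candidate)
    allocations[host] = old_ip  # nothing free: restore the old allocation
    raise RuntimeError(f"no free address in {new_subnet}")


def parse_args(argv=None) -> argparse.Namespace:
    """Parse a hypothetical reimage command line with a --renumber toggle."""
    parser = argparse.ArgumentParser(description="reimage sketch (illustrative only)")
    parser.add_argument("host")
    parser.add_argument("--renumber", action="store_true",
                        help="replace the host's IP allocation during the reimage")
    return parser.parse_args(argv)
```

Keeping the toggle off by default means existing reimage invocations would behave as before, and only runs that pass `--renumber` would touch the allocation.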
- DNS needs to be updated; old entries can still linger in DNS caches
  - Is it possible to lower the DNS TTLs in advance to help us here?
- We may have hardcoded IPs in Puppet and other repositories. Possibly the renumbering script could run a git grep for the IP across multiple repositories to find these (as the decommissioning cookbook does):
  - Puppet
  - Puppet private
  - Mediawiki-config
  - Deployment charts
  - homer-public
- DNS records resolved at catalog compile time by the Puppet master, and those resolved at reload time by ferm (or by any other service), will need to be refreshed, either by forcing a Puppet run, with a ferm reload, or with a service-specific reload/restart.
- Databases:
  - DB grants are issued per-IP
  - MediaWiki connects to the DBs via IP
  - dbctl holds the IPs of the servers and passes them to the MediaWiki config stored in etcd
- Backend servers behind LVS: TBD
- Ganeti servers: depends on the whole Ganeti discussion
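
The git grep check suggested above for hardcoded IPs could look roughly like the sketch below. It is only an assumption of how such a check might be written, not the decommissioning cookbook's implementation: `find_ip_references` is a hypothetical helper, it expects local checkouts of the repositories, and it only handles IPv4 literals.

```python
import re
import subprocess
from pathlib import Path


def find_ip_references(ip: str, repo_paths: list[str]) -> dict[str, list[str]]:
    """Return {repo_path: ["file:line:text", ...]} for every tracked line mentioning ip."""
    # Reject partial matches such as 10.0.0.10 or 110.0.0.1 when looking for 10.0.0.1.
    boundary = re.compile(r"(?<![0-9.])" + re.escape(ip) + r"(?![0-9])")
    hits: dict[str, list[str]] = {}
    for repo in repo_paths:
        result = subprocess.run(
            # -n: line numbers, -I: skip binary files, -F: treat the IP as a fixed string
            ["git", "-C", str(Path(repo)), "grep", "-n", "-I", "-F", "-e", ip],
            capture_output=True, text=True, check=False,
        )
        # git grep exits with 1 when nothing matches; other non-zero codes are real errors.
        if result.returncode not in (0, 1):
            raise RuntimeError(f"git grep failed in {repo}: {result.stderr.strip()}")
        lines = [line for line in result.stdout.splitlines() if boundary.search(line)]
        if lines:
            hits[repo] = lines
    return hits
```

The search is by fixed string rather than regex so that the dots in the address are not treated as wildcards; the extra boundary filter then drops longer addresses that merely contain the old IP. Any hit would still need a human to decide whether it is a grant, a firewall rule, or just a comment.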