An intro on routed Ganeti can be found here: https://phabricator.wikimedia.org/phame/post/view/312/ganeti_on_modern_network_design/
Routed Ganeti is already running in magru (and on two test systems in codfw). Next, we'll migrate esams to it.
Esams currently consists of two separate Ganeti clusters in two different rows with two servers each.
row BY27: ganeti3005 and ganeti3007
- atlas3001
- bast3007
- doh3003
- durum3003
- ncredir3003
row BW27: ganeti3006 and ganeti3008
- doh3004
- durum3004
- install3003
- ncredir3004
- netflow3003
- prometheus3003
Once the migration is complete, we'll have a single four-node Ganeti cluster spanning both rows, which also gives us flexibility in case of row changes at the DC.
The migration path will look like the following:
- Announce that people should move away from bast3007 and use the bastion in drmrs for now
- Deploy https://gerrit.wikimedia.org/r/c/operations/puppet/+/1180085
- Decom atlas3001 (will be re-added later)
- Decom bast3007 (will be re-added later)
- Move all VMs in ganeti3005 to ganeti3007
- Switch BY27 VMs to plain disk storage, i.e. disable DRBD for them.
During this initial period, the BY27 VMs are no longer redundant
- Allocate IPs for esams routed Ganeti - https://netbox.wikimedia.org/ipam/prefixes/?role_id=41&site_id=1
- Add allocated IPs to modules/network/data/data.yaml in Puppet - https://gerrit.wikimedia.org/r/c/operations/puppet/+/1180083
- Add ganeti "customer" to Homer with the esams ranges - https://gerrit.wikimedia.org/r/c/operations/homer/public/+/1180081
- Manually create the first IPs in Netbox so that the DNS PTR includes can be added
- Reimage ganeti3005 with routed Ganeti - T403375
- Update ganeti3005 switch port to remove the trunked public VLAN
- Set up routing between ganeti3005 and its ToR switch
- Create doh3005, durum3005 and ncredir3005 on routed Ganeti and fail over services
- Decom doh3003, durum3003, ncredir3003
- Reimage ganeti3007 with routed Ganeti
- Update ganeti3007 switch port to remove the trunked public VLAN
- Set up routing between ganeti3007 and its ToR switch
- Remove esams01 from Netbox sync
- Move all VMs on ganeti3006 to ganeti3008
- Switch BW27 VMs to plain disk storage, i.e. disable DRBD for them.
- Reimage ganeti3006 with routed Ganeti
- Update ganeti3006 switch port to remove the trunked public VLAN
- Set up routing between ganeti3006 and its ToR switch
- Create atlas3001 - T403580
- Create bast3007 on routed Ganeti
- Switch ncredir3005, durum3005, doh3005 to DRBD
- Create doh3006 on routed Ganeti, when done decom doh3004
- Create durum3006 on routed Ganeti, when done decom durum3004
- Create ncredir3006 on routed Ganeti, when done decom ncredir3004
- Create install3004 on routed Ganeti, when done decom install3003
- Update DHCP relay config on the switches to point to the new install3004
- Point webproxy to the new install3004
- Create netflow3004 on routed Ganeti, when done decom netflow3003
- Update netflow config on esams network equipment to point to the new netflow3004
- Create prometheus3004 on routed Ganeti with insetup role and pass on to o11y to migrate existing metrics, when done decom prometheus3003
- Remove esams02 from Netbox sync
- Once ganeti3008 no longer runs any VMs, reimage it with routed Ganeti
- Update ganeti3008 switch port to remove the trunked public VLAN
- Set up routing between ganeti3008 and its ToR switch
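
As a rough illustration of the "move all VMs" and "switch to plain disk storage" steps for BY27, the following Python sketch only builds the Ganeti commands as strings; it does not run anything. It assumes the stock `gnt-node migrate` and `gnt-instance modify -t plain` commands are used, and that instances are stopped for the disk template change. The helper names are hypothetical, and the short host names are taken from the plan as-is; a real cluster will most likely need FQDNs, which this plan does not spell out.

```python
# Sketch only: builds (does not run) the Ganeti CLI commands for two plan steps.
# Host names are the short names from the plan; real clusters usually need FQDNs.


def drain_node_cmds(node: str) -> list[str]:
    """Live-migrate every primary instance on `node` to its DRBD secondary."""
    return [f"gnt-node migrate -f {node}"]


def to_plain_cmds(instance: str) -> list[str]:
    """Convert a DRBD instance to plain (non-replicated) disks.

    The disk template is changed while the instance is stopped, so the
    conversion is wrapped in a shutdown/startup pair.
    """
    return [
        f"gnt-instance shutdown {instance}",
        f"gnt-instance modify -t plain {instance}",
        f"gnt-instance startup {instance}",
    ]


def build_by27_plan() -> list[str]:
    """Commands for emptying ganeti3005 and disabling DRBD on the BY27 VMs."""
    # VMs left in BY27 once atlas3001 and bast3007 have been decommissioned.
    remaining_vms = ["doh3003", "durum3003", "ncredir3003"]
    cmds = drain_node_cmds("ganeti3005")
    for vm in remaining_vms:
        cmds += to_plain_cmds(vm)
    return cmds


if __name__ == "__main__":
    print("\n".join(build_by27_plan()))
```

The same pattern would apply to BW27 with ganeti3006 and its VM list. Each conversion implies a short outage for that VM, so the VMs of a redundant pair should be converted one at a time.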