Currently we run 2 control planes and 3 etcd nodes per DC as ganeti VMs. We have already hit IOPS limits on the etcd instances, and the control planes are close to the upper memory "limit" for ganeti VMs (currently 12GB).
We should draft a plan to migrate from the 2+3 ganeti instances to 3 hardware nodes per DC (repurposing mw appservers) and co-locate a kubernetes master and an etcd server on each of them.
It should be possible to do this by first adding the new control plane/etcd nodes and then removing the ganeti ones afterwards.
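
As a rough sanity check for the add-then-remove approach, the etcd quorum arithmetic can be sketched as below. This is only an illustration, not the actual migration procedure: it assumes members are added and removed one at a time, that every member is a voting member, and the function names are made up for this example.

```python
# Sketch of etcd (Raft) quorum sizes while growing a 3-member cluster to 6
# and shrinking it back to 3. Assumes one membership change at a time and
# voting members only; names here are hypothetical, for illustration.

def quorum(members: int) -> int:
    """Smallest majority of a cluster with `members` voting members."""
    return members // 2 + 1


def fault_tolerance(members: int) -> int:
    """How many members can fail while a majority is still available."""
    return members - quorum(members)


def migration_steps(old: int, new: int) -> list[tuple[str, int, int, int]]:
    """Return (step, cluster size, quorum, tolerated failures) per step."""
    size = old
    steps = [("start", size, quorum(size), fault_tolerance(size))]
    # First add all new (hardware) members, one by one.
    for i in range(1, new + 1):
        size += 1
        steps.append((f"add new member {i}", size, quorum(size), fault_tolerance(size)))
    # Then remove the old (ganeti) members, one by one.
    for i in range(1, old + 1):
        size -= 1
        steps.append((f"remove old member {i}", size, quorum(size), fault_tolerance(size)))
    return steps


if __name__ == "__main__":
    for step, size, q, tolerated in migration_steps(old=3, new=3):
        print(f"{step:22s} size={size} quorum={q} tolerated_failures={tolerated}")
```

Under these assumptions the cluster never tolerates fewer failures than the current 3-member setup does, but a newly added member only helps once it has caught up and is healthy, so each addition should be verified before the next step.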
In the spreadsheet at {T351074} I've reserved 3 R440 nodes per DC to be used as apiservers:
- mw2391
- mw2331
- mw2361
- mw1371
- mw1429
- mw1435
These should be renamed during reimage because of their special role in the cluster: https://wikitech.wikimedia.org/wiki/Server_Lifecycle#Rename_while_reimaging