Per the parent ticket, we've run into some issues with the Relforge hosts (mostly due to their age):
- The current relforge hosts cannot be reimaged via cookbook, as they're HP chassis (WMF hasn't bought them for years).
- Based on my work yesterday, a manual reimage adds about 2 hours per server.
- There are some delicate commands I have to run on the puppet server, which I'd rather avoid.
- The hosts are on a 1G network, and we have to shuffle 1.1 TB around every time we reimage a host.
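
For a rough sense of what the network point costs per reimage, here is a back-of-the-envelope estimate. It is only a sketch under stated assumptions: 1.1 TB is treated as decimal terabytes, the data is assumed to cross a single 1 Gbps link running at full line rate, and protocol overhead and shard-recovery throttling are ignored. The function name `transfer_hours` is made up for illustration. Real transfers will be slower than this best case.

```python
def transfer_hours(data_tb: float, link_gbps: float) -> float:
    """Best-case hours to move data_tb terabytes over a link_gbps link.

    Assumes decimal units (1 TB = 1e12 bytes, 1 Gbps = 1e9 bits/s)
    and a fully saturated link with no protocol overhead.
    """
    bits = data_tb * 1e12 * 8            # terabytes -> bits
    seconds = bits / (link_gbps * 1e9)   # bits / (bits per second)
    return seconds / 3600                # seconds -> hours


# 1.1 TB over the 1G network described above
print(f"{transfer_hours(1.1, 1.0):.1f} hours")  # prints "2.4 hours"
```

So even in the best case, moving the data takes roughly as long again as the manual reimage itself.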
The OpenSearch migration is a risky endeavor. We need the freedom to reimage the relforge cluster multiple times if necessary, so we can make sure the process is repeatable before we move on to the production clusters. As such, I've elected to repurpose elastic1104-1106 as Relforge hosts.
The current relforge hosts are already slated to be replaced in T382906, so this will move up the timetable a bit. We have plenty of capacity in eqiad, so it's not a huge deal to lose 3 hosts, which will be backfilled next quarter anyway.