The ES cluster has two sets of hardware: old hardware in racks A3 and C5, and new hardware in racks D3 and D4. We would like to move two servers out of row D so that we have at least one server with new hardware in each row for use as a cluster master node.
Description
Details
Subject | Repo | Branch | Lines +/-
---|---|---|---
elasticsearch: re-enter rack info for elastic1006 | operations/puppet | production | +1 -1
Revisions and Commits
rOPUP Wikimedia Puppet
rOPUP100e8a24683e elastic: update rack location for 1005 and 1030
rOPUP5bc81ac36871 elasticsearch: update row for master eligibles
rOPUP128274c54242 elastic: update rack location for 1005 and 1030
Status | Assigned | Task
---|---|---
Resolved | Gehel | T112556 Only use newer (elastic10{16..47}) servers as master capable elasticsearch nodes
Resolved | chasemp | T112559 Swap two elasticsearch servers in row D with an elasticsearch server in racks A3 and C5.
Event Timeline
Are there any particular servers you would like to move, or should I just take the 2 that make the most sense?
I am thinking
elastic1031 => A3
elastic1030 => A3
elastic1006 => D4
elastic1005 => D4
In terms of exact servers, whichever makes the most sense. I would like to see servers moved into both A and C racks for availability reasons; three master-capable nodes in three rows seems best.
Hey Chris! There are a few that are more sensitive (can't be missing at the same time), and I'm hoping we can do it in 2 phases (so only 2 are missing at a time). I'll try to sync up with the discovery gents and comment further.
Thanks!
For the moment none of these three can be missing at the same time:
hieradata/hosts/elastic1001.yaml:elasticsearch::master_eligible: true
hieradata/hosts/elastic1008.yaml:elasticsearch::master_eligible: true
hieradata/hosts/elastic1013.yaml:elasticsearch::master_eligible: true
which seems compatible with the thoughts here; I just wanted to be sure :)
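A rough way to sanity-check a maintenance batch against that constraint is sketched below. It assumes the usual majority quorum for master election (with three eligible nodes, two must stay up); the thread does not state the cluster's configured minimum, so treat `quorum` and `batch_is_safe` as hypothetical helpers, not anything taken from the puppet repo.

```python
# Master-eligible hosts, as listed in the hieradata lines above.
MASTER_ELIGIBLE = {"elastic1001", "elastic1008", "elastic1013"}


def quorum(n):
    """Majority of n nodes (assumed election rule, not confirmed in the thread)."""
    return n // 2 + 1


def batch_is_safe(batch, eligible=MASTER_ELIGIBLE):
    """True if taking `batch` offline still leaves a majority of eligible masters."""
    remaining = len(eligible - set(batch))
    return remaining >= quorum(len(eligible))


# The first planned batch touches no master-eligible node.
print(batch_is_safe({"elastic1005", "elastic1030"}))
```

Under the same assumption, a batch containing two of the three eligible hosts would fail the check.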
Also I think this will need to be updated at this time:
hieradata/regex.yaml
es_rack_a3:
  __regex: !ruby/regexp /^elastic100[0-6]\.eqiad\.wmnet$/
  elasticsearch::rack: A3
  elasticsearch::row: A
es_rack_c5:
  __regex: !ruby/regexp /^elastic10(0[7-9]|1[0-2])\.eqiad\.wmnet$/
  elasticsearch::rack: C5
  elasticsearch::row: C
es_rack_d3:
  __regex: !ruby/regexp /^elastic10(1[3-9]|2[0-2])\.eqiad\.wmnet$/
  elasticsearch::rack: D3
  elasticsearch::row: D
es_rack_d4:
  __regex: !ruby/regexp /^elastic10(2[3-9]|3[01])\.eqiad\.wmnet$/
  elasticsearch::rack: D4
  elasticsearch::row: D
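To see why these entries need updating, here is a small sketch that evaluates the four patterns quoted above against the hosts being moved. The patterns are copied from the regex.yaml excerpt; the `PLANNED` mapping (1005 to D4, 1030 to A3) is an assumption based on the proposal earlier in this thread, and `rack_for` / `stale_entries` are illustrative helper names only.

```python
import re

# Rack patterns copied from the hieradata/regex.yaml excerpt above.
RACK_PATTERNS = {
    "A3": re.compile(r"^elastic100[0-6]\.eqiad\.wmnet$"),
    "C5": re.compile(r"^elastic10(0[7-9]|1[0-2])\.eqiad\.wmnet$"),
    "D3": re.compile(r"^elastic10(1[3-9]|2[0-2])\.eqiad\.wmnet$"),
    "D4": re.compile(r"^elastic10(2[3-9]|3[01])\.eqiad\.wmnet$"),
}


def rack_for(host):
    """Return the rack whose pattern matches `host`, or None if none match."""
    for rack, pattern in RACK_PATTERNS.items():
        if pattern.match(host):
            return rack
    return None


# Assumed physical locations after the first swap (from the proposal above).
PLANNED = {
    "elastic1005.eqiad.wmnet": "D4",
    "elastic1030.eqiad.wmnet": "A3",
}


def stale_entries(planned):
    """Map each host whose configured rack differs from its planned rack
    to a (configured, planned) pair."""
    return {
        host: (rack_for(host), new_rack)
        for host, new_rack in planned.items()
        if rack_for(host) != new_rack
    }


print(stale_entries(PLANNED))
```

Both hosts come back as stale, since the ranges are purely numeric and still place 1005 in A3 and 1030 in D4.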
We made a plan to do 1030 and 1005 tomorrow and then let things stabilize before going further. We want to get started at 10:30 am Eastern.
@dcausse @EBernhardson @Cmjohnson
I will ban these nodes and remove them from LVS today.
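For reference, one common way to "ban" a node is Elasticsearch's shard allocation filtering, which drains shards off the listed hosts. The thread does not say which mechanism is used here, so the sketch below is an assumption: it only builds the request body for the cluster settings API, keyed on `_host`, and `ban_payload` is a hypothetical helper name.

```python
import json


def ban_payload(hosts):
    """Build a transient cluster-settings body that excludes `hosts`
    from shard allocation (assumed approach, not confirmed in the thread)."""
    return {
        "transient": {
            "cluster.routing.allocation.exclude._host": ",".join(hosts)
        }
    }


body = ban_payload(["elastic1005.eqiad.wmnet", "elastic1030.eqiad.wmnet"])
print(json.dumps(body, indent=2))
```

A body like this would be sent with a PUT to `/_cluster/settings`; setting the same key back to an empty string lifts the ban once the servers are back in service.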
Change 242082 had a related patch set uploaded (by Giuseppe Lavagetto):
elasticsearch: re-enter rack info for elastic1006
Relocated elastic1006/1031 to appropriate racks, updated switch cfg, racktables. Corrected DNS and updated /etc/network/interfaces information on each server. Both are reachable via ssh. The on-site portion of this project has been completed.