Request Elasticsearch hardware for secondary CirrusSearch in codfw
Closed, Resolved, Public


Basically we should get ~24 nodes like the nice ones in eqiad. The cluster will be smaller, but we can't afford 31 of the nice nodes, and it's not a great choice anyway because then the secondary DC's cluster would be way overpowered. Start with what we budgeted and go from there. If you get a choice, always put extra cash into nicer (but not larger!) SSDs. And RAM. Those are life. LIFE!

Why 24 boxes here and not 23 or $number? Is there load analysis I can use to justify? I don't understand well enough to make the request.

I believe it was semi-arbitrary and based on budget. We have 15 nice
machines in the eqiad cluster and 16 good machines, and I guesstimated that
half again as many would be enough but never ran any hard numbers. 24 may
be too many, but it doesn't feel like too too many. If it is too few, we can
lower the rescore window for phrases on queries sent to that cluster to
lower the load.
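
For reference, a minimal sketch of what "lowering the rescore window" means at the Elasticsearch query level. The field name `text`, the weights, the slop, and the window sizes are illustrative assumptions, not CirrusSearch's actual configuration; only the general `rescore` / `window_size` / `rescore_query` shape comes from the Elasticsearch query DSL.

```python
def build_search_body(terms, rescore_window):
    """Build a search body whose phrase rescore only touches the top hits per shard.

    All field names and weights here are hypothetical placeholders.
    """
    return {
        # Cheap first-pass query that selects candidate documents.
        "query": {"match": {"text": terms}},
        "rescore": {
            # Number of top hits per shard that get the more expensive phrase pass.
            # A smaller window means less rescoring work per query.
            "window_size": rescore_window,
            "query": {
                "rescore_query": {
                    "match_phrase": {"text": {"query": terms, "slop": 1}}
                },
                "query_weight": 1.0,
                "rescore_query_weight": 2.0,
            },
        },
    }


# Hypothetical values: a larger window for the primary cluster,
# a smaller one for a cluster that needs to shed load.
primary_body = build_search_body("elasticsearch hardware", 512)
secondary_body = build_search_body("elasticsearch hardware", 128)
```

The trade-off is relevance for load: fewer documents get the phrase boost, so results beyond the window are ranked by the first-pass query alone.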

A more conservative approach would be to buy 16, set up the cluster, and
run load tests against it. If we feel we need more machines we can buy them.

Another consideration is disk utilization: we're at roughly 50% in eqiad ATM (each machine has raid1 2x500GB SSD raid0 2x300GB SSD), and it has been relatively stable over the last 9 months. Assuming the total disk used stays the same, 24 machines seems a good initial number: they'll be at ~75% disk space utilization.
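
A back-of-the-envelope version of that disk estimate, as a sketch only. It assumes the parenthetical describes two hardware classes (15 machines with RAID1 2x500GB, so 500GB usable, and 16 machines with RAID0 2x300GB, so 600GB usable) and that the new codfw nodes would each have 500GB usable. Neither assumption is confirmed in this task.

```python
def projected_utilization(used_gb, node_count, usable_gb_per_node):
    """Fraction of disk used if `used_gb` is spread over a new cluster."""
    return used_gb / (node_count * usable_gb_per_node)


# ASSUMPTION: 15 nodes at 500GB usable (RAID1) plus 16 nodes at 600GB usable (RAID0).
eqiad_capacity_gb = 15 * 500 + 16 * 600

# From the comment above: eqiad sits at roughly 50% utilization.
eqiad_used_gb = 0.50 * eqiad_capacity_gb

# ASSUMPTION: 24 new nodes with 500GB usable each.
codfw_utilization = projected_utilization(eqiad_used_gb, 24, 500)
print(f"projected codfw disk utilization: {codfw_utilization:.0%}")
```

Under these assumptions 24 nodes land at about 71%, in the same ballpark as the ~75% figure. Changing the node count or the per-node capacity (for example, larger SSDs) shifts the result accordingly.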

(procurement is tracked in RT #8524)

@RobH we should refresh the quote we had for elasticsearch hw in codfw in RT #8524, and quote larger (Intel, supported by vendor) SSDs for comparison (I think S3500 do 800G?)

The S3500 maxes out at 800 GB; anything larger than that moves up to the S3700 series.

I've requested updated quotes and will follow up on them once they come back from Dell.

Since the Dell quote will involve a generation upgrade (so new mainboard and the like), there doesn't seem to be any reason not to get an HP quote for these as well. Once I have an updated Dell quote back for a baseline, I'll request the HP quote.

To clarify, I think it makes sense to quote both 800G SSDs and 300G SSDs for price comparison.

Noting here that we are aware that there is some work required for us to do once the servers are ready, though. :-)

The quotes for this have been reviewed by myself, @chasemp, & @fgiunchedi and are in final management review/approval.

This order has been submitted, and I'm awaiting shipment updates from the vendor.

This work is tracked in T109734.

ETA is today.

Requested and answered; see T111080.