Logstash hardware expansion
Closed, Resolved · Public

Description

We have budgeted hardware for a Logstash expansion in codfw this fiscal year. This task will be used to track that expansion and figure out what we need.

At the moment the Logstash storage hosts have 2x E5-2450 v2 CPUs @ 2.50GHz, 128GB of RAM and 4x1TB HDDs in software RAID 0; their warranty expired back in April.

On these hosts Elasticsearch is set to use 30GB of RAM and the rest is used for caching (at the moment between 40 and 90GB).

Event Timeline

fgiunchedi triaged this task as Medium priority. Aug 30 2018, 3:08 PM
fgiunchedi created this task.

I did a quick spreadsheet with some back of envelope calculations for Logstash disk requirements at https://docs.google.com/spreadsheets/d/18RJKd5-bF3IiTLt8jsxJcTwq4dgc9fC7euUw8QPUnFQ/edit

A typical day in August 2018 averaged around 700 logs per second at roughly 1100 bytes per log. That requires about 4.5TB on disk with split retention (15 days at 3x and 15 days at 2x). The bytes-per-log figure increases when we're indexing logs with a high number of fields, though it is generally around 1k.
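
For reference, a minimal sketch of the back-of-envelope math behind the ~4.5TB figure (rounded numbers, for illustration only):

```
# Back-of-envelope disk footprint, assuming ~700 logs/s at ~1100 bytes/log
# and split retention (15 days at 3x copies, 15 days at 2x copies).
logs_per_sec = 700
bytes_per_log = 1100

daily_gib = logs_per_sec * bytes_per_log * 86400 / 1024**3  # ~62 GiB/day raw
retention_day_copies = 15 * 3 + 15 * 2                      # 75 index-day copies
total_tib = daily_gib * retention_day_copies / 1024         # ~4.5 TiB on disk

print(f"{daily_gib:.0f} GiB/day, {total_tib:.1f} TiB total")
```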

I think planning this fiscal year's hardware expansion around 6x the current usage is appropriate: that should cover at least one fiscal year, possibly more, without requiring further expansions. The factors that play into the 6x figure are that we'll be adding more producers to Logstash, natural growth in logs/s ingested, some headroom for spammy log producers, and the ability to extend retention days if we want.

If for some reason we're under-provisioning (or over-utilizing) we have some knobs that are cheap to tune, namely:

  • Drop older indices to free up space
  • Further reduce replication factor for newer indices to free up space

With 6x the current usage we're at about 27TB required. Hardware-wise I think we can keep the specs and quantities (3x hosts) from the task description (modulo the 1TB hard disks).
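
Continuing the sketch above, the 6x scaling and what the replication knob roughly buys back (illustrative numbers only):

```
# Scaling the ~4.5 TiB baseline to 6x, and the effect of the replication knob.
required_tib = 4.5 * 6                 # ~27 TiB at 6x current usage
# Dropping the newer indices from 3x to 2x copies turns the split retention
# (15d@3x + 15d@2x = 75 day-copies) into 60 day-copies:
reduced_tib = required_tib * 60 / 75   # ~21.6 TiB
print(round(required_tib), round(reduced_tib, 1))
```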

Option 1
We're already using three hosts that just went out of warranty; one option is to retrofit those with 4x4TB disks (these are Dell R420s, which should be able to take four LFF disks). In eqiad this would be more than enough to satisfy the requirements above for one (possibly two) fiscal years, and after 2+ years we'd want to replace the hosts anyway.
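
As a quick sanity check of the raw capacity this retrofit would give us (assuming we keep today's software RAID 0 layout):

```
# Raw capacity of three retrofitted hosts, keeping today's software RAID 0.
hosts, disks_per_host, disk_tb = 3, 4, 4
raw_tb = hosts * disks_per_host * disk_tb   # 48 TB raw across the cluster
print(raw_tb, "TB raw vs ~27 TB needed at 6x usage")
```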

Option 2
Another option in eqiad would be to provision completely new hosts; we'll have to get quotes for codfw regardless (and we have budget for codfw). This option would be significantly more expensive than buying 12x4TB disks for eqiad, even when factoring in the engineering time to install the disks and reimage the hosts.

I'm curious whether 4x4TB 7200RPM(?) spindles in software RAID 0 will be enough to satisfy the IO demand from Elasticsearch as log throughput grows from the current ~700/sec to ~4200/sec (using the 6x figure).
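
For a rough sense of scale, the raw ingest bandwidth at 6x is modest; the real unknown is the random IO from segment merges, refreshes and queries rather than sequential writes:

```
# Raw ingest bandwidth at the projected 6x log rate (single copy, no merges).
logs_per_sec, bytes_per_log = 4200, 1100
mb_per_sec = logs_per_sec * bytes_per_log / 1e6   # ~4.6 MB/s
print(f"{mb_per_sec:.1f} MB/s")
```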

One approach to buy ourselves some additional flexibility in this area is deploying "indexer" systems with high-performance disks (SSDs). These systems would host the Elasticsearch indices being actively written to, with a curator job to transition indices to long-term spindle storage as they age.
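
One way the age-out step could work is Elasticsearch's index-level shard allocation filtering; a rough sketch below, where the `storage` node attribute, the 5-day cutoff and the index naming are assumptions for illustration, not anything we have configured today:

```
# Hypothetical age-out step: relocate indices older than N days from the SSD
# "indexer" tier to the spindle tier using index-level allocation filtering.
# Assumes data nodes are started with a node attribute such as
# node.attr.storage: ssd|hdd in elasticsearch.yml.
from datetime import datetime, timedelta
import requests

ES = "http://localhost:9200"
cutoff = datetime.utcnow() - timedelta(days=5)

for row in requests.get(f"{ES}/_cat/indices/logstash-*?h=index&format=json").json():
    name = row["index"]                             # e.g. logstash-2018.10.24
    day = datetime.strptime(name.split("-")[-1], "%Y.%m.%d")
    if day < cutoff:
        requests.put(f"{ES}/{name}/_settings",
                     json={"index.routing.allocation.require.storage": "hdd"})
```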

Assuming 64GB as the current daily index size (actual daily indices vary from 44-76GB), a 6x increase brings us to approx. 384GB/day. A pair of systems with 4x1TB SSDs in RAID 10 (to avoid needing double ES replicas at the index tier) would provide 2TB of redundant storage (local redundancy via RAID, host redundancy via 1x ES replica) -- enough to index approx. 5 days' worth of logs on SSD (at 6x current usage) before needing to transition to spinning disk.
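
A quick check of those numbers (illustrative only):

```
# Days of SSD retention at 6x the current daily index size.
projected_gb_per_day = 64 * 6        # ~384 GB/day
ssd_tier_gb = 4 * 1000 / 2           # 4x1TB in RAID 10 per host -> 2 TB usable;
                                     # 1x ES replica across the pair keeps ~2 TB effective
days_on_ssd = ssd_tier_gb / projected_gb_per_day
print(round(days_on_ssd, 1))         # ~5.2 days
```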

Indeed, with today's workload the host cache helps a lot, but with increasing throughput SSDs will help for sure.

Agreed, having recent indices on SSD would be nice; we might even be able to combine HDD and SSD on the same hosts.

That's an interesting idea. I wonder how we could control which indices reside on which storage type within the ES instance on a host. The method that I'm aware of involves Elasticsearch node tags (and a modified logstash index template), which are set at the Elasticsearch instance level. I found some useful, albeit aging, discussion about this at https://stackoverflow.com/questions/31287950/elasticsearch-multiple-data-directories-choose-where-to-place-the-index which didn't look too promising.
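
For reference, the node-tag approach would look roughly like the sketch below; note it only works at the granularity of a whole Elasticsearch node/instance, which is exactly the limitation here. The `storage` attribute name and the template are hypothetical:

```
# Hypothetical logstash index template tweak: make newly created daily indices
# allocate onto nodes tagged node.attr.storage: ssd; the age-out job would
# later flip the setting to "hdd" (see the earlier sketch).
import requests

ES = "http://localhost:9200"
template = {
    "template": "logstash-*",   # legacy _template syntax (ES 5.x era)
    "order": 10,                # applied on top of the base logstash template
    "settings": {
        "index.routing.allocation.require.storage": "ssd",
    },
}
requests.put(f"{ES}/_template/logstash-hot-allocation", json=template)
```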

Technically we could explore running multiple ES instances per host to support a mixed-disk server configuration, but I can imagine that significantly complicating puppetization/services/monitoring/etc.

Indeed it'd complicate things a bit, though the groundwork is already in place thanks to the ongoing work in T198351: Refactor puppet to support multiple elasticsearch instances on same node, so I think it'd be possible at the cost of some added complexity.

Good to know, though tbh I'm struggling to see the upside of a mixed storage layout. Seems to me it would provide fewer hardware resources while at the same time increasing the complexity and time involved in setup/maintenance.

Similarly to T205873#4643423, a realistic near-term approach to support T205849 looks like a phased one. In this case:

Phase 1 - Add 3 new elasticsearch data hosts in codfw (near-term these would host Kafka as well, see T205873), while making do with the existing base elasticsearch hardware we have in eqiad (see below re: disk upgrades).
Phase 2 - Refresh the 3 elasticsearch data hosts in eqiad with new hardware (T210498).
Phase 3 - Add SSD-based elasticsearch indexer hosts (minimum 2 per site).

With regard to upgrading the existing eqiad hosts to 4TB disks, I think this depends on the timeframe in which phase 2 could happen. Optimally phases 1 and 2 would happen at the same time, but if replacement elasticsearch servers for eqiad can be procured within another quarter (or two to three?), we could consider making the best of the existing eqiad hardware until then.

herron mentioned this in Unknown Object (Task). Oct 24 2018, 8:24 PM

Procurement for eqiad hw is at {T210498} (Phase 2)