
Monitor usage of in-memory data structures used by Elasticsearch
Closed, Resolved (Public)


Elasticsearch stores some data structures directly in the Java heap.
We should monitor this usage because it can directly affect garbage collection.
The data we could collect is available via the node stats API:

  • everything under indices.segments
  • indices.fielddata
  • indices.request_cache
  • indices.completion
  • indices.query_cache.memory_size_in_bytes
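
As a rough illustration of what collecting these values could look like, here is a minimal Python sketch. It is not the script referred to in this task. The URL, the function names, and the choice to take every numeric leaf under the first four sections are assumptions made for the example; only the stats paths come from the list above.

```python
import json
from urllib.request import urlopen

# Assumed endpoint: node stats limited to the "indices" metric group.
NODE_STATS_URL = "http://localhost:9200/_nodes/stats/indices"

# Sections from the list above, collected in full (an assumption for
# fielddata, request_cache and completion; the task only says
# "everything under" for segments).
FULL_SECTIONS = ("segments", "fielddata", "request_cache", "completion")


def flatten(prefix, value, out):
    """Recursively store numeric leaves of `value` under dotted names."""
    if isinstance(value, dict):
        for key, child in value.items():
            flatten(f"{prefix}.{key}", child, out)
    elif isinstance(value, (int, float)) and not isinstance(value, bool):
        out[prefix] = value


def memory_metrics(node_indices):
    """Pick the in-memory metrics out of one node's `indices` stats."""
    out = {}
    for section in FULL_SECTIONS:
        flatten(section, node_indices.get(section, {}), out)
    # Only a single value is wanted from query_cache.
    query_cache = node_indices.get("query_cache", {})
    if "memory_size_in_bytes" in query_cache:
        out["query_cache.memory_size_in_bytes"] = query_cache["memory_size_in_bytes"]
    return out


def fetch_all(url=NODE_STATS_URL):
    """Return {node name: metrics} for every node in the stats response."""
    with urlopen(url) as resp:
        stats = json.load(resp)
    return {node["name"]: memory_metrics(node["indices"])
            for node in stats["nodes"].values()}
```

Keeping the extraction in memory_metrics separate from the HTTP call means it can be checked against a saved stats response without a running cluster.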

Event Timeline

This task is a good candidate to be tackled by someone without full knowledge of the details of how we use Elasticsearch. It is a simple adaptation of the script.

debt triaged this task as Medium priority. (Sep 1 2016, 10:14 PM)
debt moved this task from needs triage to This Quarter on the Discovery-Search board.
debt added a subscriber: debt.

This could be another data point in Grafana and might take a few hours to do.

Is indices.request_cache truly needed? A quick check with curl localhost:9200/_nodes/stats?groups=_all | jq '.nodes | to_entries | map({ key:, value: .value.indices.request_cache}) | from_entries' shows the values are 0 on every node (on both the search and logstash prod clusters). The incoming patch covers the other items, but not this one.
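
The same check can be sketched in Python against an already-parsed node stats response. The key expression in the jq command above is missing as printed, so this sketch assumes the node name as the key and falls back to the node ID; the function names are hypothetical.

```python
def request_cache_by_node(stats):
    """Map each node (by name, else by node ID) to its request_cache stats."""
    return {
        node.get("name", node_id): node["indices"].get("request_cache", {})
        for node_id, node in stats["nodes"].items()
    }


def all_zero(per_node):
    """True when every numeric request_cache value on every node is 0."""
    return all(
        value == 0
        for cache in per_node.values()
        for value in cache.values()
        if isinstance(value, (int, float))
    )
```

Here stats is the decoded JSON body of a node stats request, for example the result of json.load on the response from localhost:9200/_nodes/stats.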

Change 311848 had a related patch set uploaded (by EBernhardson):
Monitor usage of in-memory elasticsearch datastructures

Change 311848 merged by Gehel:
Monitor usage of in-memory elasticsearch datastructures

Code deployed; metrics are visible in Graphite for elastic1020 (the other hosts should follow once Puppet runs).