logstash1002 hit an OOM today and then fell out of the cluster. In Elasticsearch 2.3, aggregations are done on the machine running the queries, which is always one of logstash1001-1003, and they might be a bit more memory hungry than facets were on 1.7. The heap is currently set to 2G, and logstash1001-1003 each have 5G+ of memory completely unused, in addition to a 4-5G disk cache. We can likely double the heap available to these instances without any concern.
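As a rough sanity check of the headroom, the arithmetic can be sketched as below. This is only an illustration: it treats "G" as GiB, takes 5G as the lower bound of the "5G+" unused figure, and the helper name `heap_headroom` is made up for this example.

```python
GIB = 1024 ** 3


def heap_headroom(current_heap, proposed_heap, unused_memory):
    """Return the unused memory (in bytes) left after raising the heap."""
    extra = proposed_heap - current_heap
    return unused_memory - extra


current = 2 * GIB        # heap size today
proposed = 2 * current   # doubling the heap, as suggested above
unused = 5 * GIB         # assumed lower bound of the "5G+" unused memory

remaining = heap_headroom(current, proposed, unused)
print(remaining / GIB)   # 3.0
```

Under these assumptions, doubling the heap still leaves roughly 3G unused on each host without touching the disk cache.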
|operations/puppet||production||+1 -1||Increase elasticsearch heap on logstash routing instances|