Currently we ask a single node to query all the other nodes and record the latency. This works fine under normal conditions, but under a network partition this request timed out and we didn't collect latency numbers for any of the servers. We already run a prometheus exporter per host, adjust it only query the local node.
- Update extra plugin rest api to support collecting latency metrics without any requests over the elasticsearch transport
- Deploy updated extra plugin to cluster
- Update prometheus exporter to collect per-node