During recent cluster instabilities, we realized that a rising number of pending tasks is a good indication that the cluster is in trouble. We don't collect this metric yet, so we can't rule out peaks during normal operation (unlikely), and the only way to know is to start collecting it. Once we have graphs, we can think about adding alerting or putting this metric on our dashboards.
gehel@elastic2001:~$ curl -s https://search.svc.codfw.wmnet:9243/_cluster/health?pretty
{
  "cluster_name" : "production-search-codfw",
  "status" : "yellow",
  "timed_out" : false,
  "number_of_nodes" : 24,
  "number_of_data_nodes" : 24,
  "active_primary_shards" : 2978,
  "active_shards" : 8848,
  "relocating_shards" : 0,
  "initializing_shards" : 5,
  "unassigned_shards" : 144,
  "delayed_unassigned_shards" : 0,
  **"number_of_pending_tasks" : 0,**
  "number_of_in_flight_fetch" : 0
}
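
Since the health endpoint already exposes the counter, collecting it could be as simple as polling that URL and emitting one data point per poll. The following is a minimal sketch, not an existing collector: the function names, the `elasticsearch.<cluster>.pending_tasks` metric path, and the Graphite-style plaintext output line are all assumptions made for illustration. Only the URL and the `number_of_pending_tasks` field come from the output above.

```python
import json
import time
import urllib.request

# Endpoint taken from the curl command above.
HEALTH_URL = "https://search.svc.codfw.wmnet:9243/_cluster/health"


def pending_tasks_from_health(payload: str) -> int:
    """Extract number_of_pending_tasks from a _cluster/health JSON response."""
    health = json.loads(payload)
    return int(health["number_of_pending_tasks"])


def fetch_pending_tasks(url: str = HEALTH_URL, timeout: float = 5.0) -> int:
    """Poll the health endpoint once and return the pending task count."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return pending_tasks_from_health(response.read().decode("utf-8"))


def to_metric_line(cluster: str, value: int, timestamp: int) -> str:
    """Format one data point as "<path> <value> <timestamp>".

    The metric path is hypothetical; adapt it to the real naming scheme.
    """
    return f"elasticsearch.{cluster}.pending_tasks {value} {timestamp}"


if __name__ == "__main__":
    # One poll; a real collector would run this on a schedule.
    count = fetch_pending_tasks()
    print(to_metric_line("production-search-codfw", count, int(time.time())))
```

Keeping the parsing separate from the HTTP call means the extraction can be checked against a saved health response without touching the cluster.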