On Saturday, August 6, from 7:55 to 8:15am UTC, we experienced a significant slowdown on the Elasticsearch eqiad cluster. The issue started with curl pool rejections on the MediaWiki API servers before 7:55am and peaked with thread pool rejections on Elasticsearch. The issue resolved itself without any external intervention. Over the following days, multiple spikes in curl rejections from the API servers were observed.
During the issue, we observed higher-than-usual system load on all but 3 Elasticsearch servers. The 3 servers without higher load do not host any shards from enwiki. We therefore suspect that the issue is related to more expensive enwiki queries being run at that time.
Investigation is ongoing with the help of @dcausse.
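
The observation that the servers without higher load host no enwiki shards can be cross-checked against the cluster's shard allocation. The following is a minimal sketch, not the procedure used during this investigation. It assumes the shard list was obtained from the Elasticsearch `_cat/shards?format=json` endpoint, whose rows include `index` and `node` fields. The function name, the node names, and the sample index names are hypothetical and for illustration only.

```python
def nodes_without_index(shards, all_nodes, index_prefix):
    """Return the nodes that host no shard of any index starting with index_prefix.

    shards: list of dicts shaped like rows of `_cat/shards?format=json`
            (only the "index" and "node" keys are used here).
    all_nodes: iterable of every node name in the cluster.
    """
    hosting = set()
    for row in shards:
        node = row.get("node")
        # Unassigned shards have no node; skip them.
        if node and row["index"].startswith(index_prefix):
            hosting.add(node)
    return sorted(set(all_nodes) - hosting)


# Hypothetical sample data, not taken from the eqiad cluster.
sample_nodes = ["es-node-a", "es-node-b", "es-node-c"]
sample_shards = [
    {"index": "enwiki_content", "shard": "0", "prirep": "p", "node": "es-node-a"},
    {"index": "enwiki_content", "shard": "0", "prirep": "r", "node": "es-node-b"},
    {"index": "frwiki_content", "shard": "0", "prirep": "p", "node": "es-node-c"},
    {"index": "enwiki_general", "shard": "1", "prirep": "r", "node": None},
]

print(nodes_without_index(sample_shards, sample_nodes, "enwiki"))
```

Comparing the returned list with the servers that showed no load increase would confirm or contradict the correlation described above.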