A cold restart of the cirrus elasticsearch eqiad cluster was done today, and had a few issues. During that time, logs collected by logstash dropped to almost 0.
The obvious link between those 2 elasticsearch clusters is the apifeature logging, which is sent to the cirrus cluster.
It seems strange that api feature would affect all logs. Maybe conenctions are timing out and consuming all logstash resources?
Related incident: https://wikitech.wikimedia.org/wiki/Incident_documentation/2017-09-20_Logstash