See T390215: Logstash is overwhelmed. We are having a lot of trouble with the sheer volume of logs being ingested by our Logstash infrastructure. A quick check shows that around half of all logs come from ML infra: https://logstash.wikimedia.org/goto/1a2483a4a3958f77ea6df119d7b16a22, which is currently emitting around 1,000,000 logs per minute.
Could this volume be reduced? For example, access logs could be sampled for any request that returns a 200.
There may also be a much simpler mitigation: the majority of the logs I'm seeing have an empty message and an empty log field, for example https://logstash.wikimedia.org/app/discover#/doc/logstash-*/logstash-ml-1-7.0.0-1-2026.02.03?id=Cii-JJwBVE0pYbVvI1fq
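To make the two suggestions concrete, here is a rough sketch of what they could look like as a standard-library `logging.Filter`. This assumes the emitting services are Python and use the `logging` module, which I have not verified. The class name `AccessLogSampler`, the `status` record attribute, and the 1% default sample rate are all hypothetical and only there for illustration. It also only checks the message, not a separate log field.

```python
import logging
import random


class AccessLogSampler(logging.Filter):
    """Drop records with an empty message and sample HTTP 200 access logs.

    Hypothetical sketch: assumes the access logger attaches the response
    code to each record as a `status` attribute (e.g. via `extra=`).
    """

    def __init__(self, sample_rate=0.01, rng=None):
        super().__init__()
        self.sample_rate = sample_rate      # fraction of 200s to keep
        self.rng = rng or random.Random()   # injectable for testing

    def filter(self, record):
        # Simpler mitigation: never emit records whose message is empty.
        if not record.getMessage().strip():
            return False
        # Sampling: keep only a fraction of successful (200) requests.
        if getattr(record, "status", None) == 200:
            return self.rng.random() < self.sample_rate
        # Everything else (errors, non-access logs) passes through untouched.
        return True
```

If something like this fits, it could be attached to the relevant handler with `handler.addFilter(AccessLogSampler(sample_rate=0.01))`, so that errors and non-200 responses are still logged in full.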
Thank you!




