It is my theory that web crawlers may indirectly impair the performance and stability of the replica databases, among other services.
What I think happens is that the community links to tools on-wiki, and web crawlers then follow those links. The problem is that many of these tools fire off long-running queries against the replicas.
XTools is a prime example. We maintain a long list of user agents for legitimate web crawlers that we block at the Apache level; see step #12 at https://wikitech.wikimedia.org/w/index.php?title=Tool:XTools#Building_a_new_instance (somewhat out of date). If we didn't do this, XTools would go down from hitting the max connection limit. I also found that these crawlers did not respect https://xtools.wmflabs.org/robots.txt.
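For illustration, here is a minimal sketch of what that kind of Apache-level blocking can look like using mod_rewrite. The bot names below are placeholders, not the actual list from the build instructions:

    # Refuse (403) any request whose User-Agent matches a known crawler.
    # These bot names are illustrative only; the real list is maintained
    # in the instance-build instructions and is much longer.
    RewriteEngine On
    RewriteCond %{HTTP_USER_AGENT} "(SemrushBot|AhrefsBot|MJ12bot|DotBot)" [NC]
    RewriteRule ^ - [F,L]

Blocking at the web server means these requests never reach the application, so they can't open database connections or tie up workers with long-running queries.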
We saw a similar issue with E-Book-Export-Reliability: significant traffic from crawlers was hogging resources, causing the tool to go down.
For similar reasons, I put Tool-global-search behind a login wall. The Cloud Elastic service was still experimental, and I didn't want crawlers and bots to impact its stability.
I can only assume other tools suffer from crawler traffic as well, which in turn puts unnecessary load on our infrastructure. In particular, I wonder how the health of the replicas would change if web crawlers were blocked across tools.