While talking about and looking at the dashboards today, particularly the ZRR on the Azerbaijani Wikipedia, an idea that came up was that it would be nice to see key metrics for "well-behaved searchers".
This is a heuristic we use when gathering data for analysis and manual tagging. The idea is to exclude not only bots, but also weirdos like the Discovery Search team—who are known to issue hundreds of queries in a day without clicking on any results—and other outliers. The four elements we've used, not all of which may be relevant to dashboards, are:
1. Query came from the search box on <wiki>.wikipedia.org
2. Exclude any IP that made more than 30 queries per day
3. Include not more than one query from any given IP for any given day
4. Only the <wiki>_content index was searched (except for wikis that search multiple indexes by default)
(1) and (2) seem reasonable for dashboards. (3) limits the input of every individual searcher to one query; for dashboards that could be relaxed to one session, or it might not be appropriate at all.
(4) might be hard to implement generically because some wikis search multiple indices by default, and maintaining info on this across all projects sounds like a pain.
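To make the discussion concrete, here is a rough Python sketch of how the four filters could be applied to a batch of search log events before computing ZRR (taken here to mean the share of queries returning zero results). It is only an illustration: the `SearchEvent` fields, the `"web"` source label, and the function names are all made up for this example and do not reflect the actual log schema. It also assumes the 30-query threshold in (2) counts every query from an IP on a given day, not just search-box queries, and it ignores the multi-index exception in (4).

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class SearchEvent:
    # Hypothetical record shape; the real logs will differ.
    ip: str
    day: str         # e.g. "2016-10-24"
    source: str      # assumed label: "web" for the on-wiki search box
    indices: tuple   # names of the indices that were searched
    hits: int        # number of results returned


MAX_QUERIES_PER_IP_PER_DAY = 30


def well_behaved(events, wiki, one_per_ip_day=False):
    """Return the events that pass filters (1), (2), (4) and optionally (3)."""
    content_index = f"{wiki}_content"
    # (2) Count every query per (IP, day) up front, so heavy users are
    # dropped entirely rather than truncated to their first 30 queries.
    per_ip_day = Counter((e.ip, e.day) for e in events)
    seen = set()
    kept = []
    for e in events:
        if e.source != "web":                                       # (1)
            continue
        if per_ip_day[(e.ip, e.day)] > MAX_QUERIES_PER_IP_PER_DAY:  # (2)
            continue
        if e.indices != (content_index,):                           # (4)
            continue
        if one_per_ip_day:                                          # (3)
            if (e.ip, e.day) in seen:
                continue
            seen.add((e.ip, e.day))
        kept.append(e)
    return kept


def zero_results_rate(events):
    """Fraction of events with zero hits, or None if there are no events."""
    if not events:
        return None
    return sum(1 for e in events if e.hits == 0) / len(events)
```

Making (3) an opt-in flag mirrors the open question above: a dashboard could show ZRR with and without it and see whether it changes the picture.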
From Oct 23-25, 2016, the ZRR on azwiki jumped from 30-35% to ~97% and has been holding steady since. @EBernhardson checked the database and it looks like there's probably a bot or other API user that's driving up the ZRR.