Our pageview pipeline labels as “user” traffic many requests that we know actually come from bots crawling our site. Because we cannot classify these requests as automated, our pageview stats (especially top pageviews) are distorted. At the time of this writing, bot requests are reported to be about 20% of traffic; in reality the share is probably quite a bit higher, as much as 5-8% higher overall per our research on this matter. This is the parent task to keep track of the work to deploy the "high volume bot spike" detection code.
The bot spikes we are after look like this: https://tools.wmflabs.org/pageviews/?project=en.wikipedia.org&platform=all-access&agent=user&start=2018-11&end=2019-10&pages=Line_shaft
They are sharp and large in terms of traffic.
Also see recent bot spikes on Hungarian Wikipedia: T237282
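
To make the "sharp and large" shape concrete, here is a minimal sketch of how such a spike could be flagged in a per-page daily pageview series. It is only an illustration and not the detection code this task deploys: the function name `find_spikes`, its parameters (`window`, `ratio`, `min_views`), their default values, and the sample numbers are all assumptions made up for the example.

```python
from statistics import median


def find_spikes(daily_views, window=7, ratio=5.0, min_views=1000):
    """Return the indices of days that look like sharp, large spikes.

    A day is flagged when its views are at least `min_views` and at least
    `ratio` times the median of the preceding `window` days.
    All thresholds are illustrative assumptions, not tuned values.
    """
    spikes = []
    for i in range(window, len(daily_views)):
        # The median of the trailing window is robust to an earlier spike day.
        baseline = median(daily_views[i - window:i])
        # max(baseline, 1) keeps the comparison meaningful for a zero baseline.
        if daily_views[i] >= min_views and daily_views[i] >= ratio * max(baseline, 1):
            spikes.append(i)
    return spikes


# Made-up daily counts for one page: a quiet baseline with a two-day burst.
views = [120, 130, 110, 125, 140, 118, 122, 9500, 10200, 131, 127]
print(find_spikes(views))  # [7, 8]
```

A trailing median is used here rather than a mean so that the first day of a spike does not hide the second one. A real classifier would more likely work on per-client request rates than on aggregated page totals.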