The recent A/B test for machine-learned ranking improved the average click position, but the clickthrough rate stayed relatively flat. This may suggest that while the machine learning is able to improve the ordering of the results we already have, there is an underlying recall issue preventing good results from being found at all. One way to investigate is to pull a list of the most popular queries that have high abandonment and look into what is going on with them.
This data can potentially be extracted by comparing the frequency of distinct queries in the click logs with the frequency of those same queries in the complete search logs. Some form of query normalization, like we do in mjolnir, may be helpful, but is perhaps too expensive to run on the full dataset of queries.
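
A rough sketch of that comparison, to make the idea concrete. Everything here is an assumption for illustration: the function names, the `min_searches` and `top_n` parameters, and the input shape (two iterables of raw query strings, where the click-side input has one entry per search that received at least one click). The `normalize` helper is a cheap stand-in and is not the normalization mjolnir does.

```python
from collections import Counter


def normalize(query):
    # Hypothetical cheap normalization: lowercase and collapse whitespace.
    # This is NOT the normalization mjolnir performs.
    return " ".join(query.lower().split())


def high_abandonment_queries(search_queries, clicked_queries,
                             min_searches=100, top_n=50):
    """Rank popular queries by abandonment rate.

    search_queries:  one raw query string per search (complete search logs).
    clicked_queries: one raw query string per search that got a click
                     (assumed shape of the click logs).
    """
    searches = Counter(normalize(q) for q in search_queries)
    clicks = Counter(normalize(q) for q in clicked_queries)

    rows = []
    for query, n_searches in searches.items():
        if n_searches < min_searches:
            continue  # only look at popular queries
        # Clamp in case the click side has more entries than searches.
        n_clicked = min(clicks[query], n_searches)
        abandonment = 1.0 - n_clicked / n_searches
        rows.append((query, n_searches, abandonment))

    # Highest abandonment first, ties broken by popularity.
    rows.sort(key=lambda r: (-r[2], -r[1]))
    return rows[:top_n]
```

On the full dataset the same counting would more likely be done as a grouped aggregation in whatever system holds the logs; the thresholds would need tuning against real query volume.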