We know from our Satisfaction schema that users are much more likely to click the second result than the third, but we don't know whether that is because the second result is actually better or because people gravitate toward the top results. A fairly simple A/B test swapping the second and third results should shed some light on how important the ranking of our top 3 search results is.
@mpopov a few ideas from a meeting I had yesterday with some external search people:
- We might want to test out swapping 2 & 3, and separately swapping 3 & 4, as an A/B/C test.
- The aggregated numbers might be relevant, but most interesting will be queries that appear in both the A and B sides. As such, it might be useful to use a larger sample size to ensure we have some overlap.
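
A rough sketch of how the overlap analysis could look, in Python. This is only an illustration: the `(bucket, query, clicked_position)` tuple layout and the function name `position_click_share` are made up here and do not reflect the actual Satisfaction schema fields. It keeps only queries seen in every bucket, then computes what share of clicks landed on positions 2 and 3 in each bucket.

```python
from collections import Counter, defaultdict


def position_click_share(events, positions=(2, 3)):
    """Compare click share by display position across test buckets.

    events: iterable of (bucket, query, clicked_position) tuples.
    This layout is hypothetical, not the real logging schema.
    Returns (shared_queries, {bucket: {position: share_of_clicks}}).
    """
    events = list(events)

    # Collect the set of distinct queries seen in each bucket.
    queries_by_bucket = defaultdict(set)
    for bucket, query, _ in events:
        queries_by_bucket[bucket].add(query)

    # Keep only queries that appear in every bucket.
    if queries_by_bucket:
        shared = set.intersection(*queries_by_bucket.values())
    else:
        shared = set()

    # Count clicks per position, restricted to the shared queries.
    clicks = defaultdict(Counter)
    for bucket, query, position in events:
        if query in shared:
            clicks[bucket][position] += 1

    # Convert counts to the share of each bucket's clicks.
    shares = {}
    for bucket, counts in clicks.items():
        total = sum(counts.values())
        shares[bucket] = {p: counts[p] / total for p in positions}
    return shared, shares


# Toy data: bucket "A" is control, bucket "B" has results 2 and 3 swapped.
events = [
    ("A", "cat", 2), ("A", "cat", 3), ("A", "dog", 2), ("A", "only_a", 1),
    ("B", "cat", 2), ("B", "dog", 3), ("B", "dog", 3), ("B", "only_b", 2),
]
shared, shares = position_click_share(events)
```

If the click share at position 2 stays roughly the same in the swapped bucket, that would point toward position bias; if it follows the result to position 3, that would point toward the result itself being better. The same function should work unchanged for an A/B/C setup, since it intersects the queries across however many buckets appear in the data.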