Analysis of the A/B test for Glent Method 0 (M0) has proven difficult because of the large variability in user behavior between the buckets, which is dominated by the performance of the phrase suggester's suggestions.
To make the impact of M0 (and, in the future, M1 and M2) easier to see, add "source" to the logging schema so that for any given suggestion shown to a user, we know which algorithm it came from. This will allow us to evaluate phrase suggester performance against M0 performance per query, rather than in the aggregate (where the phrase suggester currently dominates).
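
As a rough illustration of the kind of analysis a "source" field would make possible, here is a minimal sketch in Python. Everything other than the "source" field name is an assumption made for the example: the event layout, the "query" and "clicked" fields, and the source values "phrase_suggester" and "glent_m0" are hypothetical and do not reflect the actual logging schema.

```python
from collections import defaultdict

# Hypothetical suggestion events. Only the "source" field comes from this task;
# "query", "clicked", and the source values are assumptions for illustration.
events = [
    {"query": "glnt", "source": "phrase_suggester", "clicked": False},
    {"query": "glnt", "source": "glent_m0", "clicked": True},
    {"query": "serch", "source": "phrase_suggester", "clicked": True},
    {"query": "serch", "source": "phrase_suggester", "clicked": False},
    {"query": "serch", "source": "glent_m0", "clicked": True},
]


def click_rate(events, key):
    """Return the fraction of shown suggestions that were clicked, per group.

    `key` maps an event to the group it belongs to, e.g. its source,
    or a (query, source) pair for a per-query comparison.
    """
    shown = defaultdict(int)
    clicked = defaultdict(int)
    for event in events:
        group = key(event)
        shown[group] += 1
        if event["clicked"]:
            clicked[group] += 1
    return {group: clicked[group] / shown[group] for group in shown}


# Aggregate view: one rate per algorithm.
rates_by_source = click_rate(events, key=lambda e: e["source"])

# Per-query view: compare algorithms on the same query.
rates_by_query_and_source = click_rate(
    events, key=lambda e: (e["query"], e["source"])
)
```

Grouping on the (query, source) pair is what lets the two algorithms be compared on the same query, instead of being mixed together in a per-bucket aggregate.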