
Run Null A/B test for DYM suggestions
Closed, Resolved, Public

Description

Analysis of the A/B test for Glent Method 0 (M0) has proven difficult because user behavior varies widely between the buckets, and that variability is dominated by the performance of the phrase suggester's suggestions.

To get a better idea of the baseline variability of the phrase suggester (which appears to be highly erratic, and which dominates the composite suggestion performance), use the control bucket data from the current A/B test to generate a bootstrapped estimate of the variance of the key DYM metrics, as well as their range of common values.
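As a rough illustration of the bootstrapped estimate described above, here is a minimal sketch. It is not the analysis code used for this task. It assumes each control-bucket session has been reduced to a 0/1 outcome for some DYM metric (for example, whether a suggestion was clicked); the function names `bootstrap_metric` and `null_ab_differences`, the choice of metric, and the resample counts are all hypothetical.

```python
import random
import statistics


def bootstrap_metric(outcomes, n_resamples=1000, seed=0):
    """Bootstrap the mean of per-session 0/1 outcomes from the control bucket.

    `outcomes` is an assumed input shape: one 0/1 value per session.
    Returns the bootstrap mean, the variance of the resampled means,
    and an approximate 95% percentile interval.
    """
    rng = random.Random(seed)
    n = len(outcomes)
    means = []
    for _ in range(n_resamples):
        # Resample sessions with replacement, keeping the sample size fixed.
        sample = rng.choices(outcomes, k=n)
        means.append(sum(sample) / n)
    means.sort()
    lo = means[round(0.025 * (n_resamples - 1))]
    hi = means[round(0.975 * (n_resamples - 1))]
    return {
        "mean": statistics.fmean(means),
        "variance": statistics.variance(means),
        "ci95": (lo, hi),
    }


def null_ab_differences(outcomes, n_splits=1000, seed=0):
    """Simulate a null A/B test on control data only.

    Repeatedly split the control sessions at random into two pseudo-buckets
    and record the difference in the metric. Both halves received the same
    treatment, so the spread of these differences reflects noise alone.
    """
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_splits):
        shuffled = list(outcomes)
        rng.shuffle(shuffled)
        half = len(shuffled) // 2
        a, b = shuffled[:half], shuffled[half:]
        diffs.append(sum(a) / len(a) - sum(b) / len(b))
    return diffs
```

The spread of the values returned by `null_ab_differences` gives a sense of how large a between-bucket difference can arise by chance, which is the baseline that a real difference between test buckets would need to exceed.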

(If the variability is high and we are feeling ambitious, we could also discuss a follow-up task to try to find the outlying sources of high variability, if any—such as web page–scraping bots or shared IP addresses—and filter them.)

Event Timeline

dcausse triaged this task as Medium priority. Jun 22 2020, 7:59 AM
dcausse moved this task from needs triage to ML & Data Pipeline on the Discovery-Search board.
TJones claimed this task.

Closing this because we have already done some reporting and dealt with some of the difficulties there. We don't really need the null A/B report anymore.