From what it sounds like, we need to work out whether, in a week, we could collect enough unique users to analyze a test with 3 test buckets. It may have to wait until we can replicate the logic.
What are we measuring?
- If we're measuring the nonzero results rate (our baseline is around 80%) and 81% is the minimum we'd be satisfied with, then we need 41,932 per group: http://www.evanmiller.org/ab-testing/sample-size.html#!80;95;5;1;0
- But if we're measuring clickthrough rate (CTR) – which I think we are, using the SearchSatisfaction schema rather than Cirrus logs – then our baseline is 15%, and if we want to see at least 17% CTR in a test group, it's 8,484 per group: http://www.evanmiller.org/ab-testing/sample-size.html#!15;95;5;2;0
- Alternatively, if we want to see at least a 5% relative increase in CTR (going from a baseline of 15% to >15.75%), then it's 59,455 per group: http://www.evanmiller.org/ab-testing/sample-size.html#!15;95;5;5;1
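As a sanity check on the calculator's numbers, the same per-group sample sizes can be approximated in base R with `power.prop.test()`. Note this is a normal-approximation sketch, not the calculator's exact method, so the results come out close to (but not identical with) Evan Miller's figures:

```r
# Per-group sample size for a two-proportion test,
# 95% power and 5% significance, matching the calculator's settings.

# CTR: baseline 15%, minimum detectable 17% (absolute)
ctr_absolute <- power.prop.test(p1 = 0.15, p2 = 0.17,
                                sig.level = 0.05, power = 0.95)
ceiling(ctr_absolute$n)  # per-group sample size, roughly 8-9K

# CTR: 5% relative increase (15% -> 15.75%)
ctr_relative <- power.prop.test(p1 = 0.15, p2 = 0.1575,
                                sig.level = 0.05, power = 0.95)
ceiling(ctr_relative$n)  # per-group sample size, roughly 60K
```

The small discrepancies versus the linked calculator come down to different approximation choices (pooled vs. unpooled variance, continuity corrections), not a disagreement about the underlying power analysis.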
If you're talking about the test I think you're talking about, our population of interest is a small subset of the overall population (very few queries actually fit the inclusion criteria for some of the tests we run). I also recall the SearchSatisfaction schema having a low sampling rate, so between that and the narrow inclusion criteria, it seems we won't have enough people. On the upside, that means we shouldn't see problems on the dashboards, since the percentage of people affected will be tiny.
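To make the "we won't have enough people" point concrete, here is a back-of-envelope feasibility check. Every rate below is a made-up placeholder (I don't know the actual sampling rate or eligibility fraction), so treat this as a template to plug real numbers into, not a result:

```r
# Back-of-envelope: how many days to reach a per-group target, given how
# many eligible users the schema captures per day? All rates below are
# hypothetical placeholders.
users_per_day     <- 4000    # hypothetical: users captured by the schema daily
eligible_fraction <- 0.10    # hypothetical: share whose queries fit the test's criteria
n_per_group       <- 8484    # from the 15% -> 17% CTR calculation
n_groups          <- 3       # three test buckets

days_needed <- ceiling((n_per_group * n_groups) / (users_per_day * eligible_fraction))
days_needed  # 64 days under these assumptions, far more than the week we have
```

Even generous guesses for the placeholder rates put the required run time well past a week, which is the concern above in quantitative form.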
Also, I'm making a task for myself to review and implement Bayesian categorical data analysis methods. I should note that the Bayesian approach is still somewhat dependent on sample size, but less in a "we lack the power to detect effects with too little data, while too much data yields statistical significance despite a tiny observed difference" way, and more in a "too little data makes the results overly influenced by the choice of prior distribution(s), so we need to be really careful about our choice of prior(s)" way.
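The prior-sensitivity point can be illustrated with a minimal conjugate Beta-Binomial sketch of CTR (the specific counts and priors here are invented for illustration): with a Beta(a, b) prior and observed clicks/non-clicks, the posterior is Beta(a + clicks, b + non-clicks), so small samples leave the posterior pulled toward the prior while large samples swamp it.

```r
# Posterior mean of CTR under a Beta(a, b) prior:
# posterior is Beta(a + clicks, b + (n - clicks)).
posterior_mean <- function(clicks, n, a, b) (a + clicks) / (a + b + n)

# Two priors: weakly informative vs. one (wrongly) concentrated near 50% CTR
# Small sample, 15 clicks out of 100 searches: the priors disagree a lot
posterior_mean(15, 100, 1, 1)     # ~0.157
posterior_mean(15, 100, 50, 50)   # ~0.325

# Large sample, 1500 clicks out of 10000: the data swamps both priors
posterior_mean(1500, 10000, 1, 1)    # ~0.150
posterior_mean(1500, 10000, 50, 50)  # ~0.153
```

That gap in the small-sample case (0.157 vs. 0.325 for the same data) is exactly the "results too influenced by the choice of prior" failure mode, and why prior choice needs care at our likely sample sizes.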
## Get Data

```r
# library(RMySQL)
# con <- dbConnect(drv = MySQL(), host = "analytics-store.eqiad.wmnet",
#                  dbname = "log", default.file = "/etc/mysql/conf.d/research-client.cnf")
# satisfaction_users <- wmf::mysql_read("SELECT date, COUNT(*) AS users_per_day
#   FROM (
#     SELECT
#       DATE_FORMAT(timestamp, '%Y-%m-%d') AS date,
#       CONCAT(clientIp, userAgent) AS user_id,
#       COUNT(*) AS events
#     FROM TestSearchSatisfaction2_13223897
#     GROUP BY date, user_id
#   ) AS events_per_user
#   GROUP BY date;", "log", con)
# dbDisconnect(con)
# readr::write_csv(satisfaction_users, "~/SearchSatisfactionDaily.csv")
satisfaction_users <- readr::read_csv("~/Documents/Data/SearchSatisfactionDaily.csv")

library(ggplot2)
ggplot(data = satisfaction_users) +
  geom_line(aes(x = date, y = users_per_day), size = 1.1) +
  ggtitle("Users captured with SearchSatisfaction schema per day") +
  scale_y_continuous(name = "Users", breaks = scales::pretty_breaks(n = 10)) +
  wmf::theme_fivethirtynine()
```