
Automated Analytics - store sample variances in experiment results
Closed, ResolvedPublic3 Estimated Story Points

Description

Right now only the sample means are stored (for the control and treatment groups). This is fine for proportion-type metrics, where the variance is just mean * (1 - mean), but it does not work for mean-type metrics, where the sample variance can only be calculated from the underlying data.

We need sample variances for planning future experiments and doing power analysis, which depends on the baseline and the baseline's variance. Crucially, baseline measurements might not be available outside of experiments (especially if the metric requires edge unique-based units).
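To illustrate why the variance is the missing ingredient for power analysis: the standard two-sample sample-size formula takes the baseline variance directly, and for a proportion metric that variance can be derived from the mean alone (which is why proportions don't strictly need the new column). A minimal sketch with conventional alpha/power defaults assumed, not the actual planning code:

```python
import math
from statistics import NormalDist


def n_per_group(baseline_variance, mde, alpha=0.05, power=0.8):
    """Approximate sample size per group for a two-sample test of means,
    assuming equal variances and equal allocation to control/treatment."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return math.ceil(2 * (z_alpha + z_beta) ** 2 * baseline_variance / mde ** 2)


# For a proportion metric the baseline variance falls out of the mean alone:
baseline_rate = 0.3
n = n_per_group(baseline_rate * (1 - baseline_rate), mde=0.02)
```

For a mean-type metric there is no such shortcut: `baseline_variance` has to come from stored results, hence this task.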

NOTE: It is technically possible to recover the sample variance from the quantities we do store (by working backwards from the stored confidence interval bounds), but storing it directly would be substantially easier.
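For the record, the back-out mentioned above could look roughly like this, assuming a large-sample (z-based) interval and equal per-group variances; both assumptions are mine, not guaranteed by the job:

```python
import math
from statistics import NormalDist


def backout_pooled_variance(ci_lower, ci_upper, n_control, n_treatment, conf=0.95):
    """Recover a pooled sample variance from a stored confidence interval on
    the difference in means. Assumes a z-based interval and equal variances."""
    z = NormalDist().inv_cdf(0.5 + conf / 2)  # ~1.96 for a 95% CI
    se = (ci_upper - ci_lower) / (2 * z)      # CI half-width / critical value
    # Under the equal-variance assumption, SE^2 = var * (1/n_c + 1/n_t)
    return se ** 2 / (1 / n_control + 1 / n_treatment)
```

With a t-based interval the stored degrees_of_freedom would replace the z critical value, which is part of why storing the variance directly is the easier path.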

Acceptance criteria

  • wmf_experiments.experiment_results_v1 has a new sample_variances struct column with control and treatment properties, just like the sample_means column
  • Analytics job stores the sample variances in this column
    • Source code updated
    • Conda artifact generated and deployed
    • Airflow DAG updated to use new artifact

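The job-side change might reduce to something like the following (a hypothetical plain-Python sketch; the real job's data structures and function names are not shown in this task):

```python
from statistics import variance


def sample_variances(control_values, treatment_values):
    """Per-group sample variances (n - 1 denominator), shaped like the new
    sample_variances struct<control, treatment> column."""
    return {
        "control": variance(control_values) if len(control_values) > 1 else None,
        "treatment": variance(treatment_values) if len(treatment_values) > 1 else None,
    }
```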
Notes

ALTER TABLE wmf_experiments.experiment_results_v1
ADD COLUMN sample_variances struct<control: double, treatment: double> AFTER sample_means;

UPDATE wmf_experiments.experiment_results_v1
SET sample_variances = STRUCT(NULL, NULL) WHERE sample_variances IS NULL;

Event Timeline

mpopov triaged this task as Medium priority. Sep 17 2025, 3:10 PM
mpopov set the point value for this task to 3.

Updated the table in preparation for the next run of the experiment analysis job:

> DESCRIBE wmf_experiments.experiment_results_v1;
col_name	data_type	comment
dt	timestamp	When the experiment results were generated
experiment_name	string	Experiment owner-provided name of experiment
experiment_machine_name	string	Machine-readable experiment name, matchable with experiment_configs table
metric_name	string	
metric_type	string	Type of metric, used in reporting and formatting. Values: "Mean" or "Proportion"
difference_type	string	Difference types. Values: "Relative" or "Absolute"
user_auth_segment	string	For user segmentation by authentication status. Values: "Everyone", "Logged-in only", "Logged-out only"
sample_sizes	struct<control:bigint,treatment:bigint>	
sample_means	struct<control:double,treatment:double>	
sample_variances	struct<control:double,treatment:double>	
health_results	struct<p_value:double>	
bayesian_results	struct<estimate:double,variance:double,chance_to_win:double,loss:double,risk:double,credible_interval_95:struct<lower:double,upper:double>>	
frequentist_results	struct<estimate:double,standard_error:double,statistic:double,degrees_of_freedom:double,p_value:double,confidence_interval_95:struct<lower:double,upper:double>>	
		
# Partitioning		
Not partitioned

Started getting errors in Superset:

Database error
Failed to execute query 'SELECT *
FROM experiment_results_v1': 'NoneType' object is not iterable

After the grouping toggle experiment was re-analyzed with the updated job, only that experiment's results were OK.

Comparing the empty values for the grouping toggle experiment with the empty values of other experiments reveals the problem:

select experiment_machine_name, sample_variances
from experiment_results_v1
where experiment_machine_name in ('fy24-25-we-1-7-rc-grouping-toggle', 'we-3-2-3-donate-ab-test-1')
order by experiment_machine_name;
experiment_machine_name	sample_variances
fy24-25-we-1-7-rc-grouping-toggle	{"control":null,"treatment":null}
we-3-2-3-donate-ab-test-1	NULL

As bearloga:

USE bearloga;
CREATE TABLE experiment_results_v1_backup
USING ICEBERG
TBLPROPERTIES (
  'format-version' = '2',
  'write.delete.mode' = 'copy-on-write',
  'write.parquet.compression-codec' = 'zstd'
)
AS SELECT * FROM wmf_experiments.experiment_results_v1;

-- To test:
-- UPDATE experiment_results_v1_backup
-- SET sample_variances = STRUCT(NULL, NULL) WHERE sample_variances IS NULL;

As analytics-product:

To fix:

UPDATE wmf_experiments.experiment_results_v1
SET sample_variances = STRUCT(NULL, NULL) WHERE sample_variances IS NULL;

Resolved!


Scratch notes

Setup

USE bearloga;

CREATE TABLE update_test (
  `first_first` STRING,
  `last_name` STRING
)
USING ICEBERG
TBLPROPERTIES (
  'format-version' = '2',
  'write.delete.mode' = 'copy-on-write',
  'write.parquet.compression-codec' = 'zstd'
);

INSERT INTO update_test VALUES ('Peter', 'Parker'), ('Bruce', 'Wayne');

ALTER TABLE update_test
ADD COLUMN superhero struct<alias: string, powers: string>;

INSERT INTO update_test VALUES ('James', 'Howlett', NAMED_STRUCT('alias', 'Wolverine', 'powers', 'regeneration'));

Testing:

SELECT * FROM bearloga.update_test
Database error
Failed to execute query 'SELECT *
FROM bearloga.update_test
': 'NoneType' object is not iterable

Because the actual result set is:

first_first	last_name	superhero
Peter	Parker	NULL
Bruce	Wayne	NULL
James	Howlett	{"alias":"Wolverine","powers":"regeneration"}
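The error message is consistent with client code trying to iterate a struct cell that came back as a top-level NULL (Python None) rather than as a struct of nulls. A guess at the failure mode, not Superset's actual code path:

```python
# Rows as a result consumer might see them: the struct cell is either a
# mapping or, for the pre-existing rows, a bare None.
rows = [
    {"first_first": "Peter", "last_name": "Parker", "superhero": None},
    {"first_first": "James", "last_name": "Howlett",
     "superhero": {"alias": "Wolverine", "powers": "regeneration"}},
]

# Anything that treats the struct cell as a mapping blows up on the NULL row:
try:
    rendered = [dict(row["superhero"]) for row in rows]
except TypeError as err:
    message = str(err)  # 'NoneType' object is not iterable
```

A struct of nulls, by contrast, is still a mapping and iterates fine, which matches the fix below.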

Testing the fix

Okay, let's try updating to a struct of nulls:

UPDATE update_test
SET superhero = STRUCT(NULL, NULL) WHERE superhero IS NULL;

SELECT * gives us:

first_first	last_name	superhero
James	Howlett	{"alias":"Wolverine","powers":"regeneration"}
Peter	Parker	{"alias":null,"powers":null}
Bruce	Wayne	{"alias":null,"powers":null}

And Superset/SQL Lab gives us SUCCESS!

Cleanup

DROP TABLE update_test;
hdfs dfs -rm -R /user/hive/warehouse/bearloga.db/update_test