Document desired properties of an enrollment sampling algorithm
Closed, ResolvedPublic
Authored By

	nettrom_WMF
	Aug 8 2024, 8:57 PM

Description

@phuedx and @nettrom_WMF met to discuss approaches to sampling users for experiment enrollment. We reviewed the GrowthExperiments approach as well as PageSplitterInstrumentation.php and mediawiki.experiments.js.

From our conversation, we drafted the following preferred properties:

Does not require a backend store. Example: GrowthExperiments stores group assignment in the user_properties table.
Can sample on a variety of levels such as page, session, user.
Will sample consistently if given the same starting value (e.g. if we're sampling on page ID, the same page ID will always return the same assignment).
Scales to sample when needed. For example, we can sample when a user visits a specific page and thereby not assign groups to users who never visited that page.
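
To make these properties concrete, here is a minimal sketch of one common way to satisfy them: hashing an identifier into a number and comparing it against the sampling rate. It is not the implementation discussed in the meeting. The function names, the experiment name, and the choice of SHA-256 are all assumptions made for illustration.

```python
import hashlib


def unit_interval(experiment: str, unit_id: str) -> float:
    """Deterministically map (experiment, unit_id) to a number in [0, 1)."""
    digest = hashlib.sha256(f"{experiment}:{unit_id}".encode("utf-8")).digest()
    # Keep the top 53 bits so the result is an exact float strictly below 1.0.
    return (int.from_bytes(digest[:8], "big") >> 11) / 2**53


def is_in_sample(experiment: str, unit_id: str, sampling_rate: float) -> bool:
    """True if this unit falls inside the sampled fraction of the experiment."""
    return unit_interval(experiment, unit_id) < sampling_rate


# The unit can be a page ID, a session token, or a user ID; nothing is stored,
# and the check only runs when the unit actually shows up.
print(is_in_sample("hypothetical-experiment", "page:12345", 0.15))
```

Because the decision is computed from the identifier alone, it needs no backend store, works at any level that has an identifier, returns the same answer for the same starting value, and is only evaluated at the point where sampling is needed.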

There might be aspects related to this that we did not capture, or where the descriptions can be improved. We're therefore looking for @mpopov to review and provide input.

Related Objects

		Status	Subtype	Assigned	Task
		Resolved		Sgs	T374471 Decide which bucketing/variant assignment system should we use
		Resolved		mpopov	T372108 Document desired properties of an enrollment sampling algorithm

Event Timeline

nettrom_WMF created this task. Aug 8 2024, 8:57 PM

mpopov triaged this task as Medium priority. Aug 8 2024, 9:06 PM

mpopov edited projects, added Product-Analytics (Kanban); removed Product-Analytics.

mpopov moved this task from Next 2 weeks to Doing on the Product-Analytics (Kanban) board.

Will sample consistently if given the same starting value (e.g. if we're sampling on page ID, the same page ID will always return the same assignment).

I want to note that we want to ensure consistency of assignment within an experiment, not across experiments. For example, if I remember right, in the past some teams have run experiments where they assigned users to control/treatment based on whether the user ID in the MW database is odd or even. With that scheme, a given user lands in the same group in every experiment.

If we run 2 experiments sampled on page ID, each with 2 groups (control and treatment), the same page ID should always return the same assignment within each experiment, but the assignments should be independent across experiments. When the determination happens, these combinations should be equally likely (assuming equal sampling rates):

experiment 1	experiment 2
control	control
control	treatment
treatment	control
treatment	treatment

NOT

experiment 1	experiment 2
control	control
treatment	treatment
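
A small sketch of the difference, under assumptions: the first function mixes a (hypothetical) experiment name into a SHA-256 hash of the identifier, the second uses ID parity as in the odd/even example. Neither is taken from an existing codebase; names and the choice of hash are illustrative only.

```python
import hashlib
from collections import Counter

GROUPS = ("control", "treatment")


def assign_group(experiment: str, unit_id: str) -> str:
    """Pick a group from a hash of the experiment name and the identifier."""
    digest = hashlib.sha256(f"{experiment}:{unit_id}".encode("utf-8")).digest()
    return GROUPS[int.from_bytes(digest[:8], "big") % len(GROUPS)]


def parity_group(unit_id: int) -> str:
    """Odd/even assignment: the same for every experiment."""
    return GROUPS[unit_id % 2]


ids = range(10_000)

# Salted hash: all four combinations occur, each for roughly a quarter of IDs.
hashed_pairs = Counter(
    (assign_group("experiment-1", str(i)), assign_group("experiment-2", str(i)))
    for i in ids
)

# Parity: only (control, control) and (treatment, treatment) ever occur.
parity_pairs = Counter((parity_group(i), parity_group(i)) for i in ids)

print(hashed_pairs)
print(parity_pairs)
```

Including the experiment name in the hashed value is what decouples the two experiments while keeping each one consistent for a given page ID.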

Can sample on a variety of levels such as page, session, user.

Scales to sample when needed. For example, we can sample when a user visits a specific page and thereby not assign groups to users who never visited that page.

Does not require a backend store. Example: GrowthExperiments stores group assignment in the user_properties table.

Essentially, the algorithm must operate on some identifier to determine group assignment rather than retrieving the assignment from an external source. That makes sense, but if we make the determination from the starting value every time we need it, then we can't ever modify the experiment's settings for the duration of the experiment.

Suppose an editor is enrolled in an experiment with the following settings:

15% of editors who use the feature
50/50 split between variant A and variant B

The editor gets assigned to variant B and we store this assignment in a session cookie for easy reference (rather than calculating it fresh on every page load).

The editor closes the browser and starts a new session. While they were away, we modified the experiment's settings to:

10% of editors who use the feature
60/40 split between variant A and variant B

The editor did not use their browser's "restore previous session" feature so their session cookies were cleared out. We now need to make the determination again. Uh oh.

First, the probability that this editor is going to stay enrolled in the experiment has decreased. Second, if the editor is once again enrolled in the experiment they are now more likely to be assigned to variant A.

So the only way that we ensure consistency of assignment output sans backend store is by locking all the inputs.
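
The scenario above can be simulated with a short sketch. It assumes a hash-based decision function with one hash for enrollment and a second for the A/B split; the function names, the experiment name, and the token format are hypothetical and only serve to illustrate the effect of changing the settings mid-experiment.

```python
import hashlib
from typing import Optional


def _uniform(*parts: str) -> float:
    """Deterministically map the given strings to a number in [0, 1)."""
    digest = hashlib.sha256(":".join(parts).encode("utf-8")).digest()
    return (int.from_bytes(digest[:8], "big") >> 11) / 2**53


def decide(experiment: str, token: str, sampling_rate: float, share_a: float) -> Optional[str]:
    """Return "A", "B", or None (not enrolled), computed from the inputs alone."""
    if _uniform(experiment, "enrol", token) >= sampling_rate:
        return None
    return "A" if _uniform(experiment, "group", token) < share_a else "B"


tokens = [f"editor-{i}" for i in range(10_000)]

# Original settings: 15% enrolled, 50/50 split.
before = {t: decide("hypothetical-experiment", t, 0.15, 0.50) for t in tokens}
# Modified settings: 10% enrolled, 60/40 split.
after = {t: decide("hypothetical-experiment", t, 0.10, 0.60) for t in tokens}

changed = [t for t in tokens if before[t] != after[t]]
dropped = [t for t in changed if after[t] is None]
flipped = [t for t in changed if before[t] == "B" and after[t] == "A"]
print(len(changed), len(dropped), len(flipped))
```

With unchanged settings every token gets the same decision on every call. Once the settings change, some tokens drop out of the experiment and some move from variant B to variant A, which is exactly the inconsistency described above.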

mpopov moved this task from Doing to Needs Review on the Product-Analytics (Kanban) board. Aug 16 2024, 7:40 PM

VirginiaPoundstone moved this task from Incoming to Data products Sprint 18 on the Experimentation Lab board. Aug 19 2024, 3:31 PM

VirginiaPoundstone edited projects, added Experimentation Lab (Data products Sprint 18); removed Experimentation Lab.

VirginiaPoundstone moved this task from Sprint Backlog to Radar on the Experimentation Lab (Data products Sprint 18) board.

WDoranWMF moved this task from Radar to Code Review / Tech Input on the Experimentation Lab (Data products Sprint 18) board. Aug 19 2024, 4:30 PM

So the only way that we ensure consistency of assignment output sans backend store is by locking all the inputs.

I think this is implicit in your comment but I want to make it explicit:

Let's assume that we have a backing store that can handle one row per session (~200M rows) with a lookup per pageview (~6000/s) so that we can store all assignments for all users. We're still limited to storing a logged-out user's token (session or otherwise) on the client, be it in a cookie, sessionStorage, or localStorage. If that client-side store is cleared out for any reason, then they are effectively a new user with a new assignment.

So the only way that we ensure consistency of assignment output sans backend store is by locking all the inputs.

Yes. This is also true if you have no control over the lifetime of the user token.

WDoranWMF moved this task from Code Review / Tech Input to Paused on the Experimentation Lab (Data products Sprint 18) board. Aug 28 2024, 4:10 PM

VirginiaPoundstone added a project: WMF-SDS 2 Sprinthackular 2024. Sep 5 2024, 5:35 PM

VirginiaPoundstone edited projects, added Experimentation Lab (Data Products Sprint 19); removed Experimentation Lab (Data products Sprint 18). Sep 5 2024, 7:05 PM

VirginiaPoundstone moved this task from Sprint Backlog to Paused on the Experimentation Lab (Data Products Sprint 19) board.

phuedx moved this task from Backlog to In Process on the WMF-SDS 2 Sprinthackular 2024 board. Sep 10 2024, 4:15 PM

phuedx added a parent task: T374471: Decide which bucketing/variant assignment system should we use. Sep 11 2024, 9:11 AM

Just checked with @phuedx and we're aligned on the terminology:

By "inputs" we both mean the inputs/parameters going into the algorithm/function that deterministically outputs a decision. Those inputs include a token and configuration variables like sampling rate.

By "locking" we both mean not allowing those inputs to change, to the best of our ability. That is, preventing modification of the experiment's configuration, because otherwise the same token with a different sampling rate could produce a different decision than before.

@phuedx asked "Is locking all of the inputs acceptable?"

Yes, and in the absence of a memory it is also necessary.

I think the only way we could allow modifying an in-progress experiment is if we maintained a record of decisions (whether a given token was selected to be in the experiment and which of the experimental groups it was assigned to).

Without maintaining such a record, we want to ensure that we consistently make the same decision for the same token. This requires locking all of the variables that go into producing a decision.

Therefore, it should only be possible to modify an experiment's sampling rate before it has started. (But extending an in-progress experiment's end date should be okay, as it would not be an input.)
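
One way to picture this rule is a configuration object whose decision inputs can no longer be changed once the start date has passed, while the end date stays editable. This is only a sketch under assumptions: the class, the field names, and the update_config helper are hypothetical and do not describe how any existing tool stores experiment configuration.

```python
from dataclasses import dataclass, replace
from datetime import date


@dataclass(frozen=True)
class ExperimentConfig:
    name: str
    sampling_rate: float
    share_a: float
    start: date
    end: date


# Fields that feed into the enrollment/assignment decision (assumed set).
LOCKED_FIELDS = {"name", "sampling_rate", "share_a"}


def update_config(config: ExperimentConfig, today: date, **changes) -> ExperimentConfig:
    """Return an updated copy, refusing changes to decision inputs after the start."""
    if today >= config.start:
        locked = sorted(LOCKED_FIELDS.intersection(changes))
        if locked:
            raise ValueError(f"experiment has started; cannot modify: {locked}")
    return replace(config, **changes)


config = ExperimentConfig("hypothetical-experiment", 0.15, 0.50, date(2024, 9, 1), date(2024, 9, 30))

# Before the start date the sampling rate may still change.
config = update_config(config, date(2024, 8, 20), sampling_rate=0.10)
# After the start date only non-input fields, such as the end date, may change.
config = update_config(config, date(2024, 9, 10), end=date(2024, 10, 15))
```

The end date can be extended because it is not an input to the decision; it only controls how long the same decisions keep being made.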

I've tried to capture our discussion in https://wikitech.wikimedia.org/w/index.php?title=Metrics_Platform%2FSampling#Experiment_Enrolment_Sampling (the page was moved from Sampling Units). Please review the section and be bold if you spot a mistake.

phuedx moved this task from In Process to Needs Code Review on the WMF-SDS 2 Sprinthackular 2024 board. Sep 13 2024, 9:58 AM

VirginiaPoundstone moved this task from Needs Code Review to Ready for Sign Off on the WMF-SDS 2 Sprinthackular 2024 board. Sep 13 2024, 4:09 PM

I've taken a look at it and couldn't find anything to add or change. Moving it to "done". Maybe @mpopov wants to have the pleasure of closing this?

VirginiaPoundstone moved this task from Paused to Done on the Experimentation Lab (Data Products Sprint 19) board. Sep 18 2024, 4:06 PM

Looks great, thank you @phuedx!
