The scope of this task is to decide between the alternatives for assigning feature variants to users.
Context: the current GrowthExperiments experiment tooling is only set up for newly created accounts (variant assignment is done in the onLocalUserCreated hook, which stores the user property growthexperiments-homepage-variant). For the CommunityUpdates module experiment we also want to target existing accounts, but as things stand in GE that is a no-go because of the impact it could have on the user_properties table (T54777 user_properties table bloat). The ultimate goal is to re-use tooling provided by the Metrics Platform (T373406 Add experiment enrollment functionality to the Metrics Platform extension and T368163). However, that work is still in progress and might not be ready in time for the Sprinthackular week.
Option 0: continue using the existing variant assignment system in GrowthExperiments, which will create user property rows (around 12K; see the sample bounds in T369908: Estimate Community Updates module experiment sample size and the Sample Size estimate doc). Before running the experiment, the Growth team will remove old, unnecessary rows of the property 'growthexperiments-homepage-variant' on the experiment target wikis, arwiki and eswiki. Removing all “control” rows and setting this value as the user default should free far more rows than the experiment will create. See the eswiki numbers:
wikiadmin2023@10.64.0.47(eswiki)> select count(*) from user_properties where up_property='growthexperiments-homepage-variant';
+----------+
| count(*) |
+----------+
|   544967 |
+----------+
1 row in set (0.562 sec)

wikiadmin2023@10.64.0.47(eswiki)> select count(*) from user_properties where up_property='growthexperiments-homepage-variant' and up_value='control';
+----------+
| count(*) |
+----------+
|   464608 |
+----------+
1 row in set (13.518 sec)
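As a rough check of the claim that the cleanup frees more rows than the experiment creates, the eswiki counts above can be combined with the "around 12K" estimate. This is only back-of-the-envelope arithmetic: it assumes the full 12K estimate lands on eswiki alone, which is a deliberately pessimistic assumption and not something the sample size estimate states.

```python
# Counts copied from the eswiki queries above.
total_rows = 544_967     # all growthexperiments-homepage-variant rows
control_rows = 464_608   # rows whose value is 'control'

# Assumption: the "around 12K" estimate is applied entirely to eswiki.
estimated_new_rows = 12_000

rows_after_cleanup = total_rows - control_rows
rows_after_experiment = rows_after_cleanup + estimated_new_rows
net_rows_freed = control_rows - estimated_new_rows

print(rows_after_cleanup)     # 80359
print(rows_after_experiment)  # 92359
print(net_rows_freed)         # 452608
```

Even under that assumption, the table would end up with roughly 450K fewer rows for this property on eswiki than it has today.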
Option 1: make use of $wgConditionalUserOptions and add a new condition like CUDCOND_BUCKET, which would statically and randomly divide users into variants based on their ID (a user would be considered to be in variant Y if user_id % X == Y, where X and Y are parameters). This would prevent new rows from being created unless the value is changed for a user after assignment.
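The bucketing rule in Option 1 can be sketched as follows. This is a Python illustration of the user_id % X == Y check only, not the MediaWiki implementation; the function names and the "treatment" variant name are hypothetical, and the real condition would live in the conditional user options evaluation code.

```python
def in_bucket(user_id: int, modulus: int, remainder: int) -> bool:
    """Return True if user_id % modulus == remainder (X = modulus, Y = remainder)."""
    if modulus <= 0 or not 0 <= remainder < modulus:
        raise ValueError("need modulus > 0 and 0 <= remainder < modulus")
    return user_id % modulus == remainder


def assign_variant_by_modulo(user_id: int, variants: list[str]) -> str:
    """Pick the variant whose index equals user_id % len(variants)."""
    for index, variant in enumerate(variants):
        if in_bucket(user_id, len(variants), index):
            return variant
    raise AssertionError("unreachable: every user_id matches exactly one bucket")


# Hypothetical variant names; only 'control' appears in the task description.
VARIANTS = ["control", "treatment"]
print(assign_variant_by_modulo(10, VARIANTS))  # control
print(assign_variant_by_modulo(11, VARIANTS))  # treatment
```

Because the result depends only on the user ID and the parameters, nothing needs to be stored for a user unless their value is explicitly overridden.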
Option 2: avoid storing anything in the user_properties table by assigning feature variants with the tooling provided in T373406: Add experiment enrollment functionality to the Metrics Platform extension. Similar to Option 1, the user_id is used to create a uniformly distributed hash, which is transformed into a number used for bucketing. The trade-off of not storing the assigned variant is that it needs to be re-computed on each page render. However, the computation is cheap.
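A minimal sketch of the hash-based idea in Option 2, in Python. It does not reproduce the Metrics Platform algorithm from T373406; the choice of SHA-256, the experiment-name salt, the function name and the "treatment" variant are all assumptions made for illustration.

```python
import hashlib


def assign_variant_by_hash(user_id: int, experiment: str, variants: list[str]) -> str:
    """Deterministically map a user to a variant without storing anything.

    Assumption: the experiment name is mixed into the hash so that different
    experiments bucket the same user independently.
    """
    if not variants:
        raise ValueError("variants must not be empty")
    key = f"{experiment}:{user_id}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    # Interpret the first 8 bytes as an unsigned integer in [0, 2**64).
    number = int.from_bytes(digest[:8], "big")
    # The modulo bias is negligible because 2**64 is far larger than len(variants).
    return variants[number % len(variants)]


# Hypothetical experiment and variant names.
VARIANTS = ["control", "treatment"]
first = assign_variant_by_hash(12345, "community-updates", VARIANTS)
second = assign_variant_by_hash(12345, "community-updates", VARIANTS)
print(first == second)  # True
```

Recomputing the variant this way costs one hash per call, which is why doing it on every page render is considered cheap.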