Create A/B test experiment for leveling up notifications
Closed, Resolved | Public | 5 Estimated Story Points

Description

As part of the xLab exploration, we want to set up an A/B test experiment that presents different messaging for the Get Started notification based on group assignment. The control group will get the old copies introduced in 2023, and the treatment group will get the ones from T400118. Target audiences are also updated; see details in the parent task.

Acceptance criteria

Open questions

  • How long should the experiment last? - The PM will decide when to start/stop the experiment once it is ready for prod wikis.
  • What's the alternate copy for the treatment group? - The control group will get the 2023 copies and audiences, and the treatment group the 2025 ones.
  • Is it suitable to use shared instruments for long-lived metrics and experiments?

Event Timeline

Change #1176438 had a related patch set uploaded (by Sergio Gimeno; author: Sergio Gimeno):

[mediawiki/extensions/GrowthExperiments@master] [WIP] PoC: setup experiment for Get Started notification

https://gerrit.wikimedia.org/r/1176438

Change #1176500 had a related patch set uploaded (by Sergio Gimeno; author: Sergio Gimeno):

[mediawiki/extensions/GrowthExperiments@master] PoC: should different notification based on xLab experiment groups

https://gerrit.wikimedia.org/r/1176500

I've been able to show different notification texts in 1176500. It needs a bit more work and testing, but we could consider releasing T400118: Personalized 48 hour notifications for newcomers: Get Started, Re-engage, and Keep Going in the form of an experiment with xLab if that doesn't delay the WE1 roadmap much. cc @KStoller-WMF

Change #1177978 had a related patch set uploaded (by Hashar; author: Cyndywikime):

[integration/config@master] Zuul: add MetricsPlatform as dependency of GrowthExperiments

https://gerrit.wikimedia.org/r/1177978

Change #1177978 merged by jenkins-bot:

[integration/config@master] Zuul: add MetricsPlatform as dependency of GrowthExperiments

https://gerrit.wikimedia.org/r/1177978

KStoller-WMF moved this task from Inbox to Up Next (estimated tasks) on the Growth-Team board.
KStoller-WMF set the point value for this task to 5. (Aug 25 2025, 4:25 PM)
Sgs renamed this task from Create A/B test for Get started notification to Create A/B test experiment for leveling up notifications. (Aug 26 2025, 2:48 PM)

Change #1075286 had a related patch set uploaded (by Sergio Gimeno; author: Sergio Gimeno):

[mediawiki/extensions/GrowthExperiments@master] refactor: introduce AbstractExperimentManager

https://gerrit.wikimedia.org/r/1075286

Change #1075285 had a related patch set uploaded (by Sergio Gimeno; author: Sergio Gimeno):

[mediawiki/extensions/GrowthExperiments@master] Introduce flag to use MetricsPlatform extension

https://gerrit.wikimedia.org/r/1075285

Change #1184824 had a related patch set uploaded (by Sergio Gimeno; author: Sergio Gimeno):

[mediawiki/extensions/WikimediaEvents@master] [Poc] ClickThroughRateInstrument: add .create() factory method

https://gerrit.wikimedia.org/r/1184824

Change #1075286 merged by jenkins-bot:

[mediawiki/extensions/GrowthExperiments@master] refactor: introduce AbstractExperimentManager

https://gerrit.wikimedia.org/r/1075286

I've been discussing with @phuedx the right steps to test experiment enrollment for the specific feature of notifications. Here's a summary to clarify the enrollment expectations from Experimentation_Lab/Conduct_an_experiment#Test_and_debug. There are three enrollment authorities that xLab will prompt for user enrollment before MW initialization; the result of each enrollment may or may not be overridden by the next authority. They are summarized in the table below:

Enrollment authority | Target audience | Enrollment source
EveryoneExperimentsEnrollmentAuthority | anon and logged in users | X-Experiment-Enrollments
LoggedInExperimentsEnrollmentAuthority | logged in users | Central ID
OverridesEnrollmentAuthority | anon and logged in users | Query parameter ?mpo=experiment:group or mw.xLab.overrideExperimentGroup( 'experiment', 'group' );

For the notifications use case, GrowthExperiments schedules the notification at account creation time (onLocalUserCreated hook). The user is anon when submitting the create account form, so unless an X-Experiment-Enrollments header is added, the first authority to issue a variant will be the "logged in" one. For testing purposes we could use the override enrollment authority; however, only the cookie/JS mechanism will work, since the mpo query parameter won't persist across the auth flow redirections. That should be fixed in T404622.

tl;dr: the most reliable mechanism we have to override a user's experiment variant and test the new (treatment) notifications is to call mw.xLab.overrideExperimentGroup( 'growthexperiments-get-started-notification', 'treatment' ).
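
To make the reasoning above easier to follow, here is a minimal Python sketch of the three authorities being consulted in order. It is a simplified model, not the MetricsPlatform implementation. The Request class, the "experiment:group" value format assumed for the header and the cookie, the hash-based bucketing and the "a later result replaces an earlier one" rule are all assumptions made for illustration.

```python
import hashlib
from dataclasses import dataclass, replace
from typing import Optional

EXPERIMENT = "growthexperiments-get-started-notification"
GROUPS = ("control", "treatment")


@dataclass(frozen=True)
class Request:
    """Hypothetical, simplified view of a request at enrollment time."""
    enrollment_header: Optional[str] = None  # X-Experiment-Enrollments (format assumed)
    central_id: Optional[int] = None         # None while the user is still anon
    mpo_query: Optional[str] = None          # ?mpo=experiment:group
    override_cookie: Optional[str] = None    # set via mw.xLab.overrideExperimentGroup() (format assumed)


def parse_pair(value: Optional[str], experiment: str) -> Optional[str]:
    """Return the group from an 'experiment:group' string, or None if it does not match."""
    if not value or ":" not in value:
        return None
    name, group = value.split(":", 1)
    return group if name == experiment else None


def everyone_authority(req: Request, experiment: str) -> Optional[str]:
    return parse_pair(req.enrollment_header, experiment)


def logged_in_authority(req: Request, experiment: str) -> Optional[str]:
    if req.central_id is None:
        return None
    # Assumption: deterministic hash bucketing; the real algorithm is not described in this task.
    digest = hashlib.sha256(f"{experiment}:{req.central_id}".encode()).digest()
    return GROUPS[digest[0] % len(GROUPS)]


def overrides_authority(req: Request, experiment: str) -> Optional[str]:
    return parse_pair(req.mpo_query, experiment) or parse_pair(req.override_cookie, experiment)


def enroll(req: Request, experiment: str = EXPERIMENT) -> Optional[str]:
    """Ask each authority in order; a later non-empty result replaces an earlier one."""
    group = None
    for authority in (everyone_authority, logged_in_authority, overrides_authority):
        result = authority(req, experiment)
        if result is not None:
            group = result
    return group


def after_redirect(req: Request) -> Request:
    """Model an auth-flow redirect: the query string is dropped, cookies survive."""
    return replace(req, mpo_query=None)
```

In this model an anon request carrying only ?mpo=... loses its override after after_redirect(), while one carrying the override cookie keeps it, which is why the JS call is the more reliable way to test.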

Change #1075285 merged by jenkins-bot:

[mediawiki/extensions/GrowthExperiments@master] experiments: introduce ExperimentXLabManager

https://gerrit.wikimedia.org/r/1075285

Change #1176500 merged by jenkins-bot:

[mediawiki/extensions/GrowthExperiments@master] LevelingUpManager: show different notification based on experiment groups

https://gerrit.wikimedia.org/r/1176500

Change #1176438 merged by jenkins-bot:

[mediawiki/extensions/GrowthExperiments@master] LevelingUp: setup AB test for notifications

https://gerrit.wikimedia.org/r/1176438

Change #1189521 had a related patch set uploaded (by Sergio Gimeno; author: Sergio Gimeno):

[mediawiki/extensions/MetricsPlatform@master] LoggedInExperimentsEnrollmentAuthority: make central ID retrieval more reliable

https://gerrit.wikimedia.org/r/1189521

Change #1189522 had a related patch set uploaded (by Sergio Gimeno; author: Sergio Gimeno):

[mediawiki/extensions/MetricsPlatform@master] ExperimentManager: allow to override an enrollment result

https://gerrit.wikimedia.org/r/1189522

Change #1189527 had a related patch set uploaded (by Sergio Gimeno; author: Sergio Gimeno):

[mediawiki/extensions/GrowthExperiments@master] ExperimentXLabManager: allow to re-enroll a user in experiments

https://gerrit.wikimedia.org/r/1189527

Change #1189521 merged by jenkins-bot:

[mediawiki/extensions/MetricsPlatform@master] LoggedInExperimentsEnrollmentAuthority: make central ID retrieval more reliable

https://gerrit.wikimedia.org/r/1189521

Change #1189522 merged by jenkins-bot:

[mediawiki/extensions/MetricsPlatform@master] ExperimentManager: allow to override an enrollment result

https://gerrit.wikimedia.org/r/1189522

Change #1189527 merged by jenkins-bot:

[mediawiki/extensions/GrowthExperiments@master] ExperimentXLabManager: allow to re-enroll a user in experiments

https://gerrit.wikimedia.org/r/1189527

Change #1190698 had a related patch set uploaded (by Sergio Gimeno; author: Sergio Gimeno):

[mediawiki/extensions/GrowthExperiments@wmf/1.45.0-wmf.20] ExperimentXLabManager: allow to re-enroll a user in experiments

https://gerrit.wikimedia.org/r/1190698

Change #1190698 merged by jenkins-bot:

[mediawiki/extensions/GrowthExperiments@wmf/1.45.0-wmf.20] ExperimentXLabManager: allow to re-enroll a user in experiments

https://gerrit.wikimedia.org/r/1190698

Mentioned in SAL (#wikimedia-operations) [2025-09-25T07:36:49Z] <jforrester@deploy1003> Started scap sync-world: Backport for [[gerrit:1190698|ExperimentXLabManager: allow to re-enroll a user in experiments (T401308)]]

Mentioned in SAL (#wikimedia-operations) [2025-09-25T07:42:42Z] <jforrester@deploy1003> jforrester, sgimeno: Backport for [[gerrit:1190698|ExperimentXLabManager: allow to re-enroll a user in experiments (T401308)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.

Mentioned in SAL (#wikimedia-operations) [2025-09-25T07:54:52Z] <jforrester@deploy1003> Finished scap sync-world: Backport for [[gerrit:1190698|ExperimentXLabManager: allow to re-enroll a user in experiments (T401308)]] (duration: 18m 03s)

Status notes

  • The experiment has been live since September 29th, 22:30 UTC+2. The automated experiment analytics are available in xLab Experiment Analytics by applying the filter "Leveling up notifications".
    • Data corruption: the data from the first 48h of the experiment are corrupted. Notification impressions were attributed to users from both the control and treatment groups, but both groups were still receiving the old (control) notification.
  • Changes made in Metrics Platform:
    • To make the enrollment of subjects in the notifications experiment possible, GrowthExperiments is overriding the enrollment result that xLab produces in the onBeforeInitialize hook (see source). This is because, in an authentication flow, onBeforeInitialize is called earlier in the request lifecycle than the point where the local user is created and considered registered, so xLab only tries experiment enrollment against EveryoneExperimentsEnrollmentAuthority and OverridesEnrollmentAuthority. As a temporary solution to this problem, GrowthExperiments is re-enrolling users from onLocalUserCreatedHook, where the user object is already considered registered, and writing the overridden enrollment result into MediaWiki\Extension\MetricsPlatform\XLab\ExperimentManager and wgMetricsPlatformUserExperiments. This means that extensions implementing onLocalUserCreatedHook and requesting the user's assigned group from there would get different results depending on whether they are loaded before or after GrowthExperiments. This is bad. To address this issue, Growth filed T405074: xLab: Allow user re-enrollment at specific times as a feature request, so this use case can be analyzed in more depth by the Experimentation Platform team.
    • Related to the above, retrieving a central ID from onLocalUserCreatedHook has proved problematic in the past (T379682, T380500). In 1189521: LoggedInExperimentsEnrollmentAuthority: make central ID retrieval more reliable we changed the ID retrieval method to centralIdFromName, which does not depend on a user object being correctly attached to a central account.
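
The load-order problem described in the notes above can be illustrated with a small Python sketch. Everything in it is hypothetical: the ExperimentManager stand-in, the handler functions and the hard-coded "treatment" group only model the behaviour and are not the MetricsPlatform or GrowthExperiments code.

```python
EXPERIMENT = "growthexperiments-get-started-notification"


class ExperimentManager:
    """Hypothetical stand-in for the per-request enrollment state."""

    def __init__(self):
        # experiment name -> assigned group; a missing key means "not enrolled"
        self._groups = {}

    def get_group(self, experiment):
        return self._groups.get(experiment)

    def override_enrollment(self, experiment, group):
        self._groups[experiment] = group


def growth_experiments_handler(manager, seen):
    # Re-enroll now that the user counts as registered, then read the group.
    # "treatment" is hard-coded here purely for illustration.
    manager.override_enrollment(EXPERIMENT, "treatment")
    seen["GrowthExperiments"] = manager.get_group(EXPERIMENT)


def other_extension_handler(manager, seen):
    # Any other extension that reads the assigned group from the same hook.
    seen["OtherExtension"] = manager.get_group(EXPERIMENT)


def run_local_user_created(handlers):
    """Run the handlers in load order against a manager that starts unenrolled."""
    manager = ExperimentManager()  # the earlier hook ran while the user was still anon
    seen = {}
    for handler in handlers:
        handler(manager, seen)
    return seen
```

Running the handlers with GrowthExperiments first gives both extensions "treatment"; with the other extension first, it sees no enrollment at all while GrowthExperiments still sees "treatment".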
Edtadros subscribed.

This task’s QA is covered under T400118 (Personalized 48h notifications), which includes the variant experiments (T404085, T401308) and the core delivery flows.

Changing tags because this was covered under SDS2.1.6 GrowthExperiments xLab adoption, so should be considered OKR work not essential work.

Given that I can see results from the A/B test in Superset, I think we can consider this resolved.

The follow-up task is: T407431: Growth's "48 hour" newcomer notifications: end A/B test experiment & release changes