
Deploy config change to start the Reference Check A/B Test (en.wiki)
Closed, Resolved (Public)

Description

Deployment timing

Wednesday, 5 November 2025 per plan posted at en.wiki.

Bucketing criteria

Bucketing Requirements:

  • Bucketing should include both registered and unregistered users at all identified "Participating wikis" below.
  • All users who are editing a main namespace (NS:0) page on desktop or mobile at any of the participating wikis should have a 50% chance of being bucketed into either the A/B test's control group or its treatment group.
  • Bucketing should be done on a per-Wikipedia basis: within a given wiki, 50% of people should be placed in the control group and 50% in the treatment group.
  • The test (treatment) group should have the Reference Check experience enabled while the control group should not.
  • People should remain in the same test group for the duration of the test (and across sessions and pages).
NOTE: We need to specify the experiences that will be available in both the test and the control groups. For example, will all wikis in the control group have access to Multi-Check (References)?
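
One way to meet the 50/50 and "same group across sessions and pages" requirements is deterministic, hash-based assignment. The sketch below is illustrative only and is not the VisualEditor implementation: `assign_bucket`, the `user_token` argument and the SHA-256 scheme are all assumptions. Only the bucket label prefix comes from this task (it matches the labels reported in the verification comment).

```python
import hashlib

# Label prefix as reported in the verification data on this task.
TEST_NAME = "2025-09-editcheck-addReference"


def assign_bucket(wiki: str, user_token: str, test_name: str = TEST_NAME) -> str:
    """Map a (wiki, user token) pair to a bucket label (hypothetical helper).

    The same inputs always hash to the same value, so a user stays in one
    group for the whole test as long as their token is stable.
    """
    key = f"{test_name}:{wiki}:{user_token}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    # Interpret the first 8 bytes as an unsigned integer; parity picks the group.
    value = int.from_bytes(digest[:8], "big")
    group = "control" if value % 2 == 0 else "test"
    return f"{test_name}-{group}"


if __name__ == "__main__":
    # Repeated calls with the same token return the same bucket.
    print(assign_bucket("enwiki", "example-token"))
    print(assign_bucket("enwiki", "example-token"))
```

Including the wiki in the hash key keeps the split independent per wiki, which is one reading of the per-Wikipedia requirement above.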

Instrumentation-Related Requirements

  • A bucket is applied to these events so that events logged for the control group can be distinguished from events logged for the test group within the A/B test.
    • The bucket name should describe the test, in case there are overlapping A/B tests.
  • The anonymous_user_token field is populated for unregistered users in the test.
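
The two instrumentation requirements above can be expressed as a per-event check. This is a minimal sketch under stated assumptions: the event is modelled as a plain dict, `is_registered` is a hypothetical field used only for illustration, and `validate_event` is not part of any Wikimedia tooling. The `bucket` and `anonymous_user_token` names and the two bucket labels are taken from this task.

```python
# Bucket labels as reported in the verification comment on this task.
VALID_BUCKETS = {
    "2025-09-editcheck-addReference-control",
    "2025-09-editcheck-addReference-test",
}


def validate_event(event: dict) -> list:
    """Return a list of problems found in one logged event (hypothetical helper)."""
    problems = []
    if event.get("bucket") not in VALID_BUCKETS:
        problems.append("bucket missing or not one of the expected labels")
    # `is_registered` is an assumed field; real schemas may encode this differently.
    if not event.get("is_registered", False) and not event.get("anonymous_user_token"):
        problems.append("anonymous_user_token missing for unregistered user")
    return problems


if __name__ == "__main__":
    sample = {
        "bucket": "2025-09-editcheck-addReference-test",
        "is_registered": False,
        "anonymous_user_token": "example-token",
    }
    print(validate_event(sample))
```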

Participating wikis

Wiki       Status                Notes
en.wiki    Discussion ongoing    Waiting for consensus

Event Timeline

ppelberg set Due Date to Oct 15 2025, 7:00 AM. (Oct 1 2025, 9:42 PM)
ppelberg updated the task description.
ppelberg moved this task from Inbox to Ready to Be Worked On on the Editing-team (Kanban Board) board.
ppelberg updated the task description.
ppelberg moved this task from Ready to Be Worked On to Doing on the Editing-team (Kanban Board) board.
ppelberg moved this task from Doing to Ready to Be Worked On on the Editing-team (Kanban Board) board.

Change #1202300 had a related patch set uploaded (by DLynch; author: DLynch):

[mediawiki/extensions/VisualEditor@master] Edit check: allow any check to be an a/b test including default ones

https://gerrit.wikimedia.org/r/1202300

Change #1202301 had a related patch set uploaded (by DLynch; author: DLynch):

[operations/mediawiki-config@master] Enable editcheck addReference a/b test on enwiki

https://gerrit.wikimedia.org/r/1202301

Change #1202300 merged by jenkins-bot:

[mediawiki/extensions/VisualEditor@master] Edit check: allow any check to be an a/b test including default ones

https://gerrit.wikimedia.org/r/1202300

Change #1202331 had a related patch set uploaded (by DLynch; author: DLynch):

[mediawiki/extensions/VisualEditor@wmf/1.46.0-wmf.1] Edit check: allow any check to be an a/b test including default ones

https://gerrit.wikimedia.org/r/1202331

Change #1202301 merged by jenkins-bot:

[operations/mediawiki-config@master] Enable editcheck addReference a/b test on enwiki

https://gerrit.wikimedia.org/r/1202301

Change #1202331 merged by jenkins-bot:

[mediawiki/extensions/VisualEditor@wmf/1.46.0-wmf.1] Edit check: allow any check to be an a/b test including default ones

https://gerrit.wikimedia.org/r/1202331

Mentioned in SAL (#wikimedia-operations) [2025-11-06T22:20:49Z] <kemayo@deploy2002> Started scap sync-world: Backport for [[gerrit:1202331|Edit check: allow any check to be an a/b test including default ones (T406134)]], [[gerrit:1202301|Enable editcheck addReference a/b test on enwiki (T406134)]]

Mentioned in SAL (#wikimedia-operations) [2025-11-06T22:24:55Z] <kemayo@deploy2002> kemayo: Backport for [[gerrit:1202331|Edit check: allow any check to be an a/b test including default ones (T406134)]], [[gerrit:1202301|Enable editcheck addReference a/b test on enwiki (T406134)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.

Mentioned in SAL (#wikimedia-operations) [2025-11-06T22:34:41Z] <kemayo@deploy2002> Finished scap sync-world: Backport for [[gerrit:1202331|Edit check: allow any check to be an a/b test including default ones (T406134)]], [[gerrit:1202301|Enable editcheck addReference a/b test on enwiki (T406134)]] (duration: 13m 52s)

The a/b test is now enabled on enwiki. It has also been configured by enwiki admins at https://en.wikipedia.org/wiki/MediaWiki:Editcheck-config.json -- currently only so that the check does not trigger in the lead section and various other sections.

Ryasmeen moved this task from QA to Ready for Sign Off on the Editing-team (Kanban Board) board.
Ryasmeen subscribed.

Checked the following in en.wiki:

  1. Bucketing is including both registered and unregistered users.
  2. Checked for users who are editing a desktop or mobile main namespace page (NS:0).
  3. The test group has the Add Reference Check experience enabled while the control group does not.
  4. People remain in the same test group for the duration of the test (and across sessions and pages).
  5. Verified that anonymous_user_token field is populated for unregistered users in editAttemptStep on desktop.
ppelberg claimed this task.
ppelberg reassigned this task from ppelberg to MNeisler.

Per offline discussion with Megan today, final step here is to verify buckets are balanced (server-side).

@ppelberg I've verified the buckets are balanced and confirmed that AB test data is logging as expected based on the bucketing requirements defined in the task description.

Summary of Checks

  • The total number of editing sessions and users assigned to each test group are balanced based on a 50/50 split.

Number of Users and Editing Sessions Assigned to Each Bucket

test_group                                n_users    n_sessions
2025-09-editcheck-addReference-control    190228     542644
2025-09-editcheck-addReference-test       190672     563285
  • The event.bucket field is populated correctly with either 2025-09-editcheck-addReference-test or 2025-09-editcheck-addReference-control to indicate test assignments.
  • AB events are logged on both desktop and mobile web.
  • Both registered and unregistered users are included in the AB test.
  • The anonymous_user_token field is populated for unregistered users in the AB test.
  • Each user is assigned to only one test group.
  • Reference check engagement events are tracked in VEFU (e.g. action-keep and check-shown-presave) and only logged for the test group as expected.
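
As a rough illustration of the balance check, the sketch below computes each group's share of users and sessions and a chi-square statistic against a 50/50 split for users. It is not the query or notebook used for the verification above. It assumes the flattened counts table reads as 190228 users / 542644 sessions for control and 190672 users / 563285 sessions for test, and it assumes the user is the bucketing unit, so the chi-square test is applied to users only (sessions from the same user are not independent).

```python
# Counts as read from the table in this comment (see assumption in the lead-in).
counts = {
    "control": {"n_users": 190228, "n_sessions": 542644},
    "test": {"n_users": 190672, "n_sessions": 563285},
}

# Critical value of the chi-square distribution, 1 degree of freedom, alpha = 0.05.
CHI2_CRITICAL_DF1 = 3.841


def shares(metric: str) -> dict:
    """Fraction of the total for `metric` that falls in each group."""
    total = sum(group[metric] for group in counts.values())
    return {name: group[metric] / total for name, group in counts.items()}


def chi_square_5050(a: int, b: int) -> float:
    """Chi-square goodness-of-fit statistic for two counts against a 50/50 split."""
    expected = (a + b) / 2
    return ((a - expected) ** 2 + (b - expected) ** 2) / expected


user_shares = shares("n_users")
session_shares = shares("n_sessions")
user_chi2 = chi_square_5050(counts["control"]["n_users"], counts["test"]["n_users"])

if __name__ == "__main__":
    print("user shares:", user_shares)
    print("session shares:", session_shares)
    print("user chi-square:", round(user_chi2, 2), "critical:", CHI2_CRITICAL_DF1)
```

Under those assumptions the user-level statistic is about 0.52, well below 3.841, which is consistent with a 50/50 user split.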

cc @Iflorez