[A/B Test] Verify Tone Check AB Test buckets are balanced
Closed, Resolved (Public)

Description

Following deployment of the Tone Check AB Test in T387918, we will need to confirm that AB test events are logging as expected and people are being bucketed into the control and test groups in the proportions we expect.

Requirements:

  • Confirm that the bucket field is populated.
  • Confirm that the anonymous_user_token field is populated for unregistered users included in the test.
  • Confirm that users are bucketed into the control and test groups in the expected proportions, based on a 50/50 split.

Overview of AB test instrumentation and events:

  • AB test data will be logged in EditAttemptStep.
  • Test assignments are indicated by the event.bucket field.
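
To see which values event.bucket actually takes, a quick tally along these lines may help. This is a minimal sketch that reuses the table and partition columns from the balance query in this task; the date and wiki are the same placeholders and need updating, and a NULL row in the output would indicate events logged without a bucket (which is expected for sessions outside the test).

```sql
-- Sketch only: tally events per bucket value, including NULL (no bucket).
SELECT
  event.bucket AS test_group,
  COUNT(1) AS n_events
FROM
  event.editattemptstep
WHERE
  -- placeholder date: update based on date AB test was deployed
  year = 2025
  AND month = 5
  AND day = 18
  -- placeholder wiki: update with partner wikis
  AND event.wiki IN ('eswiki')
GROUP BY
  event.bucket
ORDER BY
  n_events DESC
```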

[To be further populated with bucketing requirements and instrumentation once defined]

Query to confirm bucket balance [can be run on Superset; adjust to reflect dates of deployment and any other bucketing conditions]:

SELECT
  wiki,
  test_group,
  COUNT(DISTINCT editing_session) AS n_sessions,
  -- DISTINCT so that a user with several sessions is counted once
  COUNT(DISTINCT user_id) AS n_user_ids
FROM (
  -- one row per wiki, editing session, user and bucket
  SELECT
    event.wiki AS wiki,
    event.editing_session_id AS editing_session,
    -- registered users: wiki-prefixed user_id; unregistered users: anonymous token
    IF(event.user_id != 0, concat(cast(event.wiki AS VARCHAR), '-', cast(event.user_id AS VARCHAR)), event.anonymous_user_token) AS user_id,
    event.bucket AS test_group,
    COUNT(1) AS n_events
  FROM
    event.editattemptstep
  WHERE
    -- update based on date AB test was deployed
    year = 2025
    AND month = 5
    AND day = 18
    -- update based on bucket names
    AND event.bucket IN ('[INSERT TEST NAME]-control', '[INSERT TEST NAME]-test')
    -- update with partner wikis
    AND event.wiki IN ('eswiki')
  GROUP BY
    event.wiki,
    event.editing_session_id,
    IF(event.user_id != 0, concat(cast(event.wiki AS VARCHAR), '-', cast(event.user_id AS VARCHAR)), event.anonymous_user_token),
    event.bucket
)
GROUP BY
  test_group,
  wiki
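
The balance query does not cover the requirement that anonymous_user_token is populated for unregistered users. A possible companion check is sketched below. It assumes, as the balance query implies, that event.user_id = 0 identifies unregistered users, and it treats both NULL and empty-string tokens as missing; the date, bucket names and wiki are the same placeholders as above.

```sql
-- Sketch only: count unregistered-user events in the test that lack a token.
SELECT
  event.wiki AS wiki,
  event.bucket AS test_group,
  COUNT(1) AS n_unregistered_events,
  COUNT_IF(
    event.anonymous_user_token IS NULL
    OR event.anonymous_user_token = ''
  ) AS n_missing_token
FROM
  event.editattemptstep
WHERE
  -- placeholder date: update based on date AB test was deployed
  year = 2025
  AND month = 5
  AND day = 18
  -- placeholder bucket names
  AND event.bucket IN ('[INSERT TEST NAME]-control', '[INSERT TEST NAME]-test')
  -- placeholder wiki: update with partner wikis
  AND event.wiki IN ('eswiki')
  -- assumption: user_id = 0 means the user is unregistered
  AND event.user_id = 0
GROUP BY
  event.wiki,
  event.bucket
```

If the field is populated as required, n_missing_token should be 0 for every row.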

Event Timeline

MNeisler updated the task description.
MNeisler triaged this task as Medium priority. May 22 2025, 9:24 PM
MNeisler edited projects, added Product-Analytics (Kanban); removed Product-Analytics.

I reviewed the Tone Check AB test events logged in EditAttemptStep since the test was deployed on 3 September 2025 at jawiki, frwiki and ptwiki to confirm that data is logging as expected based on the bucketing requirements defined in T389231.

Summary of Passed Checks

  • We are logging AB test assignments at all partner wikis (jawiki, frwiki, and ptwiki).
  • Bucket assignments are correctly labeled either 2025-09-editcheck-tone-control or 2025-09-editcheck-tone-test.
  • The total numbers of editing sessions and users per test group appear as expected based on a 50/50 split.
    test_group                      n_users  n_sessions
    2025-09-editcheck-tone-control  196166   313282
    2025-09-editcheck-tone-test     195754   307945
  • Buckets also appear balanced for each wiki.
  • Both registered and unregistered users are included in the experiment.
  • The anonymous_user_token field is populated for all unregistered users in the AB test.
  • Each user is assigned to only one test group.
  • AB test assignments are recorded for both mobile and desktop editing sessions.
NOTE: There is an issue (T403745) that is currently impacting the logging of tone check engagement events for users in the AB test. As a result, we are not currently logging any events where feature = editCheck-tone in VisualEditorFeatureUse besides the save-before-check-finalized event. I'll plan to run a quick re-check of these events after a fix has been implemented to confirm that all AB test-related events are being logged correctly.
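
For anyone re-running the review, the check that each user is assigned to only one test group could be reproduced with a query along these lines. This is a sketch, not necessarily the query used for the checks above: it reuses the user identifier from the query in the task description, and the partition filter (September 2025, from day 3 onward) is an assumption based on the deployment date and would need widening for later months.

```sql
-- Sketch only: list users who appear in more than one test group.
SELECT
  user_id,
  COUNT(DISTINCT test_group) AS n_groups
FROM (
  SELECT
    IF(event.user_id != 0, concat(cast(event.wiki AS VARCHAR), '-', cast(event.user_id AS VARCHAR)), event.anonymous_user_token) AS user_id,
    event.bucket AS test_group
  FROM
    event.editattemptstep
  WHERE
    -- assumption: covers 3 September 2025 through the end of that month
    year = 2025
    AND month = 9
    AND day >= 3
    AND event.bucket IN ('2025-09-editcheck-tone-control', '2025-09-editcheck-tone-test')
    AND event.wiki IN ('jawiki', 'frwiki', 'ptwiki')
)
GROUP BY
  user_id
HAVING
  COUNT(DISTINCT test_group) > 1
```

An empty result would be consistent with each user being assigned to a single group.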

cc @ppelberg

@MNeisler

> I reviewed the Tone Check AB test events logged in EditAttemptStep since the test was deployed on 3 September 2025 at jawiki, frwiki and ptwiki to confirm that data is logging as expected based on the bucketing requirements defined in T389231.

Excellent

> NOTE: There is an issue (T403745) that is currently impacting the logging of tone check engagement events for users in the AB test. As a result, we are not currently logging any events where feature = editCheck-tone in VisualEditorFeatureUse besides the save-before-check-finalized event. I'll plan to run a quick re-check of these events after a fix has been implemented to confirm that all AB test-related events are being logged correctly.

Noted. Let's consider this ticket "Resolved" and verify the issue you named above is addressed through the QA of T403745, per T403745#11159433