Make config change to start New Discussion Tool A/B Test
Closed, Resolved, Public

Description

This task is about turning on the New Discussion Tool A/B test at the wikis listed under "Participating wikis" below.

Deployment timing

Deployment date: 27 January 2022

❗️Editing Engineering: once this patch is merged, can you please ping @MNeisler in a comment and assign this task over to her to verify people are being bucketed in the ways we defined in T291307.

Participating wikis

This section will be considered finalized once T291306 is resolved.

ID    Wiki                        Code
1.    Amharic Wikipedia           amwiki
2.    Bengali Wikipedia           bnwiki
3.    Chinese Wikipedia           zhwiki
4.    Dutch Wikipedia             nlwiki
5.    Egyptian Arabic Wikipedia   arzwiki
6.    French Wikipedia            frwiki
7.    Hebrew Wikipedia            hewiki
8.    Hindi Wikipedia             hiwiki
9.    Indonesian Wikipedia        idwiki
10.   Italian Wikipedia           itwiki
11.   Japanese Wikipedia          jawiki
12.   Korean Wikipedia            kowiki
13.   Oromo Wikipedia             omwiki
14.   Persian Wikipedia           fawiki
15.   Polish Wikipedia            plwiki
16.   Portuguese Wikipedia        ptwiki
17.   Spanish Wikipedia           eswiki
18.   Thai Wikipedia              thwiki
19.   Ukrainian Wikipedia         ukwiki
20.   Vietnamese Wikipedia        viwiki

Testing instructions

Done

  • All "Open questions" are answered
  • Define "Deployment timing"
  • @MNeisler Verify people at the "Participating wikis" are being bucketed in the ways we defined in T291307
  • Verify the New Discussion Tool is working as expected

Event Timeline

I've updated the task description with the question @DLynch + @Whatamidoing-WMF surfaced in yesterday's team meeting:

=== Open questions

  • 1. What – if any – impact would starting the A/B test on different wikis have on: A) the test results and B) the effort required to compute the test results?
ppelberg added a subscriber: MNeisler.

Note: I've updated the "Deployment timing" section of this task based on the conversation @MNeisler and I had today (17-November).

ppelberg updated the task description.
ppelberg set Due Date to Dec 16 2021, 6:00 PM.

Meta
I've added the provisional deployment date (Thursday, 16 December) to the task description.

A/B Start Date
As discussed during today's team meeting, we plan for the A/B test of the New Discussion Tool to start today, 27-Jan-2022.

I've updated the task description to reflect the above.

Change 757730 had a related patch set uploaded (by DLynch; author: DLynch):

[operations/mediawiki-config@master] Launch DiscussionTools new topic tool a/b test

https://gerrit.wikimedia.org/r/757730

Change 757730 merged by jenkins-bot:

[operations/mediawiki-config@master] Launch DiscussionTools new topic tool a/b test

https://gerrit.wikimedia.org/r/757730

Mentioned in SAL (#wikimedia-operations) [2022-01-27T19:38:52Z] <urbanecm@deploy1002> Synchronized wmf-config/InitialiseSettings.php: 2c8561c1c0aa6b4f5f8202972b7b28723337e88e: Launch DiscussionTools new topic tool a/b test (T291308) (duration: 00m 51s)

@MNeisler Verify people at the "Participating wikis" are being bucketed in the ways we defined in T291307

@MNeisler: I'm assigning this task over to you to do the above now that the A/B test has begun!

MNeisler triaged this task as High priority. Feb 1 2022, 2:40 PM
MNeisler moved this task from Next 2 weeks to Doing on the Product-Analytics (Kanban) board.

Logged-in Users (Bucketing Verified)
I confirmed that logged-in users are being bucketed as expected on each of the participating wikis.

This was verified by checking the user_properties table for users where the discussiontools-abtest2 property was assigned. From the start of the AB test to date (27 Jan 2022 to 1 Feb 2022), a total of 124,701 logged-in users have been bucketed in the AB test. There is an expected 50/50 split between the two groups (test and control) on each of the participating wikis.

Total Number of Bucketed Logged-In Users   % Control       % Test
124701                                     62718 (50.3%)   61983 (49.7%)

I also reviewed edit attempts in the test and control buckets logged in EditAttemptStep. For logged-in users, the number of users who have made an edit attempt in each bucket appears as expected based on a 50/50 split.

New Discussion Tool AB Test edit attempts by logged-in users across all participating wikis

experiment group   users   attempts
control            3006    15033
test               3633    15468

Logged-Out Users (Bucketing needs investigation)
Anonymous users are not included in the user_properties table so we are not able to verify bucketing there.

Instead, I checked the anonymous_user_token field in EditAttemptStep to see whether the number of attempts recorded by logged-out users in each group matched what a 50/50 split would produce. Only 19% of all logged-out users recorded as making an edit attempt in the New Discussion Tool AB test were bucketed in the control group across all participating wikis.

While this data is limited to bucketed users that made an edit attempt, it seems lower than I would expect if 50% of logged-out users are supposed to be bucketed into the control group.

New Discussion Tool AB Test edit attempts by logged-out users across all participating wikis

experiment group   users   attempts
control            377     555
test               1627    2156
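
One way to put a number on "lower than I would expect" is a sample ratio mismatch check: a chi-square goodness-of-fit test of the observed user counts against a 50/50 split. The sketch below is an illustration only and is not taken from the notebook; the function name `srm_check` and the 0.001 threshold are assumptions chosen for the example.

```python
import math


def srm_check(control_users: int, test_users: int, alpha: float = 0.001):
    """Chi-square goodness-of-fit test against an expected 50/50 split.

    Returns (control_share, chi2, p_value, mismatch_flag).
    alpha is an assumed threshold, not one defined in this task.
    """
    total = control_users + test_users
    expected = total / 2
    chi2 = (
        (control_users - expected) ** 2 + (test_users - expected) ** 2
    ) / expected
    # With one degree of freedom, the chi-square survival function
    # reduces to erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return control_users / total, chi2, p_value, p_value < alpha


# Logged-out user counts from the table above.
share, chi2, p_value, mismatch = srm_check(377, 1627)
print(f"control share={share:.1%}, chi2={chi2:.1f}, mismatch={mismatch}")
```

This only looks at users who made an edit attempt, so it carries the same limitation as the table it is computed from.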

@DLynch - Any ideas for why this discrepancy might exist? Let me know if there are any other data breakdowns that would be helpful to review or other methods besides the anonymous_user_token in EditAttemptStep for me to check if anon users were bucketed correctly.

(cc @ppelberg)

See notebook for queries and other details.

Change 759333 had a related patch set uploaded (by DLynch; author: DLynch):

[mediawiki/extensions/WikiEditor@master] New bucket for abtest data

https://gerrit.wikimedia.org/r/759333

Change 759334 had a related patch set uploaded (by DLynch; author: DLynch):

[mediawiki/extensions/VisualEditor@master] New bucket for abtest data

https://gerrit.wikimedia.org/r/759334

Change 759334 merged by jenkins-bot:

[mediawiki/extensions/VisualEditor@master] New bucket for abtest data

https://gerrit.wikimedia.org/r/759334

Change 759333 merged by jenkins-bot:

[mediawiki/extensions/WikiEditor@master] New bucket for abtest data

https://gerrit.wikimedia.org/r/759333

Change 759315 had a related patch set uploaded (by DLynch; author: DLynch):

[mediawiki/extensions/VisualEditor@wmf/1.38.0-wmf.20] New bucket for abtest data

https://gerrit.wikimedia.org/r/759315

Change 759316 had a related patch set uploaded (by DLynch; author: DLynch):

[mediawiki/extensions/WikiEditor@wmf/1.38.0-wmf.20] New bucket for abtest data

https://gerrit.wikimedia.org/r/759316

Context: VE and WikiEditor weren't logging from the new bucket-preference, and logged-out users in those editors were entirely logging from the user-preference which would never be set. Events from DiscussionTools were reliable, however. Patches update the logging to use the global-config wgDiscussionToolsABTestBucket which is reliably set.

Consideration that hadn't been stated before: logged-out users won't have their bucket/id logged for EditAttemptStep init/save* events from WikiEditor, because those happen server-side where this information isn't available. Comparisons including logged-out users will need to either aggregate data across the entire editing-session, or will need to restrict themselves to other events like ready which are sent from client-side.
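
To illustrate the first option (aggregating across the editing session), here is a minimal sketch that copies the bucket from any event in a session that carries it onto the events that do not. The field names mirror the EditAttemptStep columns mentioned in this task; the helper names and the sample events are made up for the example, and it assumes a session never carries two different buckets.

```python
def bucket_by_session(events):
    """Map each editing_session_id to the first non-null bucket seen in it."""
    session_bucket = {}
    for ev in events:
        bucket = ev.get("bucket")
        if bucket is not None:
            session_bucket.setdefault(ev["editing_session_id"], bucket)
    return session_bucket


def attach_session_bucket(events):
    """Return copies of the events with the session-level bucket filled in."""
    lookup = bucket_by_session(events)
    return [
        {**ev, "bucket": ev.get("bucket") or lookup.get(ev["editing_session_id"])}
        for ev in events
    ]


# Hypothetical session: server-side init/save events lack the bucket,
# the client-side ready event has it.
events = [
    {"editing_session_id": "s1", "action": "init", "bucket": None},
    {"editing_session_id": "s1", "action": "ready", "bucket": "test"},
    {"editing_session_id": "s1", "action": "saveSuccess", "bucket": None},
]
filled = attach_session_bucket(events)
print([ev["bucket"] for ev in filled])
```

Sessions that never emit a client-side event with a bucket stay unbucketed under this approach, so they would still need to be excluded or counted separately.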

Change 759315 merged by jenkins-bot:

[mediawiki/extensions/VisualEditor@wmf/1.38.0-wmf.20] New bucket for abtest data

https://gerrit.wikimedia.org/r/759315

Change 759316 merged by jenkins-bot:

[mediawiki/extensions/WikiEditor@wmf/1.38.0-wmf.20] New bucket for abtest data

https://gerrit.wikimedia.org/r/759316

Mentioned in SAL (#wikimedia-operations) [2022-02-03T19:27:30Z] <taavi@deploy1002> Synchronized php-1.38.0-wmf.20/extensions/VisualEditor/modules/ve-mw/init/ve.init.mw.trackSubscriber.js: Backport: [[gerrit:759315|New bucket for abtest data (T291308)]] (duration: 00m 50s)

Mentioned in SAL (#wikimedia-operations) [2022-02-03T19:28:29Z] <taavi@deploy1002> Synchronized php-1.38.0-wmf.20/extensions/WikiEditor/includes/Hooks.php: Backport: [[gerrit:759316|New bucket for abtest data (T291308)]] (1/2) (duration: 00m 49s)

Mentioned in SAL (#wikimedia-operations) [2022-02-03T19:29:23Z] <taavi@deploy1002> Synchronized php-1.38.0-wmf.20/extensions/WikiEditor/modules/ext.wikiEditor.js: Backport: [[gerrit:759316|New bucket for abtest data (T291308)]] (2/2) (duration: 00m 50s)

@DLynch
I rechecked the bucketing for logged-out users following the deployment of the last patch on 3 February 2022 and the data still seems off (with fewer users in the control group than expected). See details below:

Here is a breakdown of distinct logged-out users (identified by distinct anon token) and edit attempts in the AB test, looking at ready events on talk pages (logged since 3 February 2022):

experiment_group   users   attempts
control            250     1203
test               1150    2295

While investigating the source of the imbalance, I noticed that we are not logging an anonymous_user_token for ready events where event.integration = 'page' and editor_interface = 'wikitext', so these attempts are all counted as 1 user. Here's a breakdown of experiment groups by integration, interface, and whether the user has an anon token assigned.

experiment_group   integration       user_is_assigned_anon_token   interface       users   attempts
control            discussiontools   true                          visualeditor    152     183
control            discussiontools   true                          wikitext        1       1
control            discussiontools   true                          wikitext-2017   98      113
control            page              false                         wikitext        1       906
test               discussiontools   true                          visualeditor    827     994
test               discussiontools   true                          wikitext        127     143
test               discussiontools   true                          wikitext-2017   214     316
test               page              false                         wikitext        1       842

Here's the query I used to collect the data above:

SELECT
  event.editing_session_id AS session_id,
  wiki AS wiki,
  event.bucket AS experiment_group,
  event.editor_interface AS interface,
  event.integration AS integration,
  event.anonymous_user_token AS anon_token,
  -- check to make sure all anons have a token assigned
  IF(event.anonymous_user_token IS NULL, false, true) AS user_is_anonymous_bytoken,
  event.platform AS platform
FROM event.editattemptstep
WHERE
  -- since deployment of patch
  year = 2022
  AND (month = 02 AND day >= 03)
  -- remove bots
  AND useragent.is_bot = false
  -- only anon users
  AND event.user_id = 0
  AND event.user_class = 'IP'
  -- only test events
  AND event.bucket IN ('test', 'control')
  -- only desktop talk pages
  AND event.page_ns % 2 = 1
  AND event.platform = 'desktop'
  -- need to check bucketing on the ready action, as WikiEditor's server-side
  -- logging doesn't have access to the bucket or anonymous user ID
  AND event.action = 'ready'
  -- participating wikis
  AND wiki IN ('amwiki', 'bnwiki', 'zhwiki', 'nlwiki', 'arzwiki', 'frwiki', 'hewiki', 'hiwiki',
    'idwiki', 'itwiki', 'jawiki', 'kowiki', 'omwiki', 'fawiki', 'plwiki', 'ptwiki', 'eswiki', 'thwiki',
    'ukwiki', 'viwiki')
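
The query returns one row per event, so the users/attempts columns in the breakdown come from a further aggregation step. The sketch below shows one plausible way to do that and why missing tokens collapse into a single "user". It assumes users are counted as distinct anon_token values and attempts as distinct session_id values; the function name `summarize` and the sample rows are made up for the example.

```python
from collections import defaultdict


def summarize(rows):
    """Count distinct anon tokens and distinct sessions per breakdown key."""
    tokens = defaultdict(set)
    sessions = defaultdict(set)
    for r in rows:
        key = (
            r["experiment_group"],
            r["integration"],
            r["user_is_anonymous_bytoken"],
            r["interface"],
        )
        # A missing token is stored as None, which is a single set member,
        # so every token-less session ends up counted under one user.
        tokens[key].add(r["anon_token"])
        sessions[key].add(r["session_id"])
    return {
        key: {"users": len(tokens[key]), "attempts": len(sessions[key])}
        for key in tokens
    }


# Hypothetical rows: three token-less full-page wikitext sessions.
rows = [
    {"experiment_group": "control", "integration": "page",
     "user_is_anonymous_bytoken": False, "interface": "wikitext",
     "anon_token": None, "session_id": f"s{i}"}
    for i in range(3)
]
print(summarize(rows))
```

Under these assumptions the three sessions show up as 1 user with 3 attempts, which is the same pattern as the integration = 'page' rows in the breakdown.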

Here are the results of logged-out sessions in the AB test where init_type = 'section':

experiment_group   integration       user_is_anonymous_bytoken   editor_interface   users   attempts
control            discussiontools   true                        wikitext           1       1
control            page              false                       wikitext           1       79
test               discussiontools   true                        visualeditor       814     954
test               discussiontools   true                        wikitext           157     175
test               discussiontools   true                        wikitext-2017      171     226
test               page              false                       wikitext           1       64

WITH new_section_events AS (
  SELECT
    event.editing_session_id AS edit_attempt_id,
    wiki AS init_wiki
  FROM event.editattemptstep
  WHERE
    -- following deployment of patch
    year = 2022
    AND month = 02
    AND day >= 03
    AND event.platform = 'desktop'
    AND event.action = 'init'
    -- only talk pages
    AND event.page_ns % 2 = 1
    -- by anon, non-bot users
    AND useragent.is_bot = false
    AND event.user_id = 0
    AND event.user_class = 'IP'
    AND event.init_type = 'section'
    -- test wikis
    AND wiki IN ('amwiki', 'bnwiki', 'zhwiki', 'nlwiki', 'arzwiki', 'frwiki', 'hewiki', 'hiwiki',
      'idwiki', 'itwiki', 'jawiki', 'kowiki', 'omwiki', 'fawiki', 'plwiki', 'ptwiki', 'eswiki', 'thwiki',
      'ukwiki', 'viwiki')
)

SELECT
  event.editing_session_id AS edit_attempt_id,
  event.bucket AS experiment_group,
  wiki AS ready_wiki,
  event.integration AS integration,
  event.anonymous_user_token AS user_id,
  IF(event.anonymous_user_token IS NULL, false, true) AS user_is_anonymous_bytoken,
  event.is_oversample AS is_oversample,
  event.editor_interface AS editor_interface
FROM event.editattemptstep eas
INNER JOIN new_section_events
  ON eas.event.editing_session_id = new_section_events.edit_attempt_id
  AND eas.wiki = new_section_events.init_wiki
WHERE
  year = 2022
  AND month = 02
  AND day >= 03
  -- look at only desktop ready events
  AND event.platform = 'desktop'
  AND event.action = 'ready'
  -- by anon, non-bot users
  AND useragent.is_bot = false
  AND event.user_id = 0
  AND event.user_class = 'IP'
  -- only talk page events
  AND event.page_ns % 2 = 1
  -- bucketing applied on ready events
  AND event.bucket IN ('test', 'control')
  AND wiki IN ('amwiki', 'bnwiki', 'zhwiki', 'nlwiki', 'arzwiki', 'frwiki', 'hewiki', 'hiwiki',
    'idwiki', 'itwiki', 'jawiki', 'kowiki', 'omwiki', 'fawiki', 'plwiki', 'ptwiki', 'eswiki', 'thwiki',
    'ukwiki', 'viwiki')

Filling in from what was said in a chat elsewhere:

  • Users aren't getting their bucket added to ready events when using full-page wikitext editing (because DT sets that up, and it's explicitly blocked from trying to load in that scenario) -- I'll be amending our logging to include that data
  • I'm confused by the situation where a logged out user could be logging a bucket but not an anonymous id, and will try to work out what causes that. (It's notably low numbers, so it's presumably situational.)

Documenting the actions to address the issues @DLynch and @MNeisler identified in this ticket:

  • T301495: Investigate what init type is associated with the sessions where a ready event is being emitted, but an anonymous user token is not being logged
  • T301496: Add distinct event to talk_page_edit for when a new section is added
  • T301497: Make sure the test bucket is logged when a logged out user edits a full page in wikitext
  • T301499: Investigate what's going on when a user's test bucket is being logged, but not their user ID