Run A/B test to evaluate impact of New Topic Tool
Closed, Resolved | Public | 13 Estimated Story Points

Description

This task is about conducting an A/B test to help us understand what impact the New Discussion Tool is having on Junior Contributors' likelihood to start (activation) and continue (retention) participating on Wikipedia talk pages.

Test timing

  • Start date: TBD
  • End date: ~4 weeks after "Start date" per T290204#7356556

Participating wikis

This list of participating wikis is drawn from T291306.

Candidate wikis

ID | Wiki | Code
1 | Amharic Wikipedia | amwiki
2 | Bengali Wikipedia | bnwiki
3 | Chinese Wikipedia | zhwiki
4 | Dutch Wikipedia | nlwiki
5 | Egyptian Arabic Wikipedia | arzwiki
6 | French Wikipedia | frwiki
7 | Hebrew Wikipedia | hewiki
8 | Hindi Wikipedia | hiwiki
9 | Indonesian Wikipedia | idwiki
10 | Italian Wikipedia | itwiki
11 | Japanese Wikipedia | jawiki
12 | Korean Wikipedia | kowiki
13 | Oromo Wikipedia | omwiki
14 | Persian Wikipedia | fawiki
15 | Polish Wikipedia | plwiki
16 | Portuguese Wikipedia | ptwiki
17 | Spanish Wikipedia | eswiki
18 | Thai Wikipedia | thwiki
19 | Ukrainian Wikipedia | ukwiki
20 | Vietnamese Wikipedia | viwiki

Decision to be made

The decision this analysis is intended to help us make:
Should the New Discussion Tool be offered to all people, at all wikis, as an opt-out user preference?

Hypotheses

To help evaluate the impact of the New Discussion Tool, we will analyze whether adding a more intuitive workflow for starting new discussions on Wikipedia talk pages:

ID | Hypothesis | Metric(s) for evaluation
KPI | ...causes a greater percentage of Junior Contributors to publish the new discussions they start without a significant increase in disruption (see "Guardrail" below). | New discussion completion rate: of the people who click the Add topic / New section link (action = init), the % who successfully publish at least one new discussion topic they were drafting (action = saveSuccess). What is the impact of experience level on this KPI? Note that this does not take into account the number of attempts it took for the user to publish or the duration of their editing sessions.
Guardrail | ...does not cause a significant increase in the percent of disruptive edits being made to talk pages. | The percent of discussion topics added to talk pages that are reverted within 48 hours. The percent of editors who are blocked after adding a discussion topic to a talk page.
Curiosity #1 | ...causes a greater number of Junior Contributors to start participating productively on talk pages. | The number of distinct Junior Contributors who make at least one edit to a page in a talk namespace that is not reverted within 48 hours.
Curiosity #2 | ...causes a greater percentage of Junior Contributors to continue participating productively on talk pages. | The percentage of Junior Contributors who make at least one edit to a page in a talk namespace that is not reverted within 48 hours in each of the following time intervals: 2 to 7 days after making their first edit (read: within the first week), 8 to 14 days after making their first edit (read: within the second week), and 15 to 30 days after making their first edit (read: within the third or fourth weeks).
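
The KPI metric above can be sketched as follows. This is a minimal illustration only: the flat event list and the field names `user`, `bucket`, and `action` are assumptions made for the example, and only the action values `init` and `saveSuccess` come from the task description. The actual schema and queries used in the analysis are not shown in this task.

```python
from collections import defaultdict

def completion_rate_by_bucket(events):
    """Share of users with an init event who also have at least one saveSuccess.

    `events` is an iterable of dicts with hypothetical keys
    'user', 'bucket' and 'action'.
    """
    started = defaultdict(set)    # bucket -> users who opened the workflow
    published = defaultdict(set)  # bucket -> users who published at least once
    for e in events:
        if e["action"] == "init":
            started[e["bucket"]].add(e["user"])
        elif e["action"] == "saveSuccess":
            published[e["bucket"]].add(e["user"])
    # A user counts as completing only if they also have an init event.
    return {
        bucket: len(published[bucket] & users) / len(users)
        for bucket, users in started.items()
    }

# Made-up events; repeated init events by one user are counted once.
events = [
    {"user": "a", "bucket": "test", "action": "init"},
    {"user": "a", "bucket": "test", "action": "init"},
    {"user": "a", "bucket": "test", "action": "saveSuccess"},
    {"user": "b", "bucket": "test", "action": "init"},
    {"user": "c", "bucket": "control", "action": "init"},
    {"user": "d", "bucket": "control", "action": "init"},
    {"user": "d", "bucket": "control", "action": "saveSuccess"},
    {"user": "e", "bucket": "control", "action": "init"},
]
rates = completion_rate_by_bucket(events)
```

Because users are collected into sets, the number of attempts per user does not affect the rate, which matches the note in the KPI definition.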

Decision matrix

ID | Scenario | Plan of action
1 | People are more likely to publish edits using the New Discussion Tool than they are using the existing section=new interface. | Continue with plans to make the New Discussion Tool available at all Wikipedias, by default. See T275256 for more detail.
2 | People are less likely to publish edits using the New Discussion Tool than they are using the existing section=new workflow. | Investigate where within the New Discussion Tool workflow people are dropping off and hypothesize what could be contributing to this drop-off. In parallel, we will pause plans to make the New Discussion Tool available at all Wikipedias, by default. See: T263054#7363191.
3 | People are as likely to publish edits using the New Discussion Tool as they are using the existing section=new workflow. | Continue with plans to offer the New Discussion Tool as an opt-out preference at all Wikipedias, considering qualitative feedback suggests the tool is leading people to find participating on talk pages easier / more efficient.

Done


Notes

  • When analyzing data from the A/B test, we want to remember:
    • That fa.wiki and pt.wiki have stopped allowing IPs to edit. More context in: T291306#7553319.
    • To note if/when during the desktop New Discussion Tool A/B test the New Discussion Tool became available by default on mobile at the first group of wikis (T282638)
  • Per the observation @MNeisler made in Slack, the way that our A/B test infrastructure is currently set up, it is possible for the following to happen:
    • 1. A logged in user ("Person 1") is assigned a test bucket (e.g. test or control)
    • 2. "Person 1" logs out
    • 3. "Person 1" revisits the Wikipedia where the A/B test is being run
    • 4. "Person 1" becomes bucketed in the A/B test for a second time which means the A/B test "thinks" that in "Step 3." someone new has entered the test when in reality it is the same person being bucketed twice
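
The double-bucketing scenario above can be illustrated with a small sketch. It assumes, hypothetically, that enrollment is keyed by a user ID while logged in and by a browser token while logged out. The record layout, identifiers, and bucket assignments are invented for illustration and do not describe the actual A/B test infrastructure.

```python
def count_enrolled(enrollments):
    """Count distinct enrollment keys, as a naive analysis would."""
    return len({(e["key_type"], e["key"]) for e in enrollments})

# "Person 1" is bucketed once while logged in and again after logging out.
# The second assignment may or may not match the first; here it differs.
enrollments = [
    {"person": "Person 1", "key_type": "user_id", "key": "1001", "bucket": "test"},
    {"person": "Person 1", "key_type": "anon_token", "key": "tok-9f", "bucket": "control"},
    {"person": "Person 2", "key_type": "user_id", "key": "1002", "bucket": "control"},
]

# What the test "sees": three participants.
apparent_participants = count_enrolled(enrollments)
# Ground truth (not observable in practice): two people.
actual_people = len({e["person"] for e in enrollments})
# The same person can end up exposed to both experiences.
buckets_for_person_1 = {e["bucket"] for e in enrollments if e["person"] == "Person 1"}
```

In this made-up case the participant count is inflated by one and "Person 1" contributes to both groups, which is the kind of contamination the note warns about.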

Event Timeline

ppelberg renamed this task from Analyze the impact of the New Discussion Tool to Run A/B test to evaluate impact of New Discussion Tool. (Sep 17 2021, 10:38 PM)
ppelberg updated the task description.
ppelberg updated the task description.
LZaman set the point value for this task to 13. (Sep 21 2021, 9:19 PM)
LZaman triaged this task as Medium priority. (Sep 21 2021, 9:25 PM)
VPuffetMichel raised the priority of this task from Medium to High. (Nov 8 2021, 1:35 PM)
ppelberg renamed this task from Run A/B test to evaluate impact of New Discussion Tool to Run A/B test to evaluate impact of New Topic Tool. (Apr 8 2022, 6:32 PM)

@ppelberg
For review, below is a summary of the results from the New Topic Tool AB Test analysis for logged-in users. Please see further details in the report. Let me know if you have any questions or suggested changes.

Note: Logged-Out analysis was completed separately as it required a different methodology to gather and analyze results. Results will be posted soon.

Logged-In Users New Topic AB Test Analysis

New Topic Completion Rate

[Figure: new_topic_attempts_jc_all.png]

  • Overall, we observed a 7 percentage point increase (37.2% → 44.2%; a 19% relative increase) in the percentage of Junior Contributors who successfully published at least one new topic with the new topic tool.
  • Trends vary on a per-Wikipedia basis.
  • While the majority of participating Wikipedias saw a higher new topic completion rate with the new topic tool, 6 of the participating Wikipedias (Hebrew, Korean, Persian, Polish, Ukrainian, Vietnamese) saw a slightly higher new topic completion rate with the previous new section link editing workflow. There were no significant differences between the two editing methods for any particular wiki.
  • We modeled the results to infer the impact of the new topic tool while accounting for user-level and wiki-level effects. Based on estimates from the model, we found that Junior Contributors who open the new topic tool are about 1.3 times more likely to successfully publish a new topic than Junior Contributors who open the previous add new section link workflow.
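
The summary reports each change both in percentage points and as a relative change. As a quick check of that arithmetic, here is a small sketch using the observed completion rates quoted above; the helper name `compare_rates` is made up for this example. It does not reproduce the model behind the "1.3 times more likely" estimate, whose details are in the report.

```python
def compare_rates(control_pct, test_pct):
    """Return (percentage-point difference, relative change in %)."""
    pp_diff = test_pct - control_pct
    relative = pp_diff / control_pct * 100
    return round(pp_diff, 1), round(relative, 1)

# Junior Contributor completion rate: previous workflow vs. new topic tool.
pp, rel = compare_rates(37.2, 44.2)  # (7.0, 18.8)
```

The 18.8% relative change rounds to the 19% quoted in the summary.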

[Figure: new_topic_completes_byexp (1).png]

  • When comparing Junior Contributor new topic completion rate to Non-Junior Contributors, we see a clear difference between the two experience levels.

Experience still appears to be a significant factor in a user's ability to successfully complete an edit with either editing method; however, the new topic tool helped to slightly decrease the difference in completion rates between the two experience levels. With the previous add new section link, the Junior Contributor completion rate was 33.8 percentage points lower than the Non-Junior rate (71% → 37.2%; 47.6% decrease). With the new topic tool, it was about 22 percentage points lower (66.1% → 44.2%; 33.1% decrease).

New Topic Revert Rates

[Figure: new_topic_reverts_jc_all .png]

  • Overall, across all participating Wikipedias, we observed a 2 percentage point decrease (11.1% → 8.96%; 19.3% decrease) in the revert rate for new topic tool edits made by Junior Contributors compared to edits made using the previous add section link. In addition to increasing the likelihood of a Junior Contributor saving a new topic, the new topic tool also appears to reduce the number of errors in published new topics that might lead to a revert.
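
The guardrail metric counts a new topic as reverted only if the revert happens within 48 hours. A minimal sketch of that calculation is below; the dict keys `saved_at` and `reverted_at` and the timestamps are hypothetical, chosen only to illustrate the window logic, and are not taken from the analysis.

```python
from datetime import datetime, timedelta

REVERT_WINDOW = timedelta(hours=48)

def revert_rate(edits):
    """Fraction of edits reverted within 48 hours of being saved.

    Each edit is a dict with hypothetical keys 'saved_at' and
    'reverted_at' (None if the edit was never reverted).
    """
    if not edits:
        return 0.0
    reverted = sum(
        1 for e in edits
        if e["reverted_at"] is not None
        and e["reverted_at"] - e["saved_at"] <= REVERT_WINDOW
    )
    return reverted / len(edits)

t0 = datetime(2022, 2, 1, 12, 0)  # arbitrary illustrative timestamp
edits = [
    {"saved_at": t0, "reverted_at": t0 + timedelta(hours=3)},   # counted
    {"saved_at": t0, "reverted_at": t0 + timedelta(hours=72)},  # outside the window
    {"saved_at": t0, "reverted_at": None},                      # never reverted
    {"saved_at": t0, "reverted_at": None},
]
rate = revert_rate(edits)
```

Here only one of the four made-up edits falls inside the window, so the rate is 25%.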

[Figure: new_topic_reverts_byexp.png]

  • Similar to our finding for new topic completion rate, experience level has a significant impact on the revert rate for new topics published with either new topic editing method. Comparing the two editing methods, the new topic tool had minimal impact on the revert rate for more Senior Contributors (a slight increase) but a more significant impact on the revert rate for Junior Contributors.

Blocked New Topic Users

[Figure: new_topic_blocks_byexp.png]

  • Overall, across all participating Wikipedias, there was a 1.8 percentage point decrease (4.05% → 2.22%; 45.2% decrease) in the percent of Junior Contributors blocked after posting a new topic using the new topic tool compared to Junior Contributors using the previous add new section link. Only between 1 and 6 users were blocked in total for either editing method on each participating Wikipedia.
  • Junior Contributors are more frequently blocked following the addition of a new topic on a talk page compared to more Senior Contributors; however, use of the new topic tool decreased this difference significantly.
    • Junior Contributors who used the new topic tool were blocked only slightly more often than Non-Junior Contributors using the new topic tool (2.22% compared to 1.89%). In contrast, Junior Contributors using the previous add new section link workflow were blocked 125% more (+2.3 percentage points) than Non-Junior Contributors using the same method.

Number of Junior Contributors

  • No significant change. The new topic tool did not impact the number of Junior Contributors that completed a new topic. Only 2 more distinct users were recorded as making a new topic edit in the test group compared to the control group.

Retention Rate

  • There is little variation between the retention rates observed for the previous add new section link and the new topic tool. With either editing method, the highest percentage of Junior Contributors return one week after making an edit, with similar decreases during week 2 and week 3.
  • A slightly lower percentage of new topic tool users return one week after making an edit and a slightly higher percentage return two weeks after making an edit, but the difference is not significant.

[Figure: jc_retention_rate.png]
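
The retention windows defined in the hypotheses table (2 to 7, 8 to 14, and 15 to 30 days after a user's first edit) can be sketched as below. The input shape, a mapping from user to the dates of their unreverted talk-namespace edits, is an assumption for illustration, and the dates are made up.

```python
from datetime import date

# Day offsets after the first edit, inclusive, per the task description.
WINDOWS = {"week 1": (2, 7), "week 2": (8, 14), "weeks 3-4": (15, 30)}

def retention(edit_dates_by_user):
    """Share of users with at least one later edit in each window."""
    total = len(edit_dates_by_user)
    counts = {name: 0 for name in WINDOWS}
    for dates in edit_dates_by_user.values():
        first = min(dates)
        offsets = {(d - first).days for d in dates}
        for name, (lo, hi) in WINDOWS.items():
            if any(lo <= off <= hi for off in offsets):
                counts[name] += 1
    return {name: (n / total if total else 0.0) for name, n in counts.items()}

users = {
    "a": [date(2022, 2, 1), date(2022, 2, 4), date(2022, 2, 20)],  # offsets 0, 3, 19
    "b": [date(2022, 2, 1), date(2022, 2, 10)],                    # offsets 0, 9
    "c": [date(2022, 2, 3)],                                       # never returns
    "d": [date(2022, 2, 5), date(2022, 2, 6)],                     # offset 1: before window
}
rates = retention(users)
```

Note that an edit one day after the first edit (user "d") falls outside every window, because the first window starts at day 2.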

Logged-Out Users KPI Results

Notes:

  • We do not have data on the experience level of anonymous users, as we do not retain long-term data for these users. As a result, we are not able to account for the impact of experience level on these contributors. The data below reflects all logged-out contributors that were included in the AB test.
  • We are not able to accurately calculate some metrics, such as retention rate, given the limited long-term data tracked for anonymous users. Please see the full report for additional details on the methodology used to assess the impact of the new topic tool on logged-out users.

New Topic Completion Rate

[Figure: new_topic_completes_all_anon.png]

  • Overall, we observed a 1.5 percentage point increase (8.1% → 9.6%; an 18.5% relative increase) in the new topic completion rate with the new topic tool.
  • Trends vary significantly on a per-Wikipedia basis; however, most of this variation is caused by the smaller sample sizes available on a per-Wikipedia basis, especially for the previous add new section link events, which were sampled at only 6.25%.
  • We modeled the results to infer the impact of the new topic tool while accounting for user-level and wiki-level effects. Based on estimates from the model, while we observed an increase in the new topic completion rate, there is not sufficient evidence to definitively say that the new topic tool led to this increase.
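
To illustrate why a 6.25% sampling rate makes per-wiki comparisons noisy, the sketch below scales a sampled count up to an estimated total and compares confidence intervals at two sample sizes. The Wilson score interval is used here purely as an illustration and is not the method used in the report; the counts are invented.

```python
import math

SAMPLING_RATE = 0.0625  # 6.25%, i.e. 1 in 16 events kept

def estimate_total(observed_count, rate=SAMPLING_RATE):
    """Scale a sampled count up to an estimate of the unsampled total."""
    return observed_count / rate

def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a proportion (roughly 95% for z=1.96)."""
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half

# Hypothetical wiki: the same 8% completion rate seen in 50 sampled
# events versus the 800 events they stand for.
small = wilson_interval(4, 50)
large = wilson_interval(64, 800)
small_width = small[1] - small[0]
large_width = large[1] - large[0]
```

Uniform sampling does not bias the completion rate itself, but with 16 times fewer events the interval around it is much wider, so apparent per-wiki differences are harder to distinguish from noise.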

New Topic Revert Rate

  • Overall, across all participating Wikipedias, we observed a slight 3 percentage point change (29.7% → 23.7%; 14.3% increase) in the revert rate for new topic tool edits made by logged-out contributors compared to edits made using the previous add section link.

[Figure: new_topic_reverts_anon_all .png]

Number of Logged-Out Contributors

There were 12 more distinct anonymous users that successfully completed an edit with the new topic tool compared to the previous add new section link.

MNeisler moved this task from Doing to Needs Review on the Product-Analytics (Kanban) board.
MNeisler added a subscriber: mpopov.

@ppelberg
Here are the completed reports for the logged-in and logged-out New Topic AB test analysis.
Logged-In New Topic AB Test: Report | Summary
Logged-Out New Topic AB Test: Report | Summary

Repo

Reassigning to @mpopov for right now as he has offered to review.

Please let me know if you have any questions or suggested changes. Thank you both!