Product Analytics: Experiment Analysis - Revise Tone Structured Task (WE1.1, FY25-26)
Open, Needs Triage, Public

Description

User story & summary:

Project-level user story:

As a newcomer to Wikipedia, I want to receive suggestions that help me identify and improve non-neutral language in articles so that I can make constructive contributions that align with Wikipedia’s MOS, encyclopedic tone, and neutrality standards.

Task specific user story:

As the Growth team's Product Manager, I want to plan, run, and interpret the "Improve Tone Suggested Edit" experiment, so that we can understand how constructive edits differ between the treatment group and the control group.

Guiding Key Result: WE1.1: Increase constructive edits (edits that are not reverted within 48 hours of being published) for editors with less than 100 cumulative edits.
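
To make the Key Result's definition concrete, here is a minimal Python sketch of how a "constructive edit" could be flagged and how a constructive-edit rate could be computed for editors under the 100-edit threshold. The `Edit` record, its field names, and the helper functions are hypothetical and do not reflect any actual instrumentation schema. The sketch also assumes that a revert at exactly 48 hours counts as "within 48 hours", which the Key Result does not specify.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Iterable, Optional

REVERT_WINDOW = timedelta(hours=48)  # "not reverted within 48 hours" (from the Key Result)
EDIT_COUNT_CEILING = 100             # "less than 100 cumulative edits" (from the Key Result)


@dataclass
class Edit:
    """Hypothetical record shape, for illustration only."""
    editor_edit_count: int                  # editor's cumulative edits when publishing
    published_at: datetime
    reverted_at: Optional[datetime] = None  # None means the edit was never reverted


def is_in_scope(edit: Edit) -> bool:
    """True when the editor is below the cumulative-edit threshold."""
    return edit.editor_edit_count < EDIT_COUNT_CEILING


def is_constructive(edit: Edit) -> bool:
    """True when the edit was not reverted within the 48-hour window.

    Assumption: a revert at exactly 48 hours still counts as "within" the window.
    """
    if edit.reverted_at is None:
        return True
    return edit.reverted_at - edit.published_at > REVERT_WINDOW


def constructive_rate(edits: Iterable[Edit]) -> Optional[float]:
    """Share of in-scope edits that are constructive, or None if none are in scope."""
    scoped = [e for e in edits if is_in_scope(e)]
    if not scoped:
        return None
    return sum(is_constructive(e) for e in scoped) / len(scoped)
```

Computing `constructive_rate` separately for the treatment and control groups would give the two quantities the experiment compares. The actual metric definitions should come from the measurement plan rather than from this sketch.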

Dependency: T392283: Q1 FY2025-26 Goal: Apply the Tone Check model to published articles, to learn whether we can build a pool of high-quality structured tasks for new editors

Project description:

This project aims to support newcomers by introducing a Suggested Edit that focuses on identifying and improving non-neutral language in articles. Specifically, it will highlight sentences that contain peacock terms, puffery, promotional language, or other wording that conflicts with Wikipedia’s policies on neutrality and encyclopedic tone.
Powered by the Machine Learning “Tone” model (formerly called the “Peacock” model), with UX that builds upon the Edit Check UI, this Suggested Edit will highlight instances of biased language and offer in-context guidance to help users rewrite a sentence in a more encyclopedic tone. The goal is to encourage constructive, policy-aligned contributions while helping newcomers build confidence and awareness of core content standards.
This work builds on the Growth team’s broader strategy to lower barriers to editing through Structured Tasks. In Q1, we aim to release a beta version to lay the groundwork for future experiments evaluating the task’s impact on newcomer contributions and the scalability of Edit Check as a foundation for Suggested Edits.

Project-level Hypothesis:

If we provide newer editors with a Suggested Edit that highlights instances of non-neutral language or improper tone, and offers built-in guidance to rewrite with a more encyclopedic tone, then newer editors will be more likely to make constructive contributions that align with Wikipedia’s policies, while building confidence and awareness of core content standards.
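
One common way to check a hypothesis of this shape is to compare the proportion of constructive edits in the treatment and control groups. This task does not specify a statistical method, so the sketch below is only an illustration using a pooled two-proportion z-test. The function name is hypothetical, and the counts in the usage example are made up, not experiment data.

```python
import math
from typing import Tuple


def two_proportion_z_test(
    successes_a: int, total_a: int, successes_b: int, total_b: int
) -> Tuple[float, float]:
    """Return (z, two-sided p-value) for a pooled two-proportion z-test.

    Group A could be treatment and group B control; "successes" would be
    constructive edits. Relies on the normal approximation, so it is only
    reasonable for sufficiently large samples.
    """
    p_a = successes_a / total_a
    p_b = successes_b / total_b
    pooled = (successes_a + successes_b) / (total_a + total_b)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    z = (p_a - p_b) / standard_error
    # Two-sided p-value: 2 * (1 - Phi(|z|)) equals erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Made-up counts for illustration only (not experiment results).
z, p = two_proportion_z_test(60, 100, 40, 100)
print(f"z = {z:.3f}, p = {p:.4f}")
```

The real analysis may call for a different approach, for example a regression model that accounts for repeated edits by the same editor. The choice of method belongs in the analysis plan.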

Background & research:

Writing in a neutral tone is a pillar of Wikipedia. Writing in a neutral tone is also a practice many new volunteers find to be unintuitive. An October 2024 analysis of the new content edits newer volunteers published to English Wikipedia found:

  • 56% of the new content edits newer volunteers published contained peacock words.
  • 22% of the new content edits newer volunteers published that contained peacock words were reverted.
  • New content edits containing peacock words were 46.7% more likely to be reverted than new content edits without peacock words.
Edit_check/Tone_Check#Background
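
The figures above can be related to each other with a short calculation. The sketch assumes that "46.7% more likely" is a relative increase in revert rate, so the revert rate for edits without peacock words is the 22% figure divided by 1.467. The function name is hypothetical, and the result is only approximate because the published figures are rounded.

```python
def implied_baseline_rate(rate_with_feature: float, relative_increase: float) -> float:
    """Back out the comparison-group rate from a rate and its relative increase.

    Assumes rate_with_feature = baseline * (1 + relative_increase).
    """
    return rate_with_feature / (1.0 + relative_increase)


revert_rate_with_peacock = 0.22  # from the October 2024 analysis
relative_increase = 0.467        # "46.7% more likely to be reverted"

baseline = implied_baseline_rate(revert_rate_with_peacock, relative_increase)
print(f"Implied revert rate without peacock words: {baseline:.1%}")  # 15.0%
```

Under that reading, roughly 15% of new content edits without peacock words were reverted, compared with 22% of those containing them.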

Suggested Edits help new account holders get started editing:

Acceptance Criteria:
  • Final QA of instrumentation before the experiment release
  • Quick check to ensure we are collecting the necessary data after the experiment release; Review experiment and instrumentation for experiment analysis.
  • Review Data Collection Guidelines and obtain/seek L3SC approval as needed
  • Review Data Modeling Guidelines as needed
  • Review Dashboard(ing) Guidelines as needed
  • QA
  • Code Review: user data gathering, edit data gathering, analysis/modeling
  • ✨Perform analysis✨
  • Prepare Quarto report
  • Report review
  • Share draft analysis
  • Review data Reporting guide and Guidelines to ensure compliance
  • Review publishing reports guidelines
  • Determine and add the appropriate license
  • Enter data publication into the data publication log form registry
  • Post notebooks to GitLab
  • Publish the Quarto report following the web publishing guidelines, including the GitLab repo link
  • Share findings with Growth team and the Community (either via a summary in this task, or a MediaWiki page).
  • Clear all interim tables and CSV files per the Data Retention Guidelines

Event Timeline

KStoller-WMF moved this task from Inbox to Backlog on the Growth-Team board.
KStoller-WMF moved this task from Backlog to Needs Estimation on the Growth-Team board.
KStoller-WMF lowered the priority of this task from High to Medium. Oct 30 2025, 10:41 PM
KStoller-WMF renamed this task from Experiment Analysis: Revise Tone Structured Task (WE1.1, FY25-26) to Product Analytics: Experiment Analysis - Revise Tone Structured Task (WE1.1, FY25-26). Nov 12 2025, 5:45 PM

Note that the A/B test won't be released until January, so only the first few items on the Acceptance Criteria are currently actionable:

  • Final QA of instrumentation before the experiment release
  • Quick check to ensure we are collecting the necessary data after the experiment release; Review experiment and instrumentation for experiment analysis.
  • Review Data Collection Guidelines and obtain/seek L3SC approval as needed
  • Review Data Modeling Guidelines as needed
  • Review Dashboard(ing) Guidelines as needed
KStoller-WMF updated the task description.
KStoller-WMF removed a subscriber: Iflorez.
KStoller-WMF added a subscriber: Iflorez.
KStoller-WMF raised the priority of this task from Medium to Needs Triage. Mar 4 2026, 5:32 PM
KStoller-WMF moved this task from Needs Estimation to Blocked on the Growth-Team board.