This task covers running a controlled experiment of Reference Check that is limited to English Wikipedia.
This test:
- Builds on the previous A/B test we ran of Reference Check at 15 Wikipedias (T342930)
- Is a response to:
Experiment timeline
| Milestone | Target Completion Date | Responsible | Status | Notes |
|---|---|---|---|---|
| Start test | | Editing Engineering | ✅ | |
| Verify test bucket balancing (T406134) | 13 November 2025 | Editing QA + @MNeisler | | |
| Publish leading indicators report (T405421) | 3 December 2025 | @Iflorez | | Analysis can begin ~2 weeks after test starts |
| End test | 4 December 2025 | Editing Engineering | | Analysis can begin ~6 weeks after test starts |
| Publish final report | 19 December 2025 | @Iflorez | | ~2 weeks after analysis starts |
| Share statistically significant conclusions on-wiki | January 2026 | @Sdkb | | |
Decision(s) To Be Made
- 1. Will Reference Check be enabled by default for newcomers editing with VE at en.wiki? If so, how, if at all, would experienced volunteers like the Check's default configuration to be changed?
- Enabled: Yes
- Configuration: TBD; @Sdkb is preparing an announcement to start a discussion about this
Hypotheses
| ID | Hypothesis | Metric(s) for evaluation |
|---|---|---|
| KPI | The number of constructive edits newcomers publish will increase because a greater percentage of edits that add new content will include a reference or an explicit acknowledgement as to why these edits lack references. | 1) Proportion of published edits that add new content and include a reference or explicit acknowledgement of why a citation was not added, 2) Proportion of published edits that add new content (T333714) and are constructive (read: NOT reverted within 48 hours) |
| Curiosity #1 | Newcomers will be more aware of the need to add a reference when contributing new content because the visual editor will prompt them to do so in cases where they have not done so themselves. | Increase in the proportion of newcomers that publish at least one new content edit that includes a reference. |
| Curiosity #2 | Newcomers will be more likely to return to publish a new content edit in the future that includes a reference because Reference Check will have caused them to realize references are required when contributing new content to Wikipedia. | 1) Proportion of newcomers that successfully publish an edit Reference Check was activated within and return to make an unreverted edit to the main namespace during the identified retention period, 2) Proportion of newcomers that publish an edit Reference Check was activated within and return to make a new content edit with a reference to the main namespace during the identified retention period. |
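To make the two KPI metrics concrete, here is a minimal Python sketch of how they could be computed per test group. It assumes each published edit has already been labeled; the `Edit` record, its field names, and the `"control"`/`"test"` group labels are hypothetical and do not reflect the actual instrumentation schema. It also assumes the denominator for both metrics is published new content edits.

```python
from dataclasses import dataclass


@dataclass
class Edit:
    # All fields are hypothetical labels, not the real event schema.
    group: str                  # "control" or "test"
    is_new_content: bool        # edit adds new content
    has_reference: bool         # edit includes a reference
    declined_with_reason: bool  # editor explained why no citation was added
    reverted_within_48h: bool   # edit was reverted within 48 hours


def kpi_proportions(edits, group):
    """Return both KPI proportions for one group, or None if it has no new content edits."""
    new_content = [e for e in edits if e.group == group and e.is_new_content]
    if not new_content:
        return None
    n = len(new_content)
    # KPI 1: new content edits with a reference or an explicit acknowledgement.
    ref_or_reason = sum(e.has_reference or e.declined_with_reason for e in new_content)
    # KPI 2: new content edits that were NOT reverted within 48 hours.
    constructive = sum(not e.reverted_within_48h for e in new_content)
    return {
        "n": n,
        "ref_or_reason": ref_or_reason / n,
        "constructive": constructive / n,
    }


sample = [
    Edit("test", True, True, False, False),
    Edit("test", True, False, True, False),
    Edit("test", True, False, False, True),
    Edit("test", False, False, False, False),   # not a new content edit; ignored
    Edit("control", True, False, False, True),
    Edit("control", True, True, False, False),
]
test_kpis = kpi_proportions(sample, "test")
control_kpis = kpi_proportions(sample, "control")
```

Comparing `test_kpis` with `control_kpis` gives the raw difference between groups; whether that difference is statistically meaningful is a separate question for the analysis.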
Leading indicators
T405421: [A/B Test] Report on Reference Check (en.wiki) leading indicators
Guardrails
This section describes the metrics we will use to make sure other important parts/dimensions of the "editing ecosystem" are not being negatively impacted by Reference Check. The scenarios named in the chart below emerged through T325851.
| ID | Name | Metric(s) for Evaluation |
|---|---|---|
| 1) | Edit quality decrease (T317700) | Proportion of published edits that add new content and are still reverted within 48 hours. Will include a breakdown of revert rate of published edits with and without a reference added. |
| 2) | Edit completion rate drastically decreases | Proportion of edits that reach the point Reference Check was shown or would be shown that are successfully published (event.action = saveSuccess) |
| 3) | People shown Reference Check are blocked at higher rates | Proportion of contributors blocked after publishing an edit where Reference Check was shown |
| 4) | High false positive or false negative rates | A) Proportion of new content edits published without a reference and without being shown Reference Check (indicator of false negative) & B) Proportion of contributors that dismiss adding a citation and select "I didn't add new information" or other indicator that their edit doesn't require a citation |
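Each guardrail above is a proportion compared between the control and test groups. As an illustration only, the sketch below compares edit completion rates (guardrail 2) with a pooled two-proportion z-test. The choice of test is an assumption and is not necessarily the method the analysis will use; the function name and the counts in the usage lines are hypothetical.

```python
import math


def two_proportion_z_test(x1, n1, x2, n2):
    """Pooled two-proportion z-test; returns (z, two-sided p-value).

    x1/n1: successes and trials in group 1 (e.g. edits saved / edits that
    reached the point where Reference Check was or would have been shown).
    x2/n2: the same counts for group 2.
    """
    if n1 <= 0 or n2 <= 0:
        raise ValueError("both groups need at least one observation")
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    if se == 0:
        raise ValueError("standard error is zero; all outcomes are identical")
    z = (p1 - p2) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: control saved 90 of 100 sessions, test saved 70 of 100.
z, p = two_proportion_z_test(90, 100, 70, 100)
# Identical rates give z = 0 and p = 1.
z_same, p_same = two_proportion_z_test(80, 100, 80, 100)
```

The same function could be applied to the revert and block proportions, with the caveat that very small counts (blocks are rare events) make the normal approximation unreliable.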
A/B Test: Decision Matrix
| ID | Scenario | Indicator(s) | Plan of Action |
|---|---|---|---|
| 1) | Reference Check is disrupting, discouraging, or otherwise getting in the way of volunteers who are attempting to make edits in good faith. Read: people are less likely to publish the edits they start. | Significant drop in edit completion and spike in edit abandonment in edit sessions where Reference Check is activated. Will include breakdown to review edits where reference reliability check was included. | Pause scaling plans; investigate changes to UX |
| 2) | Reference Check is increasing the likelihood that people will publish destructive edits | Increase in the proportion of contributors blocked after publishing an edit where Reference Check was activated; increase in the proportion of published edits where Reference Check was activated that are reverted within 48 hours, relative to new content edits where Reference Check was NOT activated. | Pause scaling plans; review edits to try to identify patterns of abuse and propose UX changes to mitigate them |
| 3) | Reference Check is causing people to publish edits that align with project policies | Increase in the proportion of edits where Reference Check was activated that include a reference and are not reverted within 48 hours, relative to new content edits without a reference where Reference Check was NOT activated | Move forward with scaling plans |
| 4) | Reference Check is effective at causing people to accompany new content edits with a reference, but those references are unreliable | Increase in the proportion of published edits where Reference Check was activated that include a reference, and increase in the proportion of these edits that are reverted within 48 hours | Block scaling plans and consider mitigations to address reference reliability (e.g. T276857) |
| 5) | Reference Check is not effective at causing people to accompany new content edits with a reference, but is not disruptive to volunteers. | No change or decrease in the proportion of published edits where Reference Check was activated that include a reference, and A) no significant drop in edit completion rate or spike in abandonment rate or B) no significant spike in block or revert rate | Move forward with scaling plans |
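The matrix above can be read as a mapping from observed indicators to a scenario and its plan of action. The sketch below is one possible encoding, assuming each indicator has already been reduced to a boolean by the analysis. The function name, the parameter names, and especially the precedence order used when several indicators fire at once are assumptions for illustration; the matrix itself does not define a precedence.

```python
# Plans of action, abbreviated from the decision matrix.
PLANS = {
    1: "Pause scaling plans; investigate changes to UX",
    2: "Pause scaling plans; review edits for abuse patterns and propose UX changes",
    3: "Move forward with scaling plans",
    4: "Block scaling plans and consider reference reliability mitigations",
    5: "Move forward with scaling plans",
}


def classify_scenario(completion_dropped, blocks_up, reverts_up, references_up):
    """Map boolean indicators to a scenario ID from the decision matrix.

    The order of the checks is an assumed precedence, not part of the plan.
    """
    if completion_dropped:
        return 1  # Reference Check is getting in the way of good-faith edits
    if references_up and reverts_up:
        return 4  # more references, but the referenced edits are reverted more
    if blocks_up or reverts_up:
        return 2  # more destructive edits
    if references_up:
        return 3  # more referenced, unreverted edits
    return 5      # no effect on references, no disruption


scenario = classify_scenario(
    completion_dropped=False, blocks_up=False, reverts_up=False, references_up=True
)
plan = PLANS[scenario]
```

In practice the indicators are unlikely to be cleanly boolean, so a sketch like this is mainly useful for checking that every combination of outcomes has an agreed plan of action.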
Done
- Analysis is complete and report published | @Iflorez
- All "Decisions to be made" are addressed and documented | @ppelberg
- Experiment results are published on-wiki (mw:Edit_check/Reference_Check#Experiment_results) | @Iflorez