
Understanding the usage and impact of the Flagged Revisions extension
Closed, Resolved, Public

Description

Goals:
We are currently drafting the following hypothesis: If we produce a report about the usage and impact of the Flagged Revisions extension, we will understand the use cases, value, and impact of pre-publication edit review, enabling the Moderator Tools team to make informed decisions about future work.

This hypothesis will directly inform work that our team carries out either this FY or next. We feel that the area of pre-publication edit review is poorly understood, as evidenced by FlaggedRevs languishing in its current state of 'deployed but no longer deployable'. We can't make decisions about what to do with this extension, or whether pre-publication edit review should be integrated more widely, without first understanding the impact FlaggedRevs has.

Provide details about your request here:
The Flagged Revisions extension (https://meta.wikimedia.org/wiki/Flagged_Revisions) is used in nearly 50 Wikimedia projects. It has two primary feature sets: patrolling tools and, more controversially, hiding new edits from readers until an experienced editor has reviewed them. We (the WMF) have in the past theorised that this has a negative impact on new editor growth, despite support from experienced contributors. As a result we stated that we will no longer deploy the extension on more wikis. As far as I can tell we've never justified this with solid data, research, or other kinds of reporting. Our team would like to generate a report on FlaggedRevs and its impact so that we can make informed decisions about what to do with the software, and what aspects of it we might want to retain or disable.

We envision this user research having two strands:

  • Readers/new editors - impact of hiding new edits until reviewed. What are non-editors' expectations of moderation on our wikis? Do they expect their edit to be live right away? Are they dissuaded from editing because moderation will happen? How long do they expect review to take? How do they feel about making an edit that doesn't go live immediately?
  • Experienced editors - How do experienced editors feel about pre-publication review? What is their experience of using the FlaggedRevs extension to patrol edits (its other major feature)? Do they think it has a positive impact on their community? What's frustrating about FlaggedRevs? If FlaggedRevs went away, what features would they need to replace it?

Prior insights
Community discussions

Research and data

Deliverable: This research, and its findings, are now publicly available through its Meta-Wiki page and Results sub-page.

Details

Due Date
Mar 31 2025, 12:00 AM

Event Timeline

Hi @Samwalton9-WMF, I'm curious if you've had a chance to read a paper by Tran et al that's just come out (https://dl.acm.org/doi/pdf/10.1145/3686954). As you review it, would you be able to comment on to what degree that work addresses some of the questions you have captured in this ticket? Anything specific about the degree to which it does - or doesn't - address your questions would be helpful to know.
Separately, as this is an existing product, I'm curious to know if you already have any work underway (or completed) to quantitatively measure potential impacts of the extension; for example, via impact on metrics we have available to us. I ask because such work could be a very nice complement to the work described in this ticket. Anything you can comment on in this regard? Also, CCing @mpopov: is this a topic that product analytics has worked on in the past and/or has considered prioritizing in upcoming quarters? It's potentially an interesting area of collaboration we might consider if conditions allow.

> Hi @Samwalton9-WMF, I'm curious if you've had a chance to read a paper by Tran et al that's just come out (https://dl.acm.org/doi/pdf/10.1145/3686954). As you review it, would you be able to comment on to what degree that work addresses some of the questions you have captured in this ticket? Anything specific about the degree to which it does - or doesn't - address your questions would be helpful to know.

I haven't! I did read their previous work and found it very interesting though, so I'll give this a read.

> Separately, as this is an existing product, I'm curious to know if you already have any work underway (or completed) to quantitatively measure potential impacts of the extension; for example, via impact on metrics we have available to us. I ask because such work could be a very nice complement to the work described in this ticket. Anything you can comment on in this regard? Also, CCing @mpopov: is this a topic that product analytics has worked on in the past and/or has considered prioritizing in upcoming quarters? It's potentially an interesting area of collaboration we might consider if conditions allow.

@KCVelaga_WMF did some work on measuring FlaggedRevs backlogs in T348863 and I compiled a dataset about FlaggedRevs' configurations and backlog sizes/wait times. We also noted the impact of FlaggedRevs on the number of readers who see vandalised content in T348861#9728894.

In case you're interested in what the dewiki community thinks about FlaggedRevs ("Gesichtete Versionen"), you may have a look at a survey held in 2020: https://de.wikipedia.org/wiki/Wikipedia:Umfragen/Abschaffung_der_Gesichteten_Versionen_(%E2%80%9EAutorenschwund%E2%80%9C). Some participants also share thoughts about possible modifications/improvements of the status quo.

> In case you're interested in what the dewiki community thinks about FlaggedRevs ("Gesichtete Versionen"), you may have a look at a survey held in 2020: https://de.wikipedia.org/wiki/Wikipedia:Umfragen/Abschaffung_der_Gesichteten_Versionen_(%E2%80%9EAutorenschwund%E2%80%9C). Some participants also share thoughts about possible modifications/improvements of the status quo.

Thanks - this is very interesting! I'm going to start a section in the task description for discussions + research so we have prior insights listed in one place.

> @KCVelaga_WMF did some work on measuring FlaggedRevs backlogs in T348863 and I compiled a dataset about FlaggedRevs' configurations and backlog sizes/wait times. We also noted the impact of FlaggedRevs on the number of readers who see vandalised content in T348861#9728894.

In addition, we have a pipeline, developed as part of T362615, that aggregates a couple of metrics related to Flagged Revisions (the number of revisions pending review and the average time taken for revisions to be reviewed). These are calculated at the end of each hour for all Wikipedias where FlaggedRevisions is enabled. The data is available at wmf_product.moderation_flagged_revisions_pending_hourly - it can be used to establish baselines and observe how these aggregates change over time.
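
As a rough illustration of how hourly aggregates like these could be turned into a baseline, here is a minimal Python sketch. The field names (`wiki`, `hour`, `pending_count`, `avg_review_hours`) and the sample rows are assumptions made up for illustration; the actual schema and contents of wmf_product.moderation_flagged_revisions_pending_hourly are not described in this task.

```python
from collections import defaultdict
from statistics import median

# Hypothetical hourly rows. Field names and values are illustrative only;
# they are NOT the real schema or real data from the table.
SAMPLE_ROWS = [
    {"wiki": "dewiki", "hour": "2024-11-01T00", "pending_count": 120, "avg_review_hours": 10.0},
    {"wiki": "dewiki", "hour": "2024-11-01T01", "pending_count": 100, "avg_review_hours": 12.0},
    {"wiki": "dewiki", "hour": "2024-11-01T02", "pending_count": 140, "avg_review_hours": 8.0},
    {"wiki": "plwiki", "hour": "2024-11-01T00", "pending_count": 40, "avg_review_hours": 5.0},
    {"wiki": "plwiki", "hour": "2024-11-01T01", "pending_count": 60, "avg_review_hours": 7.0},
]


def baseline_by_wiki(rows):
    """Return the median pending count and median review time per wiki."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["wiki"]].append(row)
    return {
        wiki: {
            "median_pending": median(r["pending_count"] for r in wiki_rows),
            "median_review_hours": median(r["avg_review_hours"] for r in wiki_rows),
        }
        for wiki, wiki_rows in grouped.items()
    }


def deviation_from_baseline(row, baselines):
    """Relative change of one hourly pending count against its wiki's baseline.

    Returns None when the baseline is zero, since the ratio is undefined.
    """
    base = baselines[row["wiki"]]["median_pending"]
    if base == 0:
        return None
    return (row["pending_count"] - base) / base


baselines = baseline_by_wiki(SAMPLE_ROWS)
new_hour = {"wiki": "dewiki", "pending_count": 150}
print(baselines["dewiki"])
print(deviation_from_baseline(new_hour, baselines))
```

The median is used here rather than the mean so that a single unusually large backlog hour does not shift the baseline; that is a design choice for this sketch, not something the pipeline itself prescribes.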

leila triaged this task as Medium priority. (Dec 20 2024, 12:11 AM)
leila added a project: Essential-Work.
leila moved this task from Backlog to Staged on the Research board.
leila set Due Date to Mar 31 2025, 12:00 AM. (Dec 20 2024, 12:13 AM)

@Samwalton9-WMF we have prioritized this task as part of our essential work and I expect that we can move it to "In progress" in the first half of Q3. We're still working on resourcing it but we're confident we can do it in the coming quarter.

> Hi @Samwalton9-WMF, I'm curious if you've had a chance to read a paper by Tran et al that's just come out (https://dl.acm.org/doi/pdf/10.1145/3686954). As you review it, would you be able to comment on to what degree that work addresses some of the questions you have captured in this ticket? Anything specific about the degree to which it does - or doesn't - address your questions would be helpful to know.

I think this is very interesting! I had read their previous paper but hadn't spotted this one. I think these user interview insights are really valuable - they go some way to answering the editor-facing questions we had about perceptions, though, as they note, non-English participation is relatively limited, so I think we have scope to do a better job there. The paper doesn't answer any of the reader-facing questions we have.

They raise some interesting questions, like 'What is the benefit of Flagged Revisions to patrollers on wikis where it doesn't prevent pageviews?' which is one of the topics I alluded to in the task description.

Easikingarmager raised the priority of this task from Medium to High.
Easikingarmager moved this task from Staged to In Progress on the Research board.

We are starting to scope out directions for this project. Given our due date, we will focus primarily on the experiences of experienced editors who currently use Flagged Revisions as a part of their contributions. Recent discussions on Pending Changes on English Wikipedia are very thought-provoking, and Pending Changes is related to Flagged Revisions, but for scoping purposes we are most likely to focus on wikis using FlaggedRevs in override mode. This is subject to change depending on how data collection goes, but for now we can say that we're focusing on the experiences of editors, their use of this extension and their perceptions of its effects on their communities.

Update - continuing to work on problem definition and finding existing (community-produced) literature on the topic.

In preparation for the kickoff meeting, we've started a discussion guide. I expect finalization of research methods and goals to happen by the start of next week.

Our kickoff meeting has concluded and we have sketched out the following phases:

  • Preparation: Feb 21 - Mar 3 (discussion guide development, recruitment planning, logistics framework setup)
  • Recruitment and interviews: Mar 3 - Mar 21 (active recruitment, interviews in parallel)
  • Analysis and deliverable writing: Mar 24 - Mar 31

Our main goal for the study is to understand the impact of FlaggedRevisions on the patrolling workflow, and editing experiences, of experienced wiki editors. To that end, we will target wikis that use three different modes of Flagged Revisions:

  • Override, where readers see the stable revision of a page across the whole site by default. Currently we are looking at German and Polish Wikipedias.
  • Protect, where readers see the stable revision of certain pages, rather than all pages by default. Currently we are looking at English Wikipedia.
  • Disabled, where the stable revision features are not in use but the extension is installed and in use. Currently we are looking at Ukrainian and Finnish Wikipedias.

Given the compressed timeframe of this study, we will focus on finding interview participants who are comfortable speaking English. @Samwalton9-WMF is assisting by querying recent (<1mo) users of FlaggedRevisions features to aid us in recruitment (thanks Sam!)

Brief update: 3 research sessions conducted this week, with more scheduled for next week.

We have concluded the interview portion of this study. We reached a total of 6 editors: two from German, two from Finnish, one from English, and one from Ukrainian Wikipedia. While this was a little lower than our desired coverage for English, we managed to get interviews with FlaggedRevs users from all three identified modes.
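
The coverage described above can be checked with a short tally. This is a minimal sketch: the mode-to-wiki mapping and the interview counts are taken from the updates in this task, while the variable and function names are purely illustrative.

```python
from collections import Counter

# Mode for each target wiki, as listed in the study plan in this task.
# The first mode is the one referred to earlier in this task as "override mode".
MODE_BY_WIKI = {
    "German": "Override",
    "Polish": "Override",
    "English": "Protect",
    "Ukrainian": "Disabled",
    "Finnish": "Disabled",
}

# Completed interviews per Wikipedia, as reported in the update above.
INTERVIEWS_BY_WIKI = {"German": 2, "Finnish": 2, "English": 1, "Ukrainian": 1}


def interviews_by_mode(interviews, mode_by_wiki):
    """Sum interview counts per Flagged Revisions mode."""
    totals = Counter()
    for wiki, count in interviews.items():
        totals[mode_by_wiki[wiki]] += count
    return totals


totals = interviews_by_mode(INTERVIEWS_BY_WIKI, MODE_BY_WIKI)
print("Total participants:", sum(totals.values()))
for mode in sorted(set(MODE_BY_WIKI.values())):
    print(mode, totals[mode])
```

Every mode ends up with at least one interview, which matches the statement that all three identified modes were covered, even though Polish Wikipedia contributed no participants.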

cwylo updated the task description.
cwylo updated the task description.

Our share-out has concluded and the results have been publicly documented on the project's Meta-Wiki page and Results sub-page.