
Proof of Concept: RecSys for Patrollers
Closed, ResolvedPublic

Description

While certain instances of vandalism or policy violations can be relatively easily detected using Machine Learning/AI-based tools, most sophisticated patrolling requires human judgment. The goal of this project is to study the feasibility of using AI-based tools to simplify the way that patrollers discover and prioritize content that requires their attention.

In this task, we are going to create and test a recommender system that matches revisions with patrollers. To do this, we will rely on existing ML/AI tools, such as scores from the Revert Risk model and the Peacock (also known as Tone Check) language detection model, to identify revisions that require patrolling. Unlike the Automoderator or Tone Check tools, which focus on cases with very high scores (e.g., score > 0.95), where the vandalism or policy violation is very clear, here we will consider milder scores (e.g., 0.5 < score < 0.9) and try to identify a set of experienced users who can review that content.

Copy of RecSysForPatrollers presentation 4 product.png (540×960 px, 90 KB)

The input for this RecSys will be: revision_id, score_p, where score_p is the probability returned by some of the aforementioned models. The output of the RecSys will be a set of N users that are likely to review the revision.

For ground truth, we are going to use the revision history and try to predict which user patrolled a given revision.
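The interface described above can be sketched as a hypothetical function. The thresholds, helper names, and the toy ranking below are illustrative assumptions, not an existing implementation:

```python
# Hypothetical sketch of the RecSys interface; thresholds and the toy
# ranking by past patrol counts are illustrative assumptions.

def needs_patrolling(score_p: float, low: float = 0.5, high: float = 0.9) -> bool:
    # Clear-cut cases (score > high) are already handled by tools like
    # Automoderator / Tone Check; only "mild" scores are routed to patrollers.
    return low < score_p < high

def recommend_patrollers(revision_id: int, score_p: float,
                         past_patrols: dict[str, int], n: int = 5) -> list[str]:
    # past_patrols: toy stand-in for the ground-truth signal, mapping each
    # candidate user to how often they patrolled similar revisions.
    if not needs_patrolling(score_p):
        return []
    ranked = sorted(past_patrols, key=past_patrols.get, reverse=True)
    return ranked[:n]

users = {"A": 12, "B": 3, "C": 7}
print(recommend_patrollers(1234, 0.72, users, n=2))  # ['A', 'C']
```

A real system would replace the toy ranking with a model trained on the revision history described above.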

The goals for Q1 FY25-26 are to:

This work is closely related to proposals by @fkaelin and @Pablo (T392210). I'm going to coordinate with them to avoid duplicating effort.

Details

Due Date
Oct 9 2025, 11:00 PM

Event Timeline

Miriam triaged this task as Medium priority.Aug 26 2025, 12:27 PM
Miriam set Due Date to Oct 9 2025, 11:00 PM.

The work was done. See the links in the task description.

Main Conclusions

  • Predicting users' patrolling behavior (which revision they are going to patrol next) is easier than predicting general editing behavior (which article they are going to edit next).
  • There are two main patrolling behaviors, depending on the nature of the patrolling task. Here we study two types of patrolling actions: reverts and adding patrolling templates.
    • Content-focused: people tend to revert more in articles they have edited previously. Only a few users focus on reverts across multiple articles they haven't edited in the past.
    • Task-focused: some users focus on adding templates related to certain policies. In this research we show there are certain users who focus on adding/removing the {{peacock}} template independently of the article.
  • Therefore, our suggestion is to build two types of RecSys: one task-based and one content-based.
  • With these conclusions we can build a recommender system that finds editors who can benefit from personalized recommendations combining their interests with the outcomes of the ML models for content integrity.
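The two-sided suggestion above could be sketched as a single combined score. The weights and the binary affinity signals below are illustrative assumptions, not definitions from the study:

```python
# Illustrative sketch of combining the content-based and task-based
# recommender signals; weights and affinity definitions are assumptions.

def combined_score(user_edited: set[str], user_templates: set[str],
                   article: str, template: str,
                   w_content: float = 0.5, w_task: float = 0.5) -> float:
    # Content-focused signal: has the user edited this article before?
    content_affinity = 1.0 if article in user_edited else 0.0
    # Task-focused signal: does the user specialise in this template,
    # e.g. adding/removing {{peacock}} across articles?
    task_affinity = 1.0 if template in user_templates else 0.0
    return w_content * content_affinity + w_task * task_affinity

# A peacock specialist still scores on an article they never edited:
print(combined_score({"Cats"}, {"peacock"}, article="Dogs", template="peacock"))  # 0.5
```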


diego updated the task description.

We got feedback on the datasets created in T408739:

Initial interest
Prior to any data being shared, interest in the project was high, with editors intrigued by the idea and stating that such a feature would likely be useful to them. Many had questions about how exactly the system would work. Questions included:

  • Would it analyse both their content creation and patrolling activities?
  • How far back in their editing history would it analyse?
  • Would it be biased towards recent activity?
  • Will it find edits to pages in topic areas they edit, as well as the specific pages they contributed to?

Testing feedback
Editors had feedback about the pages being selected for them to review. Although they thought it made sense to see edits to the precise pages they’ve edited in the past, many editors expressed a desire to see edits from other pages in the same topical area, or linked from the page they edited. Experienced editors are likely to already have pages they’ve edited on their watchlist, or otherwise be aware of the editing activity happening on them, so these recommendations were less useful.

  • Can we incorporate pages that are connected to, or in the same topical area, as pages the user has edited?

Some pages were ones where the patroller had reverted vandalism on the page once, and they were otherwise not necessarily interested in the page in general.

  • Can we select pages based on more significant editing experience, discounting or deprioritising pages where the user’s only edit is a single revert?
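One way to act on this suggestion would be a page-level weight that discounts drive-by interactions. The thresholds and the linear scaling below are hypothetical, for illustration only:

```python
# Hypothetical weighting that deprioritises pages where the user's only
# interaction was a single revert; thresholds are assumptions.

def page_weight(n_edits: int, n_reverts: int) -> float:
    # A lone revert is weak evidence of sustained interest in the page.
    if n_edits == 1 and n_reverts == 1:
        return 0.1
    # Otherwise scale with editing experience, capped at 1.0.
    return min(1.0, n_edits / 10)

print(page_weight(1, 1))   # 0.1  (single drive-by revert)
print(page_weight(7, 2))   # 0.7  (substantial editing history)
```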

Response to the edits being presented was generally positive: most users said they thought the majority of edits were worth reviewing, even in cases where the edit itself didn’t need to be reverted.

A few specific issues were highlighted that could be addressed in future iterations; the system should not include:

  • Edits made by experienced editors (e.g. other patrollers)
  • Edits made by the user we’re generating suggestions for.
  • Edits which have already been reverted.
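The three exclusion rules above amount to a simple candidate filter. The `Edit` record and its field names below are hypothetical:

```python
# Sketch of the exclusion rules from the feedback above; the Edit
# record and its field names are hypothetical.
from dataclasses import dataclass

@dataclass
class Edit:
    editor: str
    is_reverted: bool

def candidate_edits(edits: list[Edit], target_user: str,
                    experienced: set[str]) -> list[Edit]:
    return [e for e in edits
            if e.editor != target_user       # not the user's own edits
            and e.editor not in experienced  # not other experienced patrollers
            and not e.is_reverted]           # not already reverted

edits = [Edit("alice", False), Edit("bob", True), Edit("carol", False)]
print(candidate_edits(edits, target_user="alice", experienced={"dave"}))
```

Only carol's edit survives: alice's own edit and bob's already-reverted edit are excluded.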

An additional question for thought:

  • Can we incorporate some of the parameters this model uses as configurable parameters so users can customize their patrolling experience?