Phabricator

How impactful would pre-save automoderation be on edit save times?
Closed, Resolved · Public


The Moderator Tools team plans to build an 'automoderator' - T336934: Enable communities to configure automated prevention or reversion of bad edits. This tool would enable communities to use machine learning models to automatically prevent or revert vandalism. It could hypothetically run before an edit is saved, or after.

We have assumed, based on previous discussions, that a pre-save check would take too much time to be feasible, negatively impacting edit save times too significantly. But we'd like to explore this in more detail to confirm whether this is the case.

The concrete ask our team has is: How impactful would checking edits against the Language-agnostic revert risk model and/or Multilingual revert risk model be on edit save times?

Additionally, from the 2022 Community Wishlist Survey:

Problem: Abuse Filters are a great way of preventing problematic edits before they happen. However, guessing "problematic" is currently done using user segmentation, common phrases used by vandals, etc. We have a much better tool to determine if an edit is destructive: ORES. If we were able to prevent all edits above a certain threshold, the workload on patrollers would be significantly reduced and, possibly, it would prevent some communities from requesting an all-out IP-editing ban.

This would have to be blazing fast, and not use any MediaWiki API or prediction pre-cache. But I think it could be doable, and it's a concrete ask from the community, so we should see what we can do.

Event Timeline

Some thoughts about this:

  1. We could trade some model performance (accuracy, F1, etc.) for prediction speed by trimming the branches of any decision trees.
  2. It'd have to take wikitext as an input, with no calls to an external API.
  3. It couldn't use any pre-cache.
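To make the trade-off in point 1 concrete, here is a minimal sketch using scikit-learn on synthetic data (not the actual revert-risk model): capping tree depth shrinks the structure each prediction has to traverse, usually at some cost in accuracy.

```python
# Illustrative only: shows how pruning a decision tree (here, capping
# max_depth) reduces the number of nodes a prediction must traverse,
# trading a little accuracy for faster scoring.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# An unconstrained tree grows until the leaves are pure.
full = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
# A depth-capped tree has at most 2**(d+1) - 1 nodes, so each scored
# edit follows a much shorter root-to-leaf path.
pruned = DecisionTreeClassifier(max_depth=4, random_state=0).fit(X_train, y_train)

print("nodes:", full.tree_.node_count, "vs", pruned.tree_.node_count)
print("accuracy:", full.score(X_test, y_test), "vs", pruned.score(X_test, y_test))
```

Since prediction cost is proportional to path length, the pruned tree scores edits faster; whether the accuracy loss is acceptable depends on the threshold the community would set.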
Samwalton9-WMF added a subscriber: Samwalton9-WMF.

@calbon Re-opening because I'd find it valuable if you could investigate this topic again for some work our team is exploring for next year. Details below, let me know what you think!

As part of our research into which moderator workflows we should prioritise next year we learned about community desire for (and existing implementations of) algorithmic reversion or prevention of obviously bad edits. This is already being done by technical volunteers on larger wikis through bots such as ClueBot NG, SeroBOT, and Salebot. The Research team has been working on T314384: Develop a ML-based service to predict reverts on Wikipedia(s), and so we could imagine leveraging this service to prevent or revert content across more projects, and in a more effective way than existing ORES-based bots.

As far as I can tell there are two avenues we could go down for this: pre-edit prevention or post-edit reversion. Post-edit reversion seems like a clear project - we could build a bot like those I linked above and plug it into an API for this model. But preventing edits before they're made might be desirable instead (or also, there are pros and cons to both approaches). AbuseFilter already has a wide range of configuration options, so this could save us development time if we could simply plug it into an existing extension rather than building something from scratch.
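For the post-edit reversion avenue, the core of such a bot could look roughly like the sketch below. The Lift Wing endpoint URL and the response shape here are assumptions about the revert-risk model's public inference service and should be checked against its documentation before use; the threshold is hypothetical and would be tuned per community.

```python
# Hypothetical sketch of a post-edit reversion bot's decision step.
# The endpoint URL and JSON response shape are assumptions, not a
# confirmed API contract.
import json
import urllib.request

LIFTWING_URL = (
    "https://api.wikimedia.org/service/lw/inference/v1/"
    "models/revertrisk-language-agnostic:predict"
)
THRESHOLD = 0.95  # hypothetical cut-off; would be community-configurable


def revert_risk(rev_id: int, lang: str) -> float:
    """Score one saved revision with the language-agnostic model."""
    payload = json.dumps({"rev_id": rev_id, "lang": lang}).encode()
    req = urllib.request.Request(
        LIFTWING_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    # Assumed response shape: probability that the edit will be reverted.
    return body["output"]["probabilities"]["true"]


def should_revert(score: float, threshold: float = THRESHOLD) -> bool:
    """Pure policy step, kept separate so it is easy to test and tune."""
    return score >= threshold
```

Because this runs after save, the network round trip to the model service never sits on the edit-save path, which is the key difference from the AbuseFilter approach.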

I've seen hesitation about leveraging AbuseFilter for this purpose because filters need to run before an edit is saved, and we're hesitant about increasing edit save times. To help us make a decision about which approach to take, I'd love to know if this is actually a blocker, or if we could imagine a system which is performant enough that we wouldn't substantially increase edit save times.
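One way to answer that question empirically would be a small harness that times a candidate pre-save check against a latency budget. This is only an illustrative sketch: the 50 ms budget is a made-up number for the example, not an agreed target for edit save times.

```python
# Illustrative harness for measuring the overhead a pre-save model
# check would add to edit save time. The budget value is hypothetical.
import time

SAVE_TIME_BUDGET_MS = 50.0  # hypothetical acceptable added latency


def measure_overhead_ms(check, *args, **kwargs) -> float:
    """Run one scoring call and return its wall-clock cost in ms."""
    start = time.perf_counter()
    check(*args, **kwargs)
    return (time.perf_counter() - start) * 1000.0


def within_budget(overhead_ms: float, budget_ms: float = SAVE_TIME_BUDGET_MS) -> bool:
    """Decide whether a measured overhead fits the save-time budget."""
    return overhead_ms <= budget_ms
```

Running `measure_overhead_ms` against a real model call (at several percentiles, not just the mean) would give the team concrete numbers to weigh against whatever budget is considered acceptable.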

@Samwalton9 while researching my own revertbot [1] I made a similar proposal to the Community Wishlist, which triggered a discussion very similar to this one: [2]


Samwalton9-WMF renamed this task from "Live Vandalism Detection" to "How impactful would pre-save automoderation be on edit save times?". Jun 2 2023, 12:19 PM
Samwalton9-WMF updated the task description.

I think @calbon has looked into this far enough that we understand this is a viable approach, but we've decided to go the route of reverting edits, to increase transparency in Automoderator's decision-making.