Develop a ML-based service to predict reverts on Wikipedia(s)
Open, In Progress, High, Public

Assigned To

Authored By

	diego
	Aug 2 2022, 12:46 PM

Description

The Research team, in collaboration with the ML-Platform team, is creating a new service to help patrollers detect revisions that might be reverted.

Requirements:

One single model for all Wikipedia languages (use wiki_db as a parameter).
The model should be primarily language agnostic (check the subtasks).
The model will be able to run for single revisions or batches.
The model should be able to run in Lift Wing.

Please follow the progress of this project on the related tasks.

Related Objects

Status	Assigned	Task
In Progress	diego	T314384 Develop a ML-based service to predict reverts on Wikipedia(s)
Resolved	diego	T314385 Create a language agnostic model to predict reverts on Wikipedia
Resolved	diego	T314386 Create a multilingual model to predict reverts on Wikipedia
Duplicate	None	T329071 Integration of Revert Risk Scores to Recent Changes as a filter
Resolved	diego	T336421 Revert prediction data request
Open	MunizaA	T343061 Denylist for language agnostic revert risk model
Resolved	fkaelin	T349755 Training pipeline for Revert Risk Language Agnostic (RRLA) model
Resolved	XiaoXiao-WMF	T351897 Set the thresholds Revert Risk models to be used on the Recent Changes Feed (via ORES Extension)

Event Timeline

diego created this task. Aug 2 2022, 12:46 PM

Restricted Application added a subscriber: Aklapper. Aug 2 2022, 12:46 PM

Reedy renamed this task from Develop a ML-based service to predict reverts on Wikipedia(s) to Develop a ML-based service to predict reverts on Wikipedia(s). Aug 2 2022, 12:47 PM

diego changed the task status from Open to In Progress. Aug 2 2022, 12:53 PM

diego claimed this task.

diego triaged this task as High priority.

diego added projects: Research, Epic.

diego updated the task description.

diego added subscribers: calbon, AikoChou, MunizaA.

diego added a subscriber: leila.

diego changed the status of subtask T314385: Create a language agnostic model to predict reverts on Wikipedia from Open to In Progress. Aug 2 2022, 12:58 PM

• Di3sel1975 added a subtask: T314452: VisualEditor returns error in blkwiki. Aug 3 2022, 6:34 AM

Aklapper removed a subtask: T314452: VisualEditor returns error in blkwiki. Aug 3 2022, 6:41 AM

diego moved this task from Backlog to FY2022-23-Research-July-September on the Research board. Aug 31 2022, 3:05 PM

diego edited projects, added Research (FY2022-23-Research-July-September); removed Research.

diego moved this task from FY2022-23-Research-July-September to In Progress on the Research board. Aug 31 2022, 3:08 PM

diego edited projects, added Research; removed Research (FY2022-23-Research-July-September).

ppelberg mentioned this in T317700: Enable product analytics to use revision risk to assess edit quality in feature analyses. Sep 13 2022, 8:12 PM

It has been decided to focus on knowledge integrity risks from two categories of our taxonomy:

Content: prevalence and response to vandalism (using data generated from T314384)
Community: capacity (shortage of resources in content moderation | admin burnout), governance (barriers to adminship rights) and demographics (geographical diversity of editors/readers)

In T314384#8353933, @Pablo wrote:

It has been decided to focus on knowledge integrity risks from two categories of our taxonomy:

I think this comment shouldn't go on this task.

diego updated the task description. Nov 1 2022, 4:40 PM

For the record, here is a snippet (by @achou) for trying the models from within WMF's cluster.

Language-Agnostic:

curl "https://inference.svc.codfw.wmnet:30443/v1/models/revert-risk-model:predict" -d @input.json -H "Host: revert-risk-model.experimental.wikimedia.org" --http1.1 -k

Multilingual:

curl "https://inference-staging.svc.codfw.wmnet:30443/v1/models/revert-risk-model:predict" -d @input.json -H "Host: revert-risk-model.experimental.wikimedia.org" --http1.1 -k

An example for input.json: { "lang": "ru", "rev_id": 123855516 }
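
For anyone who prefers a script to curl, the same request can be sketched in Python using only the standard library. This is a minimal sketch, not the ML-Platform team's client: the URL, Host header and payload fields are copied from the snippet above, while the function names build_request and build_batch are hypothetical, the Content-Type header is an assumption (the curl command does not set it), and "batching" here just means one request per revision on the client side. The request is only built, not sent, since the endpoint is reachable only from inside the cluster.

```python
import json
import urllib.request
from typing import List

# URL and Host header copied from the language-agnostic curl snippet above.
ENDPOINT = "https://inference.svc.codfw.wmnet:30443/v1/models/revert-risk-model:predict"
HOST_HEADER = "revert-risk-model.experimental.wikimedia.org"


def build_request(lang: str, rev_id: int) -> urllib.request.Request:
    """Build (but do not send) a POST request for a single revision.

    Hypothetical helper; the payload shape mirrors the input.json example.
    """
    body = json.dumps({"lang": lang, "rev_id": rev_id}).encode("utf-8")
    return urllib.request.Request(
        ENDPOINT,
        data=body,
        # Content-Type is an assumption; the curl snippet does not set it.
        headers={"Host": HOST_HEADER, "Content-Type": "application/json"},
        method="POST",
    )


def build_batch(lang: str, rev_ids: List[int]) -> List[urllib.request.Request]:
    """Client-side batching: one request per revision ID."""
    return [build_request(lang, rev_id) for rev_id in rev_ids]


req = build_request("ru", 123855516)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Actually sending the request would need access to the internal network and, to mirror curl's -k flag, a TLS context with certificate verification disabled, so that step is left out here.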

kostajh subscribed. Feb 7 2023, 4:23 PM

Updates

Discussing the integration of Revert Risk on MediaWiki: T329071

diego closed subtask T314385: Create a language agnostic model to predict reverts on Wikipedia as Resolved. Apr 12 2023, 3:30 PM

Samwalton9-WMF subscribed. May 2 2023, 9:30 AM

KStoller-WMF mentioned this in T323811: [EPIC] Community configuration 2.0: Factor Community configuration out of GrowthExperiments. May 3 2023, 5:27 PM

Samwalton9-WMF mentioned this in T299436: How impactful would pre-save automoderation be on edit save times?. May 8 2023, 2:51 PM

What does the timeline/roadmap look like for getting this model into Liftwing and available as an API? Our team is considering working on a project next year leveraging this model and it would be helpful to know what the timeline would be.
cc @calbon

Samwalton9-WMF closed subtask T336421: Revert prediction data request as Resolved. May 15 2023, 1:32 PM

I've talked with the ML-Platform folks, and we are going to have this API available within this month. @achou is working on this, and will let us know when the public end-point is available

In T314384#8855349, @diego wrote:

I've talked with the ML-Platform folks, and we are going to have this API available within this month. @achou is working on this, and will let us know when the public end-point is available

Amazing, thanks!

Samwalton9-WMF mentioned this in T336934: Enable communities to configure automated reversion of bad edits. May 18 2023, 1:12 PM

Both models (Language-Agnostic and Multilingual) have been deployed to Lift Wing production. (T332998, T333124) The next step is to work on the public endpoints.

@Samwalton9, will your team be using internal endpoints or public endpoints for the project? Is there any more documentation about this potential project?

achou added a project: Machine-Learning-Team. May 19 2023, 8:16 AM

In T314384#8863625, @achou wrote:

Both models (Language-Agnostic and Multilingual) have been deployed to Lift Wing production. (T332998, T333124) The next step is to work on the public endpoints.

@Samwalton9, will your team be using internal endpoints or public endpoints for the project? Is there any more documentation about this potential project?

Thanks for the update @achou! You can find further information on our plans so far at T336934 or https://docs.google.com/presentation/d/1YiF9rfDKoTvoKVdRUYXAXB2jl6oQhn5-55emQ6TQCg8/edit. We're still in the very early phase of planning on this so I don't have a good answer for you about internal vs public just yet.

calbon moved this task from Unsorted to Watching on the Machine-Learning-Team board. May 30 2023, 2:11 PM

leila moved this task from In Progress to Epics on the Research board. Jul 26 2023, 7:23 PM

diego added a subtask: T349755: Training pipeline for Revert Risk Language Agnostic (RRLA) model. Oct 25 2023, 7:56 PM

diego added a subtask: T351897: Set the thresholds Revert Risk models to be used on the Recent Changes Feed (via ORES Extension). Nov 23 2023, 3:36 PM

diego mentioned this in T341819: Explore alternatives for Revert Risk model improvements for Wikipedia. Jan 22 2024, 4:47 PM

XiaoXiao-WMF closed subtask T351897: Set the thresholds Revert Risk models to be used on the Recent Changes Feed (via ORES Extension) as Resolved. Apr 15 2024, 11:11 PM

To keep this task updated, models for Wikipedia are ready and can be found here:

fkaelin closed subtask T349755: Training pipeline for Revert Risk Language Agnostic (RRLA) model as Resolved. Jul 23 2024, 10:07 AM
