As part of our model testing process (T342641), we need datasets of edits and their Revert Risk model scores for the wikis on which we want feedback and input.
Projects
The top 150 Wikipedias, per the Wiki comparison data.
Datasets
We want 25,000 random edits per project, each with its Revert Risk score. Only article namespace edits should be included. For each edit, we also want to include the following data points, which broadly match the criteria by which Automoderator will avoid edits:
- Is the edit a self-revert? (i.e. the edit is a revert of an edit made by the same user)
- Is the edit a page creation?
- Was the edit made by a bot?
- Is the user an administrator?
- Does the edit have the newcomer task links tag?
- Does the edit have the contenttranslation tag?
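
As a rough illustration of the dataset shape described above, here is a minimal Python sketch that filters to the article namespace, draws a random sample per project, and derives one boolean column per question in the list. It is not the actual query or pipeline. The input record fields (`wiki`, `namespace`, `rev_id`, `parent_id`, `user`, `user_groups`, `tags`, `reverted_user`, `revert_risk_score`) are hypothetical names chosen for the example, and the exact name of the newcomer task links tag is an assumption that should be checked against the wiki's tag list.

```python
import random
from collections import defaultdict

SAMPLE_SIZE = 25_000          # edits wanted per project
ARTICLE_NAMESPACE = 0         # main/article namespace in MediaWiki

# Assumption: exact tag name for the newcomer task links tag; verify before use.
NEWCOMER_LINKS_TAG = "newcomer task add link"
CONTENT_TRANSLATION_TAG = "contenttranslation"


def annotate(edit):
    """Turn one raw edit record (hypothetical schema) into a dataset row."""
    groups = set(edit.get("user_groups", []))
    tags = set(edit.get("tags", []))
    reverted_user = edit.get("reverted_user")  # None if the edit is not a revert
    return {
        "wiki": edit["wiki"],
        "rev_id": edit["rev_id"],
        "revert_risk_score": edit["revert_risk_score"],
        # Self-revert: the edit reverts an edit made by the same user.
        "is_self_revert": reverted_user is not None and reverted_user == edit["user"],
        # Page creation: the revision has no parent (parent ID of 0).
        "is_page_creation": edit["parent_id"] == 0,
        "is_bot": "bot" in groups,
        "is_admin": "sysop" in groups,
        "has_newcomer_links_tag": NEWCOMER_LINKS_TAG in tags,
        "has_contenttranslation_tag": CONTENT_TRANSLATION_TAG in tags,
    }


def sample_per_project(edits, sample_size=SAMPLE_SIZE, seed=0):
    """Return {wiki: [rows]} with up to sample_size random article edits each."""
    by_wiki = defaultdict(list)
    for edit in edits:
        if edit["namespace"] == ARTICLE_NAMESPACE:
            by_wiki[edit["wiki"]].append(edit)

    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    sampled = {}
    for wiki, wiki_edits in by_wiki.items():
        # Smaller wikis may have fewer eligible edits than the target size.
        k = min(sample_size, len(wiki_edits))
        sampled[wiki] = [annotate(e) for e in rng.sample(wiki_edits, k)]
    return sampled
```

Sampling after the namespace filter keeps the 25,000 target meaningful per project, and capping at the number of available edits avoids an error on wikis with fewer eligible edits.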