Analyse reversion activity by anti-vandalism bots
Closed, ResolvedPublic

Description

We want to learn more about existing solutions for automated anti-vandalism so that we can make informed decisions for Automoderator.

Nine Wikipedia communities have developed bots which automatically revert edits based on algorithms, generally machine learning models. We want to understand how much of the anti-vandalism burden these bots take on in their communities so that we can make informed judgements about the potential impact of Automoderator.

Project	Bot
en.wiki	ClueBot NG
es.wiki	SeroBOT
fr.wiki & pt.wiki	Salebot
fa.wiki	Dexbot
bg.wiki	PSS 9
simple.wiki	ChenzwBot
ru.wiki	Рейму Хакурей
ro.wiki	PatrocleBot

Questions

In each of the Wikimedia projects above, how many reverts does their anti-vandalism bot make per day, on average?
What is this as a percentage of all reverts made within 24 hours of an edit occurring?

Expanded scope:

Initial data shows that on average 10% of the bot reverts are reverted back. However, it may be that the vandal actors whose edits had been reverted were reverting them back. So we want to investigate further here, to understand the following:

How many of the reverts are possibly vandalism? Not all reverted edits are necessarily vandal edits. While there is no direct way to determine this, exploring revert rate by user segment might provide helpful comparative data (registered vs. anonymous, split by edit bucket, etc.)
Who are reverting the reverts of the anti-vandal bots?
PatruBOT was the first anti-vandal bot on Spanish Wikipedia, before SeroBOT. Analyse the false positive rate of PatruBOT before it was shut down.

Related Objects

Mentioned In: T348869: How many edits would Automoderator revert per day at different caution levels?
T343953: What is the breakdown of reverts made by humans & bots by their ORES score?
T342096: What is the edit revert rate on Indonesian Wikipedia?

Event Timeline

Samwalton9-WMF created this task. Jul 14 2023, 10:09 AM

Restricted Application added subscribers: Strainu, Aklapper. Jul 14 2023, 10:09 AM

KCVelaga_WMF claimed this task. Jul 14 2023, 10:34 AM

KCVelaga_WMF triaged this task as Medium priority.

KCVelaga_WMF added a project: Product-Analytics (Kanban).

KCVelaga_WMF edited projects, added Moderator-Tools-Team (Kanban); removed Moderator-Tools-Team. Jul 17 2023, 1:57 PM

KCVelaga_WMF moved this task from Ready to In Progress on the Moderator-Tools-Team (Kanban) board.

Samwalton9-WMF moved this task from Backlog to Data/Research on the Automoderator board. Jul 18 2023, 9:42 AM

Samwalton9-WMF mentioned this in T342096: What is the edit revert rate on Indonesian Wikipedia?. Jul 18 2023, 10:02 AM

@Samwalton9 I have a clarification question for the second question,

What is this as a percentage of all reverts made within 24 hours of an edit occurring?

I couldn't quite understand what "24 hours of an edit occurring" means.

Do we need the percentage of reverts made by the bot within 24 hours, out of all the reverts the respective bot made?
If we know that all the bot reverts happen within 24 hours (or we are not concerned with that), then the question can be:
1. Of all the reverts made on average per day, what percentage are made by these bots?
2. This can potentially tell us how much of the workload is handled by anti-vandal bots on the respective wikis.

Also, do we want to look at only the content namespaces of the respective wikis or all namespaces? I see some of these bots are also working on non-content namespaces, like talk, project namespaces, etc. That will depend on which namespaces will automoderator monitor. If it is all, then I get the data for all namespaces and maybe analyse the data by content vs. non-content, or we could just look at the content namespaces.

Great questions, thanks for digging in to this!

As I understand it, the bot reverts should all be happening very quickly (within minutes), so I wanted to constrain our analysis of human reverts to the ones made within 24 hours of the reverted edit occurring. In this way we're not comparing the very fast bot reverts to reverts a community might make, say, a month later. Those reverts are probably happening through a different process, i.e. not patrolling brand new edits, but rather looking at maintenance template categories or something.

So I think we want to know - how many reverts happen per day, where the time between edit and revert is less than 24 hours. How many bot reverts happen per day (we assume they all happen very quickly). And what is the ratio of these numbers. In this way we're comparing the bot to the 'fast patrolling' reverts, rather than all reverts.
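The comparison described above can be sketched roughly as follows. This is a minimal illustration only, not the query used for the analysis: the record layout (`edit_ts`, `revert_ts`, `reverter`) and the bot name `ExampleBot` are hypothetical.

```python
from datetime import datetime, timedelta

# Only reverts made within this window of the reverted edit are counted.
FAST_WINDOW = timedelta(hours=24)


def bot_share_of_fast_reverts(reverts, bot_name):
    """Return the bot's share (in percent) of reverts made within 24 hours.

    `reverts` is an iterable of dicts with the hypothetical keys
    `edit_ts`, `revert_ts` (datetimes) and `reverter` (user name).
    """
    fast = [r for r in reverts if r["revert_ts"] - r["edit_ts"] < FAST_WINDOW]
    if not fast:
        return 0.0
    by_bot = sum(1 for r in fast if r["reverter"] == bot_name)
    return 100.0 * by_bot / len(fast)


# Made-up sample: one bot revert, two fast human reverts, one slow human revert.
sample = [
    {"edit_ts": datetime(2023, 7, 1, 10, 0), "revert_ts": datetime(2023, 7, 1, 10, 1), "reverter": "ExampleBot"},
    {"edit_ts": datetime(2023, 7, 1, 11, 0), "revert_ts": datetime(2023, 7, 1, 15, 0), "reverter": "HumanA"},
    {"edit_ts": datetime(2023, 7, 1, 12, 0), "revert_ts": datetime(2023, 7, 1, 12, 30), "reverter": "HumanA"},
    # Reverted a month later, so it is left out of the comparison.
    {"edit_ts": datetime(2023, 6, 1, 9, 0), "revert_ts": datetime(2023, 7, 1, 9, 0), "reverter": "HumanB"},
]

share = bot_share_of_fast_reverts(sample, "ExampleBot")
print(f"{share:.1f}% of fast reverts were made by the bot")
```

The slow revert is dropped before the ratio is taken, so the bot is compared only against 'fast patrolling' reverts.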

Also, do we want to look at only the content namespaces of the respective wikis or all namespaces? I see some of these bots are also working on non-content namespaces, like talk, project namespaces, etc. That will depend on which namespaces will automoderator monitor. If it is all, then I get the data for all namespaces and maybe analyse the data by content vs. non-content, or we could just look at the content namespaces.

Good point. I think we should just look at the main namespace since Automoderator will only operate there (the model only supports main namespace).

In T341857#9033821, @Samwalton9 wrote:

Those reverts are probably happening through a different process, i.e. not patrolling brand new edits, but rather looking at maintenance template categories or something.

If I understand correctly, you want to obtain a measure for "how much of the RC patrolling is done by bots?". If this is the case, I strongly suggest not using the same 24h interval for wikis as different as enwp and rowp. There is an approach that I feel would give you more meaningful results here: identify the RC parameters used by the bot, then use the average duration covered by the same parameters on that wiki (for instance, without parameters, 500 RC pages cover ~14h on rowp during school months, and up to ~20h during the summer).

That average duration might be hard to find retroactively, so at least take the current interval with the defaults offered to new users when accessing RC: create a new user, access RC, select 500 changes (or whatever the maximum is) and see how far back the changes go. I believe in all cases the interval will be significantly smaller than 24h.
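As a rough illustration of the "see how far back the changes go" check, the span covered by the newest N changes could be computed as below. This is only a sketch: the function name and the synthetic timestamps are hypothetical, and real timestamps would have to come from the wiki's recent changes feed.

```python
from datetime import datetime, timedelta


def rc_window_span(change_times, limit=500):
    """Return the time span covered by the newest `limit` changes."""
    newest_first = sorted(change_times, reverse=True)[:limit]
    if len(newest_first) < 2:
        return timedelta(0)
    return newest_first[0] - newest_first[-1]


# Synthetic data: one change every 2 minutes, 1000 changes in total.
base = datetime(2023, 7, 20, 12, 0)
times = [base - timedelta(minutes=2 * i) for i in range(1000)]

span = rc_window_span(times, limit=500)
print(span)
```

On a quieter wiki the same 500 entries would stretch further back, which is the point made above about enwp and rowp differing.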

Thanks for sharing this insight @Strainu! I think we should stick with the 24 hour window because we're not only interested in Special:RecentChanges - we also want to consider reverts that happen from venues like your watchlist (mine goes back a few days) and page histories. We're just mostly interested in the short timeframe reverts rather than the long ones. Sticking to a consistent 24 hours will also make it easier for us to compare this data with Automoderator directly, since we won't have to re-calculate RC timeframes again in the future!

Update:

@Samwalton9 and I discussed the initial results (as mentioned below) and decided it would be best to expand the scope of this task to investigate further.

wiki_db	bot	all_reverts	bot_reverts	reverted_bot_reverts	n_days	bot_reverts_percent	reverted_bot_reverts_percent
bgwiki	PSS 9	58.0	4.0	0.3	947	7.18	8.35
enwiki	ClueBot NG	6576.0	358.0	33.6	1090	5.44	9.41
eswiki	SeroBOT	2127.0	897.0	78.6	1092	42.17	8.76
fawiki	Dexbot	489.0	224.0	24.5	439	45.81	10.96
frwiki	Salebot	731.0	26.0	2.9	1067	3.62	11.09
rowiki	PatrocleBot	59.0	8.0	0.8	429	13.14	10.19
ruwiki	Рейму Хакурей	709.0	66.0	8.3	1071	9.37	12.53
simplewiki	ChenzwBot	89.0	16.0	1.9	890	18.06	11.55

A couple of notes:

All counts are averages per day and cover reverts that took place within 24 hours of the reverted edit (non-content namespaces and page creations are excluded).
By default I considered three years of data (July 2020 to June 2023); however, I removed records for days on which the bots made no reverts, as they tend to skew the results.
reverted_bot_reverts and its percentage can help determine how many of the bot reverts were reverted back. However, we can't yet be sure that this data can be used to determine the false positive rate of these bots.
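The aggregation described in these notes might look something like the sketch below. It is an assumption-laden illustration: the per-day tuples are made up, and the percentages are computed here as ratios of totals, whereas the thread does not say how the reported percentages were derived (they may, for example, be means of daily percentages).

```python
def daily_averages(daily_counts):
    """Average per-day counts, skipping days on which the bot made no reverts.

    `daily_counts` is a list of hypothetical per-day tuples:
    (all_reverts, bot_reverts, reverted_bot_reverts).
    """
    active = [d for d in daily_counts if d[1] > 0]
    n_days = len(active)
    if n_days == 0:
        return None

    total_all = sum(d[0] for d in active)
    total_bot = sum(d[1] for d in active)
    total_undone = sum(d[2] for d in active)

    return {
        "n_days": n_days,
        "all_reverts": total_all / n_days,
        "bot_reverts": total_bot / n_days,
        "reverted_bot_reverts": total_undone / n_days,
        # Assumption: percentages as ratios of totals over the active days.
        "bot_reverts_percent": 100.0 * total_bot / total_all,
        "reverted_bot_reverts_percent": 100.0 * total_undone / total_bot,
    }


# Made-up data: the middle day has no bot reverts and is dropped.
days = [(50, 5, 1), (40, 0, 0), (70, 7, 0)]
result = daily_averages(days)
print(result)
```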

@Samwalton9 I have updated the task description to reflect our discussion. Please add if I missed anything, or change as needed.

KCVelaga_WMF moved this task from Next 2 weeks to Doing on the Product-Analytics (Kanban) board. Aug 2 2023, 8:54 AM

@Ladsgroup @Superzerocool It looks like SeroBOT and Dexbot handle a surprisingly high % of reverts on their respective wikis compared to other bots (see the bot_reverts_percent column above). I'm wondering if you have any thoughts on why that might be?

I think one big part of it for Persian is that the bot reverts but then the IP reverts the bot and so on. I have seen people complaining that it basically destroyed article history by reverting ten times back to back. We are planning to roll out automatic semi-protect in case of more than 3 reverts in 24 hours and that might change some numbers?

Oh that's interesting - thanks for that context. I know some of the other bots will only revert an editor once per page (per 24 hours or so), which maybe avoids this issue.
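For illustration, the "more than 3 reverts in 24 hours" threshold mentioned in this thread could be checked per page with a sliding window, roughly as below. This is a hypothetical sketch and not the implementation planned for fa.wiki; the function name and parameters are invented for the example.

```python
from collections import deque
from datetime import datetime, timedelta


def exceeds_revert_threshold(revert_times, max_reverts=3, window=timedelta(hours=24)):
    """Return True if more than `max_reverts` reverts fall inside any `window`."""
    recent = deque()
    for ts in sorted(revert_times):
        recent.append(ts)
        # Drop reverts that are a full window or more older than the current one.
        while ts - recent[0] >= window:
            recent.popleft()
        if len(recent) > max_reverts:
            return True
    return False


start = datetime(2023, 8, 1, 0, 0)
# Four reverts within three hours: a back-and-forth revert war.
burst = [start + timedelta(hours=h) for h in range(4)]
# Four reverts spread over 30 hours: never more than three in any 24 hours.
spread = [start + timedelta(hours=10 * h) for h in range(4)]

print(exceeds_revert_threshold(burst), exceeds_revert_threshold(spread))
```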

Samwalton9-WMF mentioned this in T343953: What is the breakdown of reverts made by humans & bots by their ORES score?. Aug 10 2023, 9:01 AM

@Samwalton9

Here's the summary of further analysis.

Bot reverts split by user-type:

wiki_db	bot	user_type	all_reverts	bot_reverts_percent	reverted_bot_reverts_percent
bgwiki	PSS 9	anon	44	8.7	4.9
		registered	10	26.8	17.9
enwiki	ClueBot NG	anon	4069	6.2	8.5
		registered	1243	7.0	11.9
eswiki	SeroBOT	anon	1666	49.7	8.7
		registered	189	15.1	22.3
fawiki	Dexbot	anon	385	69.9	11.1
		registered	64	6.9	3.3
frwiki	Salebot	anon	477	4.0	9.8
		registered	129	4.5	14.8
rowiki	PatrocleBot	anon	44	17.4	8.6
		registered	12	17.8	9.1
ruwiki	Рейму Хакурей	anon	507	12.5	13.0
		registered	88	2.9	15.5
simplewiki	ChenzwBot	anon	65	24.2	11.3
		registered	15	17.2	17.1

For further split by edit bucket: https://github.com/wikimedia-research/moderator-tools-FY24/blob/main/%5BT341857%5D%20anti_vandal_bots_reverts/data_outputs/anti_vandal_bot_revert_aggregates.tsv

Although the data appears to show that in most cases the bot revert percentage is greater for registered users, exploring the data split by edit bucket (linked above) reveals that these reverts mostly affect users in the "0-99" edit bucket, indicating newcomers. A high revert rate is observed for experienced contributors as well (500+ edits), but these are outliers that occurred on a few days during the three-year period and should not be used to interpret general activity.

Percentage of reverted bot reverts that were reverted back by the same editor whose edit had initially been reverted

user_type	edit_bucket	bot_reverts_by_base_editor_percent
anonymous	n/a	45.87
registered	0-99	57.54
registered	100-499	80.0
registered	500+	71.72

Detailed breakdown: https://github.com/wikimedia-research/moderator-tools-FY24/blob/main/%5BT341857%5D%20anti_vandal_bots_reverts/data_outputs/reverted_bot_reverts_aggegrates.tsv

Experienced users are more likely to revert a bot revert of their own edit. Although it can't be said for sure with the data we currently have, this may also indicate that newcomers do not know about reverting back or reporting a false positive.
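The "reverted back by the same editor" measure could be computed along these lines. It is a minimal sketch over made-up records: the keys `original_editor` and `undone_by` are hypothetical and do not correspond to fields in the linked notebook.

```python
def base_editor_share(bot_reverts):
    """Of the bot reverts that were undone, return the percentage undone by
    the editor whose edit the bot had reverted in the first place.

    Each record is a dict with the hypothetical keys `original_editor` and
    `undone_by` (None when the bot revert was never reverted back).
    """
    undone = [r for r in bot_reverts if r["undone_by"] is not None]
    if not undone:
        return 0.0
    same = sum(1 for r in undone if r["undone_by"] == r["original_editor"])
    return 100.0 * same / len(undone)


records = [
    {"original_editor": "UserA", "undone_by": "UserA"},      # editor restores own edit
    {"original_editor": "UserB", "undone_by": "PatrollerX"},  # someone else steps in
    {"original_editor": "UserC", "undone_by": "UserC"},
    {"original_editor": "UserD", "undone_by": None},          # bot revert stands
]

same_editor_percent = base_editor_share(records)
print(f"{same_editor_percent:.2f}%")
```

Bot reverts that were never undone are excluded from the denominator, which matches the framing of the table as a share of reverted bot reverts.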

Notebook and aggregated data by data: https://github.com/wikimedia-research/moderator-tools-FY24/tree/main/%5BT341857%5D%20anti_vandal_bots_reverts

As we discussed, we can conclude this analysis here for now, and revisit it with follow-up questions once the measurement plan takes shape.

KCVelaga_WMF closed this task as Resolved. Aug 16 2023, 1:18 PM

KCVelaga_WMF moved this task from Doing to [Deprecated] Done (previously: Needs sign-off) on the Product-Analytics (Kanban) board.

This is great, thanks @KCVelaga_WMF!

Samwalton9-WMF mentioned this in T348869: How many edits would Automoderator revert per day at different caution levels?. Oct 13 2023, 2:33 PM
