Estimate impact of Automoderator on pageviews of vandalism
Closed, Resolved (Public)

Description

In T348861 we found the median pageview count for vandalism revisions. If we restricted this analysis to edits with a Revert Risk score above the thresholds we will use for Automoderator, we could estimate how many pageviews will see (assumed) good content instead of potentially vandalized content.

The thresholds we're planning to start with are 0.975, 0.98, 0.985, and 0.99. For each, on the same sample of wikis as in T348861, please calculate pageviews for revisions above these scores.

Event Timeline

This is currently blocked by the availability of revert risk scores for Feb 2024 and beyond (T341777#9679533).

KCVelaga_WMF changed the task status from Open to In Progress. Apr 10 2024, 9:12 AM
KCVelaga_WMF updated the task description.

Thank you! This is very interesting.

Could you add an extra column to the data, with the % of total pageviews to vandalised content which we would be preventing? i.e. for the enwiki @ 0.99 line, if I'm understanding correctly we have 38,527 pageviews. 92.7% of them are prevented, giving us 35,715 pageviews prevented. What is this as a % of the 2.37 million pageviews which we found were to potentially vandalised content on enwiki?

This would help contextualise the potential impact.
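
To make the arithmetic in the question above concrete, here is a minimal sketch using only the figures quoted in this comment. The function name `share_of_total` is hypothetical, and 2.37 million is treated as an exact 2,370,000 for illustration, so the resulting percentage is approximate.

```python
def share_of_total(pageviews_above_threshold, prevented_fraction, total_vandalism_pageviews):
    """Return (pageviews prevented, prevented pageviews as % of all vandalism pageviews)."""
    prevented = pageviews_above_threshold * prevented_fraction
    pct_of_total = 100 * prevented / total_vandalism_pageviews
    return prevented, pct_of_total


# Figures quoted above for enwiki at the 0.99 threshold.
# Assumption: "2.37 million" is taken as exactly 2,370,000.
prevented, pct = share_of_total(38_527, 0.927, 2_370_000)
print(round(prevented))  # 35715
print(round(pct, 2))     # 1.51
```

Under these assumptions the prevented pageviews come to roughly 1.5% of the pageviews to potentially vandalised content, provided the two counts are actually comparable.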

One followup question - why is it that sometimes a lower threshold has fewer edits? fr.wiki has 88 above 0.985 but 139 above 0.99 if I'm reading the tables correctly.

Could you add an extra column to the data, with the % of total pageviews to vandalised content which we would be preventing? i.e. for the enwiki @ 0.99 line, if I'm understanding correctly we have 38,527 pageviews. 92.7% of them are prevented, giving us 35,715 pageviews prevented. What is this as a % of the 2.37 million pageviews which we found were to potentially vandalised content on enwiki?

Sure, I can do that, but I am not sure that would be the right context. In the previous analysis, I used our operational definition to identify which edits are potential vandalism. With some variation by wiki, the average revert risk score for those revisions is 0.9. The way I am thinking about the impact is: given that the community configuration allows AM to revert, what proportion of pageviews could have been avoided (assuming AM reverts within a minute)? If the threshold doesn't allow AM to revert, those pageviews could never be prevented even if AM is on the wiki (which is mostly beyond our control). Alternatively, we can add lower thresholds (say 0.9 and 0.95) and have two percentage columns: percent of views beyond 60 sec, and percent of views with reference to potential vandalism (that is, risk > 0.9).
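
As a rough illustration of the "views beyond 60 sec" idea, the sketch below computes the share of pageviews to a vandalised revision that land after an assumed 60-second Automoderator delay. The function name, the per-view offsets, and the example data are all hypothetical; the notebook may compute this differently, for example from aggregated pageview counts.

```python
def preventable_share(view_offsets_sec, revert_offset_sec, am_delay_sec=60.0):
    """Share of views to a vandalised revision that occur after the assumed AM delay.

    view_offsets_sec: seconds between the edit being saved and each pageview.
    revert_offset_sec: seconds between the edit and the actual (human) revert.
    am_delay_sec: assumed time for Automoderator to revert (60 s, as in the comment above).
    """
    # Only views that happened while the vandalised revision was live count as exposed.
    exposed = [t for t in view_offsets_sec if 0 <= t < revert_offset_sec]
    if not exposed:
        return 0.0
    # Views at or after the AM delay would have seen the reverted content instead.
    preventable = [t for t in exposed if t >= am_delay_sec]
    return len(preventable) / len(exposed)


# Hypothetical example: five views, human revert after one hour.
example_share = preventable_share([5, 30, 90, 600, 3000], revert_offset_sec=3600)
print(example_share)  # 0.6
```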

One followup question - why is it that sometimes a lower threshold has fewer edits? fr.wiki has 88 above 0.985 but 139 above 0.99 if I'm reading the tables correctly.

The counts are mutually exclusive. For 0.975, it is edits with score >= 0.975 and < 0.98. I realize that can be confusing; I will change it to count all edits above a given threshold.
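
A minimal sketch of the difference between the two presentations, using only the two fr.wiki counts mentioned in this thread. The function name `to_cumulative` is hypothetical, and this is not the notebook's actual code.

```python
def to_cumulative(bin_counts):
    """Convert mutually exclusive per-bin counts into 'at or above threshold' counts.

    bin_counts maps each threshold to the number of edits whose score falls
    between that threshold and the next higher one (the top bin is open-ended).
    """
    cumulative = {}
    running = 0
    # Walk from the highest threshold down, accumulating the counts.
    for threshold in sorted(bin_counts, reverse=True):
        running += bin_counts[threshold]
        cumulative[threshold] = running
    return dict(sorted(cumulative.items()))


# fr.wiki counts quoted in the thread: 88 in [0.985, 0.99) and 139 at >= 0.99.
frwiki_bins = {0.985: 88, 0.99: 139}
frwiki_cumulative = to_cumulative(frwiki_bins)
print(frwiki_cumulative)  # {0.985: 227, 0.99: 139}
```

With cumulative counts, a lower threshold can never show fewer edits than a higher one.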

Oh, you're right, I forgot that they weren't comparable. Never mind!

That would be helpful, thank you!

@Samwalton9-WMF I updated the notebook so that the count for each threshold is cumulative, i.e. it includes all edits scoring above that threshold.

I think we understand everything we need to about this - this analysis directly lets us know what the approximate impact of Automoderator will be once deployed!

Re-opening: I just recalled that we discussed repeating this for a few different weeks, to be more certain that the data for this particular week isn't an anomaly!