
Analyse the reverts on trwiki that were not reverted by Automoderator
Closed, ResolvedPublic

Description

As identified in the Automoderator pilot metrics (T362610), Automoderator (AM) handles less than 1% of the total revert workload on Turkish Wikipedia (weekly average since deployment). The team is planning to increase the coverage.

To that end, it would be helpful to understand more about the reverts not handled by AM while it is enabled. Answers to the following questions would help the team make a decision:

  • Are there any edits not reverted by AM above the currently set 0.99 threshold? If yes, what are they?
  • What is the average risk score of edits reverted by users but not Automoderator?
  • Count of reverts made by users rather than AM at various risk thresholds (0.985, 0.98, 0.975, 0.97, 0.95, 0.9, 0.85, 0.8, 0.75)
    • What proportion of those reverts were reverted back? (potential false positives)
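A minimal sketch of how the threshold counts asked for above could be computed, assuming each user revert is available as a record holding the risk score of the reverted edit and a flag for whether the revert was itself reverted back. The field names (`risk_score`, `revert_was_reverted`) and the sample rows are hypothetical; they are not the actual schema or data.

```python
from dataclasses import dataclass

# Thresholds listed in the task description.
THRESHOLDS = [0.985, 0.98, 0.975, 0.97, 0.95, 0.9, 0.85, 0.8, 0.75]


@dataclass
class UserRevert:
    """One revert made by a user (not Automoderator). Hypothetical schema."""
    risk_score: float          # revert risk score of the edit that was reverted
    revert_was_reverted: bool  # True if the revert itself was later undone


def summarize_by_threshold(reverts, thresholds=THRESHOLDS):
    """For each threshold, count reverts whose reverted edit scored above it,
    and the share of those reverts that were reverted back."""
    summary = {}
    for t in thresholds:
        above = [r for r in reverts if r.risk_score > t]
        reverted_back = sum(r.revert_was_reverted for r in above)
        share = reverted_back / len(above) if above else None
        summary[t] = {"count": len(above), "reverted_back_share": share}
    return summary


# Made-up sample rows, for illustration only.
sample = [
    UserRevert(0.991, False),
    UserRevert(0.972, True),
    UserRevert(0.88, False),
    UserRevert(0.60, False),
]
result = summarize_by_threshold(sample)
```

The counts are cumulative (everything above each threshold), which matches the question of how much extra workload AM would take on if the threshold were lowered to that value.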

Event Timeline

KCVelaga_WMF moved this task from Triage to Current Quarter on the Product-Analytics board.
KCVelaga_WMF added a subscriber: DMburugu.

@jsn.sherman @DMburugu @SonjaPerry @Scardenasmolinar We initially discussed doing this after the June snapshot of revert risk scores became available, which it now has. As we are already in the third week of July, I think it is better to wait until the end of the month and base the analysis on the whole month of July. Automoderator was enabled on trwiki on June 26, so for June we only have about 3-4 days of data to work with, which may not give us the full picture we need. Let me know what you all think.

I think that sounds good. Note that Automoderator was disabled for 3 days on trwiki (July 16-19) while we resolved T370161: "Call to a member function equals() on null" when rolling back a change with a suppressed username.

It would also be helpful to have a dataset of some (at least 100) edits to investigate manually - sometimes it can just be insightful to see these data points individually.

@KCVelaga_WMF Can you give an ETA on when this can be done?

@jsn.sherman I am planning to do this as a priority during the week of Aug 19. I will be on leave from 8 to 16 Aug.

KCVelaga_WMF changed the task status from Open to In Progress. (Aug 22 2024, 7:46 AM)
KCVelaga_WMF moved this task from Next 2 weeks to Doing on the Product-Analytics (Kanban) board.

An update: I started working on this, but identified a potential data issue. About 3000 reverts on Turkish Wikipedia don't have an associated revert risk score in risk_observatory.revert_risk_predictions. I have flagged it to the engineers on the Research team.

During July 2024, on Turkish Wikipedia, for potential vandalism:

  • Automoderator made 39 reverts, with an average revert risk score of 0.994.
  • Total reverts by users (both registered and anonymous) were 4,874.
  • There were 81 edits with a revert risk score greater than 0.99 that were reverted by users rather than by Automoderator.
    • 91% of those reverts were made by registered users, the rest by anonymous users.
    • Please refer to this spreadsheet for the data related to those reverts (private; internal to WMF only).
  • For reverts made by users, the average revert risk score of the reverted edits is 0.92.
  • For users, the potential false positive rate of the reverts is 10%, while Automoderator's false positive rate is around 7%.

Please refer to the notebook for a breakdown by user type, and for the reverts by revert risk bin.
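As a quick cross-check, the coverage implied by the July figures above can be worked out directly. This is only a sketch of the arithmetic: it assumes the total revert workload is the sum of Automoderator's reverts and user reverts, and that all 39 Automoderator reverts scored above the 0.99 threshold. The variable names are illustrative.

```python
# Figures reported above for July 2024 on trwiki.
automoderator_reverts = 39
user_reverts = 4874
missed_above_threshold = 81  # edits scoring > 0.99 that users reverted instead

# Assumption: total revert workload = Automoderator reverts + user reverts.
total_reverts = automoderator_reverts + user_reverts
coverage = automoderator_reverts / total_reverts

# Share of reverted edits scoring > 0.99 that Automoderator handled.
# Assumption: every Automoderator revert scored above the 0.99 threshold.
handled_share = automoderator_reverts / (
    automoderator_reverts + missed_above_threshold
)

print(f"Automoderator share of all reverts: {coverage:.2%}")
print(f"Automoderator share of >0.99 reverts: {handled_share:.1%}")
```

Under those assumptions Automoderator handled roughly 0.8% of all reverts in July, in line with the "less than 1%" figure in the description, and about a third of the reverted edits that scored above 0.99.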

Let me know if there are any follow-up questions.

We started talking a little about this on Slack, but I definitely think it's curious that about twice as many >0.99 edits were reverted by users (81) as were reverted by Automoderator (39). From a spot check, some of these are understandable, but for others there's no discernible reason for us to have missed reverting the edit:

  • 16 edits (20%) were made between the 16th and 19th July, when Automoderator was temporarily disabled.
  • Of the first 21 edits (I didn't have time to check all), 3 (14%) were reverts of Automoderator, 9 (43%, though note that this was a block of activity on the same page, so perhaps not representative) were self-reverts, and 1 was reverted within 1 minute, which has the potential for an edit conflict with Automoderator.

For others I can't initially see a reason for us not to have reverted, for example: 1, 2, 3.

This Logstash report shows us where Automoderator errored during operation. I don't see these three edits in there.
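To extend the spot check to all 81 edits, the categories used above could be applied programmatically. The following is a rough sketch only: the dict keys (`edit_time`, `revert_time`, `editor`, `reverter`, `reverts_automoderator`) are hypothetical, and the disabled window is assumed to cover whole days since exact times are not stated here.

```python
from datetime import datetime, timedelta

# Window in which Automoderator was disabled on trwiki (July 16-19, 2024).
# Exact start and end times are not given, so whole days are assumed.
DISABLED_START = datetime(2024, 7, 16)
DISABLED_END = datetime(2024, 7, 20)  # exclusive upper bound


def explain_miss(edit):
    """Return a likely reason Automoderator did not revert `edit`, or None.

    `edit` is a dict with hypothetical keys: edit_time, revert_time,
    editor, reverter, reverts_automoderator.
    """
    if DISABLED_START <= edit["edit_time"] < DISABLED_END:
        return "automoderator_disabled"
    if edit["reverts_automoderator"]:
        return "revert_of_automoderator"
    if edit["editor"] == edit["reverter"]:
        return "self_revert"
    if edit["revert_time"] - edit["edit_time"] <= timedelta(minutes=1):
        return "reverted_within_one_minute"
    return None  # no obvious explanation; worth a manual look


# Made-up example row, for illustration only.
example = {
    "edit_time": datetime(2024, 7, 5, 12, 0, 0),
    "revert_time": datetime(2024, 7, 5, 12, 0, 40),
    "editor": "UserA",
    "reverter": "UserB",
    "reverts_automoderator": False,
}
reason = explain_miss(example)
```

Edits for which this returns `None` would be the ones to compare against the Logstash report.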