What is the distribution of revert risk scores for extended confirmed users?
Closed, Resolved, Public

Description

One of the features of an edit we could filter on is the editor's user groups - we already know for certain that edits by administrators should never be reverted, but it's less clear-cut whether edits by users with other user rights should be. One complication is that user groups differ between wikis - it would be hard for us to hard-code them all into Automoderator and maintain them over time. We probably want to let each community configure this in detail.

One user right which is available on many wikis is extendedconfirmed. The right is granted automatically, typically after 500 edits and 30-90 days of activity depending on the wiki. We could imagine filtering out extendedconfirmed users by default too, but we need to check the data to confirm whether this would improve Automoderator's accuracy.

Questions
For enwiki, fawiki, jawiki, and zhwiki, while ignoring the edits we already want to skip*:

  • What is the distribution of Revert Risk scores for all users and extended confirmed users?
  • How many edits by extended confirmed users have a Revert Risk score greater than 0.97?

Please also produce a data file containing the edits with scores greater than 0.97 so that we can spot check the accuracy of this score. This can be randomly sampled down to ~1000 edits per wiki if there are more edits than this.

*Edits made by administrators; Edits made by bots; Edits which are self-reverts; New page creations
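
The filtering and sampling requested above could be sketched roughly as follows. This is only an illustration, not the query used in the analysis: the record fields (`wiki_db`, `user_groups`, `is_self_revert`, `is_page_creation`, `revert_risk`) and both function names are assumptions made for the example.

```python
import random
from collections import defaultdict

THRESHOLD = 0.97      # score cut-off from the task description
SAMPLE_SIZE = 1000    # approximate per-wiki cap from the task description


def is_skipped(edit):
    """Return True for edits the task says to ignore (hypothetical fields)."""
    groups = edit["user_groups"]
    return (
        "sysop" in groups             # administrators
        or "bot" in groups            # bots
        or edit["is_self_revert"]     # self-reverts
        or edit["is_page_creation"]   # new page creations
    )


def high_risk_sample(edits, threshold=THRESHOLD, k=SAMPLE_SIZE, seed=0):
    """Group non-skipped edits scoring above `threshold` by wiki,
    randomly sampling down to at most `k` edits per wiki."""
    by_wiki = defaultdict(list)
    for edit in edits:
        if not is_skipped(edit) and edit["revert_risk"] > threshold:
            by_wiki[edit["wiki_db"]].append(edit)
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    return {
        wiki: rows if len(rows) <= k else rng.sample(rows, k)
        for wiki, rows in by_wiki.items()
    }
```

Restricting the input to edits whose `user_groups` include `extendedconfirmed` before calling `high_risk_sample` would give the data file for that group alone.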

Event Timeline

Extended confirmed is actually only used on ~15 wikis, so this may not be a high priority role for us to investigate.

KCVelaga_WMF changed the task status from Open to In Progress. Jan 8 2024, 11:38 AM
KCVelaga_WMF claimed this task.

@Samwalton9-WMF

  • Overall, very few edits by extendedconfirmed users had a revert risk score of greater than 0.97.
    • ~1800 edits in the year 2022 across enwiki, fawiki, jawiki, zhwiki.
  • The median revert risk score for extendedconfirmed users was ~0.2 whereas for other users it was ~0.75.
  • Approximately 60% of the edits by extendedconfirmed users with a revert risk greater than 0.97 were reverted.
  • Given how few edits by extendedconfirmed users have a risk greater than 0.97, whether or not we exclude them (as a global setting) will most likely have no significant impact on the accuracy of Automoderator.

  • The notebook tables with a breakdown of the distribution for various user groups.
  • The data file of edits with scores greater than 0.97.

I've spot-checked 30 edits from this data file and while some were reverted, I don't think any of them were so unambiguous that they should have been automatically reverted. Extended confirmed seems like a good user group to exclude, even if it won't have a huge impact on the number of reverted edits. It would avoid some potentially high profile false positives. If we had a configuration option to avoid certain user groups, perhaps we could have EC there as a default (but removable) option.
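
A default-but-removable user group exclusion like the one suggested above might look something like the sketch below. Everything here is hypothetical: `ModerationConfig`, `skip_user_groups`, `ALWAYS_SKIPPED`, and `is_candidate` are names invented for illustration and do not describe Automoderator's actual configuration.

```python
from dataclasses import dataclass, field

# Groups that are never candidates for automatic reverts (assumption based on
# the task description: administrators and bots are always skipped).
ALWAYS_SKIPPED = frozenset({"sysop", "bot"})


@dataclass
class ModerationConfig:
    """Hypothetical per-wiki configuration."""
    threshold: float = 0.97
    # extendedconfirmed is present by default, but a community can remove it.
    skip_user_groups: set = field(default_factory=lambda: {"extendedconfirmed"})


def is_candidate(user_groups, revert_risk, config):
    """Return True if an edit should be considered for an automatic revert."""
    groups = set(user_groups)
    if groups & ALWAYS_SKIPPED or groups & config.skip_user_groups:
        return False
    return revert_risk > config.threshold
```

With the defaults, an edit by an extendedconfirmed user is never a candidate; a wiki that calls `config.skip_user_groups.discard("extendedconfirmed")` opts back in, while administrators and bots stay excluded regardless.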

A quick question, before I share this elsewhere - you wrote that 1800 edits were >0.97 in 2022. Was that from all edits made on those wikis, or a subset? What is it as a percentage?

@Samwalton9-WMF Great, that sounds reasonable.

  • 1800 edits made by extendedconfirmed users were >0.97 in 2022. Do you mean as a percentage of:
    • all edits with a risk greater than 0.97 (irrespective of user right),
    • all edits made by extendedconfirmed users (irrespective of the risk),
    • or simply all edits?

All edits (irrespective of user right) with a risk greater than 0.97 :)

@Samwalton9-WMF

| wiki_db | All edits (risk > 0.97) | Percentage of edits with risk > 0.97 by extended confirmed users |
| jawiki  | 28866                   | 0.51% (148)  |
| zhwiki  | 19431                   | 0.70% (136)  |
| enwiki  | 550764                  | 0.26% (1455) |
| fawiki  | 33637                   | 0.23% (78)   |

In total, of 632698 edits with risk greater than 0.97, 0.29% (1817 edits) were made by extendedconfirmed users.
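
As a quick check, the percentages above can be recomputed from the reported counts. The snippet below is just arithmetic on the numbers in this comment; the variable and function names are illustrative.

```python
# wiki_db -> (all edits with risk > 0.97, of which by extendedconfirmed users)
counts = {
    "jawiki": (28866, 148),
    "zhwiki": (19431, 136),
    "enwiki": (550764, 1455),
    "fawiki": (33637, 78),
}


def share(part, whole):
    """Percentage of `whole` represented by `part`, rounded to 2 decimals."""
    return round(100 * part / whole, 2)


per_wiki = {wiki: share(ec, total) for wiki, (total, ec) in counts.items()}
total_all = sum(total for total, _ in counts.values())
total_ec = sum(ec for _, ec in counts.values())
overall = share(total_ec, total_all)

print(per_wiki)                      # per-wiki percentages
print(total_all, total_ec, overall)  # 632698 1817 0.29
```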

Edits made by administrators, edits made by bots, self-reverts, and new page creations were excluded from all counts.