What percentage of new articles are created by auto-patrolled users?
Closed, Resolved · Public

Description

Let's re-run Tilman's query (T149021#3287887), but instead of looking for auto-confirmed users, look for auto-patrolled users. There isn't an easy way to actually determine if a user was auto-patrolled at the time of article creation, so we'll have to use current user rights as a proxy indicator. (This should be a much simpler query than the one for auto-confirmed.)

This data, combined with the existing data, should let us answer (with some caveats) the question of how many new articles go into the reviewing backlog per day, on average.

Event Timeline

kaldari created this task. May 24 2017, 11:56 PM
Restricted Application added a subscriber: Aklapper. May 24 2017, 11:56 PM
kaldari updated the task description. May 24 2017, 11:57 PM

There isn't an easy way to actually determine if a user was auto-patrolled at the time of article creation,

FYI, the MediaWiki user history table in the Data Lake now has information about historical user groups, although I'm not sure whether user group additions and removals get their own rows.

@kaldari, I imagine you're looking for @Nettrom to do this rather than me?

@Neil_P._Quinn_WMF : I actually ran a query to get similar data on Friday, because I've been using it to figure out how long it takes for articles to get reviewed. My current best version of the query is in our GitHub repository: non_autopatrolled_creations.hql. It looks for non-autopatrolled creations, but it's trivial to calculate the opposite proportion as I also have data on all article creations.

It uses the historic user group data, and from what I could find out (e.g. from Wikipedia:Autopatrolled) there are three groups of users that are auto-patrolled: "bot", "sysop" and "autoreviewer". Not 100% sure there haven't previously been other groups, but I couldn't find more about that. The query runs fairly quickly, so I should be able to get updated data fairly easily if I missed something.
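For readers without access to the HQL file, the calculation described above can be sketched in Python. This is only an illustration, not the query in the repository: the `Creation` record and its `creator_groups` field are hypothetical names, and the sketch assumes the creator's groups at creation time have already been extracted from the historic user group data.

```python
from dataclasses import dataclass

# Groups treated as auto-patrolled in the discussion above.
AUTOPATROLLED_GROUPS = frozenset({"bot", "sysop", "autoreviewer"})


@dataclass(frozen=True)
class Creation:
    """One article creation (hypothetical record layout)."""
    page_id: int
    creator_groups: frozenset  # groups held by the creator at creation time


def is_autopatrolled(creation: Creation) -> bool:
    """True if the creator held at least one auto-patrolled group."""
    return bool(creation.creator_groups & AUTOPATROLLED_GROUPS)


def autopatrolled_share(creations) -> float:
    """Share of creations by auto-patrolled users.

    Counts the non-autopatrolled creations first and takes the
    complement against all creations, mirroring the approach above.
    """
    creations = list(creations)
    if not creations:
        raise ValueError("no creations to summarise")
    non_autopatrolled = sum(1 for c in creations if not is_autopatrolled(c))
    return 1 - non_autopatrolled / len(creations)
```

If another group turns out to have carried the right in the past, only `AUTOPATROLLED_GROUPS` would need to change.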

@kaldari : If there aren't things I've missed in my data gathering, should I make a plot of this proportion?

there are three groups of users that are auto-patrolled: "bot", "sysop" and "autoreviewer".

@Nettrom: I think you mean "bot", "sysop" and "autopatrolled".

If there aren't things I've missed in my data gathering, should I make a plot of this proportion?

That would be awesome! Just make sure it includes deleted articles as well.

@kaldari : No, I really mean "autoreviewer", ref en:Special:ListGroupRights. I haven't been able to find any documentation that defines the user group in the system as "autopatrolled". And yes, I find that confusing.

Both my datasets include deleted articles. Will get you a graph ASAP.

Ah, looks like "autoreviewer" is the name of the right in the software and "autopatrolled" is the label in the UI. That's confusing.

There's a user group called "autoreviewer" that specifically gets the "autopatrol" user right. That right is also applied to bots and admins. Or at least that's how I read en:Special:ListGroupRights. The help page mentions that it used to be called "autoreviewer", so I guess they just never renamed the user group.

I uploaded a graph of the proportion of autopatrolled article creations to Commons. Let me know if it needs any changes.

@Neil_P._Quinn_WMF : I actually ran a query to get similar data on Friday, because I've been using it to figure out how long it takes for articles to get reviewed. My current best version of the query is in our GitHub repository: non_autopatrolled_creations.hql. It looks for non-autopatrolled creations, but it's trivial to calculate the opposite proportion as I also have data on all article creations.

Nice, looks like you're way ahead of me! In that case, I'll take my project off :)

Which Phabricator project should this task be associated with, so others can actually find this task when looking at the project's workboard?

Hmm, Community Tech, I guess.

kaldari closed this task as Resolved. Oct 3 2017, 11:47 PM

It looks like the answer (prior to ACTRIAL) is that ~36% of new articles were created by auto-patrolled users.
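As a rough illustration of how this figure feeds the backlog question in the task description, the arithmetic can be sketched as below. The function name and the daily creation count are placeholders for illustration only; no daily count is given in this task, and the sketch assumes every creation by a non-autopatrolled user enters the review queue.

```python
def backlog_inflow_per_day(creations_per_day: float,
                           share_autopatrolled: float) -> float:
    """Estimate how many new articles enter the review backlog per day.

    Assumes each creation by a non-autopatrolled user needs review and
    each creation by an auto-patrolled user does not.
    """
    if not 0.0 <= share_autopatrolled <= 1.0:
        raise ValueError("share_autopatrolled must be between 0 and 1")
    if creations_per_day < 0:
        raise ValueError("creations_per_day must be non-negative")
    return creations_per_day * (1.0 - share_autopatrolled)


# Placeholder input: 1000 creations per day is NOT a figure from this task.
estimate = backlog_inflow_per_day(1000, 0.36)
```

With the ~36% share, roughly 64% of whatever the real daily creation count is would land in the backlog.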

kaldari moved this task from Untriaged to Archive on the Community-Tech board. Oct 3 2017, 11:48 PM