
Investigate some historical trend on Persian Wikipedia
Open, Medium, Public



@Huji raised a few questions in T292781 that are worth investigating. They are:

  1. Total edits

The total edit count line for 2019 (green dotted line) looks very jumpy. I wonder if this is because it includes edits by an account that was formerly a bot but is currently not one. Can you check whether those spikes in Sep-Dec 2019 are mainly from one account?

  2. Blocks

There is a noticeable decrease in the number of blocks.
The number of blocks from 2019 is causing the Y axis to go up to 12K. I am guessing we imported a lot of IP blocks in April 2019 and again in July 2019. This might even be my own bot (HujiBot). Can we exclude bot-issued edits from this metric altogether? Both of our bot admins are still bots and admins (HujiBot and Dexbot).

  3. Number of pages protected

For the number of pages protected metric, it is the 2021 pre-IP-ban data that causes a problem with the Y axis. Again, I am guessing this was a bot-related thing (without having checked, I would guess some admin bot went ahead and protected a lot of highly used templates). Can we exclude bot admins from this data?
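
One way the bot exclusion asked about for the blocks and page-protection metrics could work is sketched below. This is only an illustration, not the dashboard's actual code: the row layout `(month, performer, action)`, the account names, and the `BOT_ACCOUNTS` set are all hypothetical, and in practice the set of bot accounts would have to come from user-group membership.

```python
from collections import Counter

# Hypothetical log rows: (month, performer, action). Not real Persian Wikipedia data.
LOG_ROWS = [
    ("2019-04", "ExampleAdminBot", "block"),
    ("2019-04", "ExampleAdminBot", "block"),
    ("2019-04", "ExampleAdminBot", "block"),
    ("2019-04", "HumanAdminA", "block"),
    ("2019-05", "HumanAdminA", "block"),
    ("2019-05", "HumanAdminB", "protect"),
]

# Assumption: accounts to treat as bots, e.g. derived from bot user-group membership.
BOT_ACCOUNTS = frozenset({"ExampleAdminBot"})


def monthly_counts(rows, action, exclude=frozenset()):
    """Count log rows of one action type per month, skipping excluded performers."""
    counts = Counter()
    for month, performer, row_action in rows:
        if row_action == action and performer not in exclude:
            counts[month] += 1
    return dict(counts)


blocks_all = monthly_counts(LOG_ROWS, "block")
blocks_no_bots = monthly_counts(LOG_ROWS, "block", exclude=BOT_ACCOUNTS)
```

Comparing `blocks_all` with `blocks_no_bots` month by month would show how much of a spike is bot-issued before deciding whether to change the metric definition.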

To do

Investigate each question and document the findings in this ticket. If needed, update the metric definition in the weekly metrics dashboard.

Event Timeline

Huji renamed this task from Investigate some historical trend on Farsi Wikipedia to Investigate some historical trend on Persian Wikipedia. Dec 14 2021, 1:24 AM
  1. Number of pages protected

For the number of pages protected metric, it is the 2021 pre-IP-ban data that causes a problem with the Y axis. Again, I am guessing this was a bot-related thing (without having checked, I would guess some admin bot went ahead and protected a lot of highly used templates). Can we exclude bot admins from this data?

Investigation note:
The bump in October 2021 comes mainly from one user, who protected 5867 pages in that month. The user is in the sysop user group, not in the bot or botadmin group. Because our dashboard and code are public, I prefer not to exclude this user in the analysis code, so that no individual's information is disclosed. I suggest benchmarking against September 2021 data for month-over-month or week-over-week comparisons. Meanwhile, we can benchmark against November 2020 / 2019 for YoY comparisons.
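
For reference, a check like the one described in this note can be done without naming anyone, by reporting only the share of a month's actions that the single most active account performed. The sketch below is illustrative only: the `(month, performer)` row layout and the sample rows are hypothetical, not the query actually used for this investigation.

```python
from collections import Counter, defaultdict


def top_performer_share(rows):
    """For each month, return (total actions, top account's count, top account's share).

    Only aggregate numbers are returned, so no account name is disclosed.
    """
    per_month = defaultdict(Counter)
    for month, performer in rows:
        per_month[month][performer] += 1

    result = {}
    for month, counter in per_month.items():
        total = sum(counter.values())
        top_count = counter.most_common(1)[0][1]
        result[month] = (total, top_count, top_count / total)
    return result


# Hypothetical sample: one account dominates the second month.
SAMPLE_ROWS = (
    [("2021-09", "user_a"), ("2021-09", "user_b"), ("2021-09", "user_c")]
    + [("2021-10", "user_a")] * 8
    + [("2021-10", "user_b")] * 2
)

shares = top_performer_share(SAMPLE_ROWS)
```

A month where the share is close to 1 is dominated by one account. The same check would apply to the Sep-Dec 2019 edit count spikes raised in the first question.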

jwang triaged this task as Medium priority. Dec 14 2021, 6:12 PM
jwang moved this task from Triage to Upcoming Quarter on the Product-Analytics board.
Aklapper subscribed.

@jwang: Removing task assignee as this open task has been assigned for more than two years - see the email sent to all task assignees on 2024-04-15.
Please assign this task to yourself again if you still realistically [plan to] work on this task - it would be welcome! :)
If this task has been resolved in the meantime, or should not be worked on by anybody ("declined"), please update its task status via "Add Action… 🡒 Change Status".
Also see for tips how to best manage your individual work in Phabricator. Thanks!