๐Ÿ‘ฉโ€๐Ÿ‘งโ€๐Ÿ‘ฆ Measure the effectiveness of blocks
Closed, Resolved, Public

Description

This is a parent task for the Anti-Harassment Tools team (including @nettrom_WMF) to measure the effectiveness of sitewide and partial blocks at stopping harm to Wikimedia wikis.

More information can be found at https://meta.wikimedia.org/wiki/Community_health_initiative/Measuring_the_effectiveness_of_blocks


Data points to gather

Number of users who

  • receive a sitewide block
  • receive a sitewide, non-indefinite block
  • have a sitewide block which expires
  • have a sitewide block which expires, then do not receive another sitewide block
  • have a sitewide block which expires, then do not receive another sitewide block, and make 1+ edits
  • receive a partial block
  • have a partial block and make 1+ edits
  • have a partial block and do not subsequently receive a sitewide block, have pages added to their partial block, or have their expiration date extended

Number of pages which

  • are protected OR have their protection level escalated

This data should be available in the Data Lake on a monthly basis in the logging tables. If needed, we'll also have the ipblocks_restrictions table (T209549: Add ipblocks_restrictions table to Data Lake).
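
As a rough illustration of how the sitewide-block data points above nest into a funnel, here is a minimal Python sketch. The `Block` record, the `sitewide_funnel` function, and the sample data are all hypothetical; they do not reflect the Data Lake schema or the analysis notebooks. The sketch also assumes that blocks are never lifted early and that any later sitewide block for the same user counts as a re-block.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, Iterable, List, Optional


@dataclass(frozen=True)
class Block:
    """Hypothetical, simplified block record (not the Data Lake schema)."""
    user: str
    start: datetime
    expiry: Optional[datetime]  # None means an indefinite block
    sitewide: bool = True       # False means a partial block


def sitewide_funnel(
    blocks: Iterable[Block],
    edits: Dict[str, List[datetime]],
    as_of: datetime,
) -> Dict[str, int]:
    """Count users at each step of the sitewide-block list above."""
    sitewide = [b for b in blocks if b.sitewide]
    received = {b.user for b in sitewide}
    non_indefinite = {b.user for b in sitewide if b.expiry is not None}
    # A block counts as expired once its expiry is at or before `as_of`.
    expired_blocks = [
        b for b in sitewide if b.expiry is not None and b.expiry <= as_of
    ]
    expired = {b.user for b in expired_blocks}

    not_reblocked = set()
    edited_after = set()
    for b in expired_blocks:
        # Assumption: any later sitewide block for the same user is a re-block.
        if any(o.user == b.user and o.start > b.start for o in sitewide):
            continue
        not_reblocked.add(b.user)
        # Only edits made at or after the expiry count as a return to editing.
        if any(t >= b.expiry for t in edits.get(b.user, [])):
            edited_after.add(b.user)

    return {
        "received": len(received),
        "non_indefinite": len(non_indefinite),
        "expired": len(expired),
        "expired_not_reblocked": len(not_reblocked),
        "expired_not_reblocked_edited": len(edited_after),
    }


# Made-up sample data, for illustration only.
sample_blocks = [
    Block("alice", datetime(2019, 1, 1), datetime(2019, 1, 8)),
    Block("bob", datetime(2019, 1, 1), datetime(2019, 1, 2)),
    Block("bob", datetime(2019, 1, 5), None),    # re-blocked indefinitely
    Block("carol", datetime(2019, 1, 3), None),  # indefinite from the start
    Block("dave", datetime(2019, 1, 1), datetime(2019, 1, 4)),
    Block("erin", datetime(2019, 1, 1), datetime(2019, 1, 3), sitewide=False),
]
sample_edits = {
    "alice": [datetime(2019, 1, 10)],
    "bob": [datetime(2019, 1, 3)],
}
counts = sitewide_funnel(sample_blocks, sample_edits, as_of=datetime(2019, 2, 1))
print(counts)
```

Each step is a subset of the previous one, so the counts can be read as a funnel. The partial-block data points could be sketched the same way by filtering on `sitewide=False`.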

Event Timeline

TBolliger triaged this task as Medium priority. Nov 13 2018, 6:39 PM
TBolliger created this task.
Restricted Application added subscribers: MGChecker, Aklapper. Nov 13 2018, 6:39 PM
TBolliger moved this task from Untriaged to Snackbox on the Anti-Harassment board. Nov 13 2018, 6:39 PM
TBolliger moved this task from Backlog to User blocking on the MediaWiki-User-management board.
TBolliger updated the task description. Nov 13 2018, 9:38 PM
TBolliger updated the task description. Nov 15 2018, 10:39 PM

I've created a GitHub repo where I'll put notebooks and graphs for analysis: https://github.com/nettrom/AHT-block-effectiveness-2018

aezell added a subscriber: aezell. Dec 14 2018, 6:34 PM

@nettrom_WMF Are you using IPython alone or within Jupyter?

@aezell I'm using it within JupyterLab (on SWAP).

nettrom_WMF moved this task from Triage to Doing on the Product-Analytics board. Dec 20 2018, 6:35 PM
TBolliger moved this task from Snackbox to Backlog on the Anti-Harassment board. Jan 30 2019, 11:05 PM

@Niharika : I picked this up again last week. At this point, I'd like to wait until partial block data is in the Data Lake to continue the work, because then I'll get block duration and edit revert detection for free rather than handle those myself. It would also be great to have IP blocks in the Data Lake, because so far a lot of the partial blocks are of IPs.

In other words, I'd prefer to wait until T211950 and T211627 are completed. Let me know if that's a problem.

That sounds good to me, @nettrom_WMF. Do you know who's responsible for those tasks and what the ETA on those being completed might look like?

@Niharika : I'm not sure who on the Analytics Engineering team is responsible, and I noticed that neither of the tasks is assigned to anyone. My current understanding is that these changes are likely to arrive with the next snapshot, which should be available in a few days, or the one after that (in early May).

Okay. 👍

@Niharika can we call this particular task resolved, since you've reported on results at Wikimania & in the Year in Review?

Niharika closed this task as Resolved. Sep 17 2019, 11:46 PM

Yes, thank you!