
Epic ⚡️ : Parse ANI for data
Closed, Resolved · Public


Looking at the months of April and May 2017 for ANI archives, look at

The primary page that people edit on is — This is where you will find contribution history, if needed.

After 72 hours of inactivity, threads are automatically archived by a bot. The archives for April and May 2017 span archives 950 through 956. Much of what we want can probably be grepped from the wikitext.

The data that we are looking for is:

General stats for April 1 to May 31 2017

  • Number of cases started, determined by the number of h2 sections whose initial signature is dated between April 1 and May 31, 2017, inclusive.
  • Number of unique users (as determined by signatures?)
  • Number of unique admins (as determined by signatures, cross-referenced with the admin table?)
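
A minimal sketch of how these general stats might be pulled from raw wikitext. It assumes signatures follow the default form of a `[[User:...]]` link followed on the same line by a `HH:MM, D Month YYYY (UTC)` timestamp. Custom signatures and lines carrying several signatures will be missed or misattributed. The function names, the `SAMPLE` text, and the `admins` set are hypothetical.

```python
import re
from datetime import datetime

# Level-2 heading: exactly two '=' on each side (so '=== Sub ===' is skipped).
H2_RE = re.compile(r"^==(?!=)\s*(.*?)\s*==\s*$", re.MULTILINE)
# Assumed default signature pieces: a user link, then a UTC timestamp.
USER_RE = re.compile(r"\[\[User(?: talk)?:([^|\]#/]+)", re.IGNORECASE)
TS_RE = re.compile(r"(\d{2}:\d{2}, \d{1,2} [A-Z][a-z]+ \d{4}) \(UTC\)")

START, END = datetime(2017, 4, 1), datetime(2017, 6, 1)  # END is exclusive


def split_cases(wikitext):
    """Return (title, body) pairs, one per level-2 section."""
    heads = list(H2_RE.finditer(wikitext))
    cases = []
    for i, m in enumerate(heads):
        end = heads[i + 1].start() if i + 1 < len(heads) else len(wikitext)
        cases.append((m.group(1), wikitext[m.end():end]))
    return cases


def signatures(body):
    """Return (username, datetime) for each signed line, in page order."""
    sigs = []
    for line in body.splitlines():
        ts = TS_RE.search(line)
        if not ts:
            continue
        users = USER_RE.findall(line[:ts.start()])
        if users:  # take the user link closest to the timestamp
            name = users[-1].strip().replace("_", " ")
            sigs.append((name, datetime.strptime(ts.group(1), "%H:%M, %d %B %Y")))
    return sigs


def general_stats(wikitext, admins=frozenset()):
    """Count cases whose first signature falls in the window, plus unique signers."""
    started, users = 0, set()
    for _title, body in split_cases(wikitext):
        sigs = signatures(body)
        if sigs and START <= sigs[0][1] < END:
            started += 1
            users.update(name for name, _ in sigs)
    return {
        "cases_started": started,
        "unique_users": len(users),
        "unique_admins": len(users & set(admins)),
    }


SAMPLE = """\
== Edit warring at Foo ==
Report text. [[User:Alice|Alice]] ([[User talk:Alice|talk]]) 10:15, 3 April 2017 (UTC)
: Reply. [[User:Bob|Bob]] 11:00, 3 April 2017 (UTC)
=== Proposal ===
: Support. [[User:Alice|Alice]] 12:00, 4 April 2017 (UTC)
== Older thread ==
Text. [[User:Carol|Carol]] 09:00, 30 March 2017 (UTC)
"""

print(general_stats(SAMPLE, admins={"Bob"}))
```

Month names are parsed with `%B`, so this assumes an English locale. Subsections (h3 and deeper) stay inside their parent case.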

Case specific stats (each h2 section is a different case)

  • Case title (h2 text)
  • anchored URL to case
  • Username of who filed the case
  • Calendar date the case was filed
  • List of usernames who participated in the case
  • List of admins who participated in the case
  • Number of unique users (signatures?)
  • Number of unique admins (signatures?)
  • Does the case contain the 'resolved' templates {{atop| and {{abot}}?
  • Does the case contain any of these keywords:
    • harass
    • hound or stalk
    • coi
    • 3RR
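
The template, keyword, and anchor fields above could be derived from a case's title and body roughly as follows. This is a sketch under stated assumptions: the keyword matching rules (substring for "harass", whole word for "coi" and "3RR") are one possible reading of the list, the anchor encoding only approximates what MediaWiki generates for headings containing markup, and `anchored_url` and `case_flags` are hypothetical helper names.

```python
import re
from urllib.parse import quote

# One possible reading of the keyword list; all matches are case-insensitive.
KEYWORDS = {
    "harass": re.compile(r"harass", re.IGNORECASE),          # also matches "harassment"
    "hound or stalk": re.compile(r"hound|stalk", re.IGNORECASE),
    "coi": re.compile(r"\bcoi\b", re.IGNORECASE),            # whole word only
    "3RR": re.compile(r"\b3RR\b", re.IGNORECASE),
}
ATOP_RE = re.compile(r"\{\{\s*atop\s*[|}]", re.IGNORECASE)
ABOT_RE = re.compile(r"\{\{\s*abot\s*\}\}", re.IGNORECASE)


def anchored_url(page, heading):
    """Build a link to a section; spaces become underscores as on-wiki."""
    base = "https://en.wikipedia.org/wiki/" + quote(page.replace(" ", "_"), safe="/:")
    return base + "#" + quote(heading.replace(" ", "_"), safe="")


def case_flags(body):
    """Return keyword hits and whether both 'resolved' templates are present."""
    flags = {name: bool(rx.search(body)) for name, rx in KEYWORDS.items()}
    flags["resolved"] = bool(ATOP_RE.search(body) and ABOT_RE.search(body))
    return flags


body = "{{atop|Blocked for 3RR.}}\nThis is harassment.\n{{abot}}"
print(case_flags(body))
print(anchored_url("Project:Archive 1", "Edit warring at Foo"))
```

Requiring both templates for `resolved` is a design choice. A stricter or looser rule (for example, `{{atop` alone) may fit the data better.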

Distribution of users on ANI from April 1 to May 31, 2017

  • Username, admin status (is/not), and number of cases where they participated
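
Once each case has a participant list, the distribution is a per-user count of distinct cases. A small sketch, assuming participants arrive as one iterable of usernames per case and that the admin set comes from a separate lookup. Both inputs and the function name are hypothetical.

```python
from collections import Counter


def user_distribution(cases, admins):
    """Return (username, is_admin, case_count) rows, busiest users first."""
    counts = Counter()
    for participants in cases:
        # set() so a user who signs several times in one case counts once
        counts.update(set(participants))
    rows = [(user, user in admins, n) for user, n in counts.items()]
    return sorted(rows, key=lambda r: (-r[2], r[0]))


cases = [["Alice", "Bob", "Alice"], ["Alice", "Carol"]]
for row in user_distribution(cases, admins={"Bob"}):
    print(row)
```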

Related Objects

Event Timeline

Create script that retrieves data from the API
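
One way the retrieval script might look, using the MediaWiki Action API (`action=query` with `prop=revisions`). The archive page naming in `ARCHIVE_FMT` and the User-Agent string are assumptions, and the helper names are hypothetical. The network call is kept in its own function so the URL building and response parsing can be checked offline.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

API = "https://en.wikipedia.org/w/api.php"
# Assumed archive naming; verify against the actual archive pages.
ARCHIVE_FMT = "Wikipedia:Administrators' noticeboard/IncidentArchive{n}"


def archive_titles(first=950, last=956):
    """Titles for the archives covering April and May 2017."""
    return [ARCHIVE_FMT.format(n=n) for n in range(first, last + 1)]


def build_query(title):
    """URL asking for the current wikitext of one page."""
    params = {
        "action": "query",
        "prop": "revisions",
        "rvprop": "content",
        "rvslots": "main",
        "titles": title,
        "format": "json",
        "formatversion": "2",
    }
    return API + "?" + urlencode(params)


def extract_wikitext(payload):
    """Pull the wikitext out of a formatversion=2 response."""
    page = payload["query"]["pages"][0]
    return page["revisions"][0]["slots"]["main"]["content"]


def fetch_wikitext(title):
    """Fetch one page over HTTP (not called in this sketch)."""
    req = Request(build_query(title), headers={"User-Agent": "ani-stats-sketch/0.1 (example)"})
    with urlopen(req, timeout=30) as resp:
        return extract_wikitext(json.load(resp))


print(archive_titles()[0])
print(build_query(archive_titles()[0]))
```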

Parse single value fields into a database

  • title, URL, username who filed case, calendar date of file, resolved, contains keywords

Build database schema
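
A possible schema, sketched with SQLite: one table for the single-value case fields and one for the per-case participant lists, from which the unique-user and unique-admin counts can be derived. Table and column names are illustrative assumptions, not the schema that was actually built.

```python
import sqlite3

SCHEMA = """
CREATE TABLE cases (
    id             INTEGER PRIMARY KEY,
    title          TEXT NOT NULL,
    url            TEXT NOT NULL,
    filed_by       TEXT,
    filed_on       TEXT,              -- ISO date, e.g. 2017-04-03
    resolved       INTEGER NOT NULL DEFAULT 0,
    kw_harass      INTEGER NOT NULL DEFAULT 0,
    kw_hound_stalk INTEGER NOT NULL DEFAULT 0,
    kw_coi         INTEGER NOT NULL DEFAULT 0,
    kw_3rr         INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE participants (
    case_id  INTEGER NOT NULL REFERENCES cases(id),
    username TEXT NOT NULL,
    is_admin INTEGER NOT NULL DEFAULT 0,
    PRIMARY KEY (case_id, username)   -- one row per user per case
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.execute(
    "INSERT INTO cases (id, title, url, filed_by, filed_on, resolved, kw_3rr) "
    "VALUES (1, 'Edit warring at Foo', 'https://example.org/#Foo', 'Alice', '2017-04-03', 1, 1)"
)
conn.executemany(
    "INSERT INTO participants (case_id, username, is_admin) VALUES (?, ?, ?)",
    [(1, "Alice", 0), (1, "Bob", 1)],
)

# Unique users and unique admins for a case fall out of the participants table.
unique_users, unique_admins = conn.execute(
    "SELECT COUNT(*), SUM(is_admin) FROM participants WHERE case_id = 1"
).fetchone()
print(unique_users, unique_admins)
```

Storing participants as rows rather than as a delimited string keeps the distribution query (cases per user) a simple `GROUP BY username`.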

Parse for lists

  • list of usernames, list of admins, number of unique users, number of unique admins

Generate CSVs
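
CSV output can be produced with the standard `csv` module. In this sketch, list-valued fields such as participants are joined with `|`, which is assumed safe because MediaWiki does not allow that character in usernames. The field names and the `cases_to_csv` helper are hypothetical.

```python
import csv
import io

FIELDS = ["title", "filed_by", "filed_on", "participants"]


def cases_to_csv(rows, fieldnames=FIELDS):
    """Serialise case dicts to CSV text; list values are joined with '|'."""
    buf = io.StringIO(newline="")
    writer = csv.DictWriter(buf, fieldnames=fieldnames)
    writer.writeheader()
    for row in rows:
        flat = {
            key: "|".join(value) if isinstance(value, (list, tuple)) else value
            for key, value in row.items()
        }
        writer.writerow(flat)
    return buf.getvalue()


rows = [
    {
        "title": "Edit warring at Foo, again",  # the comma forces quoting
        "filed_by": "Alice",
        "filed_on": "2017-04-03",
        "participants": ["Alice", "Bob"],
    }
]
csv_text = cases_to_csv(rows)
print(csv_text)
```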

Generate graphs

TBolliger renamed this task from Parse ANI for data to Epic: Parse ANI for data. Jul 19 2017, 7:05 PM
TBolliger renamed this task from Epic: Parse ANI for data to Epic ⚡️ : Parse ANI for data.