Vet and explore new readership retention metric
Closed, Resolved (Public)

Description

(Task intended to form part of a possible Outreachy internship in data analysis with the WMF reading team)

Vet and explore a new privacy-friendly web readership retention metric (based on a data source selected by the Reading team), and build a reporting mechanism for it.
Deliverables:

  1. (~ 2 weeks) A report examining various possible specifications of this metric (e.g. choice of percentile), their possible data quality issues with suggestions for how to fix or mitigate them, and an assessment of their sensitivity and robustness
  2. (~ 1 week) An exploratory analysis showing how the chosen metric differs across various dimensions, e.g. project language or geographical region
  3. (~ 1 week) A workflow or an automated tool to regularly inform the Reading team and the Wikimedia movement on how this metric is developing

Event Timeline

This project is currently in progress. I've vetted different specifications of this metric (averages and various percentiles) and examined it across different dimensions (device type, Wikipedia project languages). I still need to finalize the queries to calculate the metric, create more charts, and set up an Oozie job to automate updating the metric.

MBinder_WMF triaged this task as Medium priority. Aug 2 2018, 8:17 PM
JKatzWMF added a subscriber: Zareenf.

The next step here is for @Tbayer to break up the remaining items on this task as necessary so that someone can work on them.

Resetting task assignee as the user is not active here anymore.

kzimmerman claimed this task.
kzimmerman subscribed.

Resolving since the subtasks were resolved.

Also, due to data issues identified with the metric (as well as maintenance concerns), we decommissioned the pipeline.