
Content dataset with pageviews, edits and content properties
Closed, Declined · Public

Description

Deliverable: a table in Hive or in the MySQL "staging" database that contains pageviews, edit counts, and article properties, including:

  • page length
  • age
  • number of interwiki links
  • number of images
  • language
  • time since last edit

The initial version of the dataset should cover the last 6 months of data at monthly granularity.
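To make the requested shape concrete, here is a minimal Python sketch of one row per article per month, plus a helper that lists the months an initial 6-month load would cover. All field names, the `ArticleMonth` class, and the `last_n_months` helper are hypothetical. The choice to count only complete months before the run date is also an assumption, not something the task specifies.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ArticleMonth:
    """One row per article per month. Every field name here is hypothetical."""
    wiki: str                  # wiki database name, e.g. "enwiki"
    page_id: int
    month: str                 # "YYYY-MM"
    pageviews: int
    edits: int
    page_length: int           # assumed to be in bytes
    age_days: int              # days since the page was created
    interwiki_links: int
    images: int
    language: str
    days_since_last_edit: int


def last_n_months(today: date, n: int = 6) -> list[str]:
    """Return the n most recent complete months before `today`, oldest first."""
    year, month = today.year, today.month
    months = []
    for _ in range(n):
        # Step back one month, rolling over the year boundary when needed.
        month -= 1
        if month == 0:
            year, month = year - 1, 12
        months.append(f"{year:04d}-{month:02d}")
    return months[::-1]
```

For example, calling `last_n_months(date(2020, 11, 17))` yields the months from 2020-05 through 2020-10.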

With this dataset, we will be able to answer questions and create Superset dashboards around:

  • Number of articles with certain characteristics in each wiki
  • Pageview or edit counts of articles with certain characteristics in each wiki
  • etc...
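The kinds of questions listed above could be answered with a single grouped query. The sketch below uses an in-memory SQLite database as a stand-in for Hive or MySQL. The table name `content_dataset`, its columns, the `summarize` helper, and the sample rows are all made up for illustration and do not reflect any real schema or data.

```python
import sqlite3

# In-memory stand-in for the proposed table; names and columns are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE content_dataset (
        wiki TEXT,
        page_id INTEGER,
        month TEXT,
        pageviews INTEGER,
        edits INTEGER,
        page_length INTEGER,
        images INTEGER
    )
    """
)

# Toy rows, invented purely so the query has something to aggregate.
sample_rows = [
    ("enwiki", 1, "2020-10", 1000, 5, 12000, 3),
    ("enwiki", 2, "2020-10", 200, 1, 800, 0),
    ("enwiki", 3, "2020-10", 50, 0, 15000, 1),
    ("dewiki", 1, "2020-10", 300, 2, 9000, 2),
    ("dewiki", 1, "2020-09", 250, 4, 8500, 2),
]
conn.executemany(
    "INSERT INTO content_dataset VALUES (?, ?, ?, ?, ?, ?, ?)", sample_rows
)


def summarize(conn, month, min_length):
    """Per wiki: article count, pageviews, and edits for articles
    at least `min_length` long in the given month."""
    query = """
        SELECT wiki, COUNT(*), SUM(pageviews), SUM(edits)
        FROM content_dataset
        WHERE month = ? AND page_length >= ?
        GROUP BY wiki
        ORDER BY wiki
    """
    return conn.execute(query, (month, min_length)).fetchall()


result = summarize(conn, "2020-10", 5000)
```

The same `GROUP BY wiki` pattern with a different `WHERE` clause would cover other characteristics, such as image count or time since last edit.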

Event Timeline

cchen triaged this task as Medium priority. Nov 17 2020, 11:13 PM
cchen created this task.
cchen moved this task from Triage to Current Quarter on the Product-Analytics board.
kzimmerman changed the subtype of this task from "Spike" to "Task". Aug 30 2021, 11:10 PM

We want to do this work but don't have the bandwidth on our team to support it. There are also dependencies on Data Engineering which we would need to plan for with their support. Keeping this in the backlog, pending future planning.

mpopov subscribed.

We need to get the technical requirements first.