
Sample content dataset to share in the Product Leads meeting
Closed, Resolved (Public)

Description

Deliverable
Create content datasets to explore in Superset around the following use cases:

  • Top topics on Wikipedia by pageviews
  • Top topics on Wikipedia by edits, categorized by editor type (anonymous editor vs. registered editor)

Acceptance Criteria

  • Datasets can be updated manually (automating the updates is out of scope)
  • Pageviews and editors datasets are updated monthly, through the end of Q4
  • Pageviews dataset fields: pageviews, project, country, topics, date
  • Edits dataset fields: edits, project, bot/non-bot, editor type (anonymous editor vs. registered editor), topics, date
  • Datasets are QAed (code review, reasonableness checks)
  • Datasets can be explored in Superset
  • Demo of results/data in the Product Leads meeting
  • Data dictionary entries for the staging tables, with links to the relevant code
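
As a rough illustration of the two dataset shapes listed above and the "top topics" use cases, here is a minimal Python sketch. It is not the code behind the actual staging tables: the field names (`topic`, `editor_type`, `is_bot`), the helper `top_topics`, and all sample values are hypothetical, and it assumes one topic per row even though the criteria list "topics".

```python
from collections import defaultdict

# Hypothetical sample rows. Field names loosely mirror the acceptance
# criteria; the values are made up for illustration only.
pageview_rows = [
    {"date": "2021-04", "project": "en.wikipedia", "country": "US", "topic": "Biography", "pageviews": 120},
    {"date": "2021-04", "project": "de.wikipedia", "country": "DE", "topic": "Biography", "pageviews": 80},
    {"date": "2021-04", "project": "en.wikipedia", "country": "US", "topic": "Sports", "pageviews": 150},
    {"date": "2021-04", "project": "en.wikipedia", "country": "US", "topic": "Geography", "pageviews": 60},
]

edit_rows = [
    {"date": "2021-04", "project": "en.wikipedia", "is_bot": False, "editor_type": "anonymous", "topic": "Biography", "edits": 10},
    {"date": "2021-04", "project": "en.wikipedia", "is_bot": False, "editor_type": "registered", "topic": "Biography", "edits": 30},
    {"date": "2021-04", "project": "en.wikipedia", "is_bot": False, "editor_type": "registered", "topic": "Sports", "edits": 20},
    {"date": "2021-04", "project": "en.wikipedia", "is_bot": True, "editor_type": "registered", "topic": "Sports", "edits": 5},
]


def top_topics(rows, measure, group_keys=("topic",), n=3):
    """Sum `measure` per combination of `group_keys` and return the n largest.

    Ties are broken by the group key so the ordering is deterministic.
    """
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row[k] for k in group_keys)
        totals[key] += row[measure]
    return sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))[:n]


# Use case 1: top topics by pageviews.
top_by_pageviews = top_topics(pageview_rows, "pageviews")

# Use case 2: top topics by edits, split by editor type.
top_by_edits = top_topics(edit_rows, "edits", group_keys=("topic", "editor_type"))

print(top_by_pageviews)
print(top_by_edits)
```

In Superset the same grouping would normally be expressed as a chart over the staging tables; the sketch only shows which dimensions and measures each use case needs.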

Deliveries:

  • Slide deck presented in Product Leads meeting: link

Pageview data:

Edit data:

Details

Other Assignee
jwang

Event Timeline

cchen triaged this task as Medium priority. May 4 2021, 9:02 PM
cchen created this task.
kzimmerman changed the subtype of this task from "Spike" to "Task". May 4 2021, 9:12 PM
cchen edited projects, added Product-Analytics (Kanban); removed Product-Analytics.
cchen updated the task description.
cchen moved this task from Next 2 weeks to Blocked on the Product-Analytics (Kanban) board.
cchen moved this task from Blocked to Doing on the Product-Analytics (Kanban) board.
cchen updated the task description.