
Oozie job to compute geowiki on top of sqooped data
Closed, Resolved · Public · 13 Estimated Story Points

Description

From the meeting with Asaf and the analytics-internal thread titled "data output by geowiki", the decision was to keep the data as granular as the current geowiki, but to report only monthly lines of the form:

month, # of editors making 1+ edits, # of editors making 5+ edits, # of editors making 100+ edits
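The monthly line above amounts to counting distinct editors at each activity threshold. A minimal sketch of that logic in Python (the actual job is a Hive query in analytics/refinery; the function and data names here are hypothetical):

```python
def monthly_geowiki_line(month, edit_counts):
    """Given a mapping of editor -> number of edits in `month`,
    return the reported line: (month, editors with 1+ edits,
    editors with 5+ edits, editors with 100+ edits)."""
    def editors_with_at_least(n):
        return sum(1 for edits in edit_counts.values() if edits >= n)

    return (
        month,
        editors_with_at_least(1),
        editors_with_at_least(5),
        editors_with_at_least(100),
    )

# Hypothetical per-editor edit counts for one month:
counts = {"editorA": 3, "editorB": 120, "editorC": 5}
print(monthly_geowiki_line("2018-02", counts))  # → ('2018-02', 3, 2, 1)
```

Note that the thresholds are cumulative: an editor with 120 edits is counted in all three columns, which matches reporting "# of editors making 1+ / 5+ / 100+ edits" rather than disjoint activity buckets.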

This task is to create an Oozie job to compute the data and load it into Druid, and a Superset dashboard to present it.

Event Timeline

Milimetric created this task.
Milimetric moved this task from Next Up to In Progress on the Analytics-Kanban board.

Change 413265 had a related patch set uploaded (by Milimetric; owner: Milimetric):
[analytics/refinery@master] Compute geowiki statistics from cu_changes data

https://gerrit.wikimedia.org/r/413265

Change 413265 merged by Milimetric:
[analytics/refinery@master] Compute geowiki statistics from cu_changes data

https://gerrit.wikimedia.org/r/413265

How do we ensure this job runs after the partition job?
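The usual Oozie answer to this ordering question is a coordinator dataset dependency: the coordinator only materializes a run once the upstream (sqoop/partition) job has written its output and dropped a done-flag. A sketch, assuming the upstream job publishes a `_SUCCESS`-flagged monthly partition; all names, paths, and properties here are hypothetical, not the actual refinery configuration:

```xml
<coordinator-app name="geowiki-monthly-coord"
                 frequency="${coord:months(1)}"
                 start="${start_time}" end="${stop_time}" timezone="UTC"
                 xmlns="uri:oozie:coordinator:0.4">
  <datasets>
    <!-- Hypothetical dataset written by the upstream sqoop/partition job. -->
    <dataset name="cu_changes" frequency="${coord:months(1)}"
             initial-instance="${start_time}" timezone="UTC">
      <uri-template>${data_directory}/cu_changes/month=${YEAR}-${MONTH}</uri-template>
      <!-- The coordinator waits until this flag exists in the partition dir. -->
      <done-flag>_SUCCESS</done-flag>
    </dataset>
  </datasets>
  <input-events>
    <data-in name="cu_changes_input" dataset="cu_changes">
      <instance>${coord:current(0)}</instance>
    </data-in>
  </input-events>
  <action>
    <workflow>
      <app-path>${workflow_file}</app-path>
    </workflow>
  </action>
</coordinator-app>
```

With this setup the geowiki computation never has to know the partition job's schedule; it simply blocks on the dataset instance for its month until the done-flag appears.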