Deliverable: a table in Hive or in the MySQL "staging" database that holds page views, edit counts, and article properties, including:
- page length
- age
- number of interwiki links
- number of images
- language
- time since last edit
The initial version of the dataset should cover the last 6 months of data at monthly granularity.
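As a rough sketch of what one row of such a table might hold, and of which months the initial version would cover, something like the following could serve as a starting point. All field names, units (bytes, days), and the "YYYY-MM" month format are assumptions for illustration, as is the choice to count only complete months; none of them are specified above.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class ArticleMonthlyRow:
    """One article in one wiki for one monthly snapshot (hypothetical layout)."""
    wiki: str                   # e.g. "enwiki"
    page_id: int
    snapshot_month: str         # assumed format: "YYYY-MM"
    page_views: int
    edit_count: int
    page_length: int            # assumed unit: bytes
    age_days: int               # assumed unit: days since page creation
    interwiki_links: int
    image_count: int
    language: str
    days_since_last_edit: int   # assumed unit: days


def last_n_months(today: date, n: int = 6) -> list[str]:
    """Return the n most recent complete months before `today`, oldest first."""
    year, month = today.year, today.month
    months = []
    for _ in range(n):
        # Step back one month, rolling over the year boundary when needed.
        month -= 1
        if month == 0:
            year, month = year - 1, 12
        months.append(f"{year:04d}-{month:02d}")
    return months[::-1]
```

For example, `last_n_months(date(2020, 3, 15))` yields the six snapshots from "2019-09" through "2020-02".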
With this dataset, we will be able to answer questions and create Superset dashboards around:
- Number of articles with certain characteristics in each wiki
- Pageview or edit counts of articles with certain characteristics in each wiki
- etc...
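The kind of question listed above could be answered with a simple grouped query. The sketch below uses Python's built-in sqlite3 module as a stand-in so that it runs on its own; the real table would live in Hive or MySQL. The table name `article_properties_monthly`, its columns, and the sample rows are all made up for illustration.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory stand-in for the staging table with made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE article_properties_monthly (
            wiki TEXT, page_id INTEGER, snapshot_month TEXT,
            page_views INTEGER, edit_count INTEGER,
            page_length INTEGER, image_count INTEGER
        )
        """
    )
    rows = [
        ("enwiki", 1, "2020-01", 100, 3, 5000, 0),
        ("enwiki", 2, "2020-01", 40, 1, 800, 2),
        ("dewiki", 3, "2020-01", 10, 0, 300, 0),
    ]
    conn.executemany(
        "INSERT INTO article_properties_monthly VALUES (?, ?, ?, ?, ?, ?, ?)",
        rows,
    )
    return conn


def articles_without_images(conn: sqlite3.Connection, month: str) -> list[tuple]:
    """Per wiki: article count, page views, and edits for image-less articles."""
    return conn.execute(
        """
        SELECT wiki, COUNT(*), SUM(page_views), SUM(edit_count)
        FROM article_properties_monthly
        WHERE snapshot_month = ? AND image_count = 0
        GROUP BY wiki
        ORDER BY wiki
        """,
        (month,),
    ).fetchall()
```

Calling `articles_without_images(build_demo_db(), "2020-01")` returns one row per wiki. A query of the same shape could back a Superset chart, with the `WHERE` clause swapped for whichever article characteristic is of interest.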