- new name (in HDFS): projectview/data/projectview2015/2015-05/projectcounts-20150101-000000
- From the intermediate hourly pageview aggregate table, which uses the new pageview definition, generate flat files in the same format as http://dumps.wikimedia.org/other/pagecounts-all-sites/2015/2015-01/projectcounts-20150101-000000. These files will hold per-hour, per-project aggregates under the *new* pageview definition. Take care to write a translation function that behaves like these examples:
  - F(project = en.wikipedia, is_zero = true) = en.zero
  - F(project = en.wikipedia, access_method IN (mobile web, mobile app)) = en.m
  - F(project = en.wikipedia) = en
  - Take advantage of the existing pageview_counts code (https://github.com/wikimedia/analytics-refinery/blob/master/oozie/pagecounts-all-sites/load/insert_hourly_pagecounts.hql)
- Ensure Python aggregates those new files and pushes them to the GitHub repository
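
The translation function F described above could be sketched in Python roughly as follows. This is only an illustration of the three listed examples, not the existing refinery code: the function name `project_abbreviation` and the `"desktop"` default are hypothetical, checking is_zero before access_method is an assumption (the examples do not say which wins when both apply), and projects other than *.wikipedia are rejected because the examples do not cover them.

```python
# Hypothetical helper; name, defaults, and rule precedence are assumptions.
MOBILE_ACCESS_METHODS = ("mobile web", "mobile app")
WIKIPEDIA_SUFFIX = ".wikipedia"


def project_abbreviation(project, access_method="desktop", is_zero=False):
    """Map a project name to the abbreviation used in projectcounts files.

    Only the *.wikipedia cases from the examples are handled.
    """
    if not project.endswith(WIKIPEDIA_SUFFIX):
        # The examples only show en.wikipedia; other project families
        # are deliberately left out of this sketch.
        raise ValueError("unsupported project: %s" % project)

    # "en.wikipedia" -> "en"
    base = project[:-len(WIKIPEDIA_SUFFIX)]

    # Assumption: the zero rule takes precedence over the mobile rule.
    if is_zero:
        return base + ".zero"
    if access_method in MOBILE_ACCESS_METHODS:
        return base + ".m"
    return base


if __name__ == "__main__":
    print(project_abbreviation("en.wikipedia", is_zero=True))
    print(project_abbreviation("en.wikipedia", access_method="mobile web"))
    print(project_abbreviation("en.wikipedia"))
```

If the same mapping is implemented in HQL instead, these three cases can serve as a quick check that both versions agree.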