
Analytics Engineer has an Oozie job to aggregate page views by time
Closed, Duplicate (Public)

Description

Input: the refined logs in Hive (rows that were determined to be a page view are tagged as such)
Output: similar to pagecounts-all-sites (though the input for that job is different)
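
As a rough illustration of the aggregation described above, here is a minimal Python sketch that counts tagged page-view rows per project, page, and hour. It is not the job's actual implementation (that would be an Oozie workflow running HQL against Hive); the field names `is_pageview`, `project`, `page_title`, and `dt` are hypothetical stand-ins for whatever the refined logs actually expose.

```python
from collections import Counter
from datetime import datetime


def aggregate_pageviews_hourly(rows):
    """Count page views per (project, page_title, hour).

    `rows` is an iterable of dicts. The keys used here are assumptions
    made for illustration, not the real refined-log schema.
    """
    counts = Counter()
    for row in rows:
        # Skip rows that were not tagged as page views.
        if not row["is_pageview"]:
            continue
        # Truncate the timestamp to the start of its hour.
        hour = datetime.fromisoformat(row["dt"]).replace(
            minute=0, second=0, microsecond=0
        )
        counts[(row["project"], row["page_title"], hour)] += 1
    return counts


# Hypothetical sample rows, for illustration only.
sample_rows = [
    {"is_pageview": True, "project": "en.wikipedia",
     "page_title": "Main_Page", "dt": "2015-01-01T10:15:00"},
    {"is_pageview": True, "project": "en.wikipedia",
     "page_title": "Main_Page", "dt": "2015-01-01T10:45:00"},
    {"is_pageview": False, "project": "en.wikipedia",
     "page_title": "Main_Page", "dt": "2015-01-01T10:50:00"},
    {"is_pageview": True, "project": "en.wikipedia",
     "page_title": "Main_Page", "dt": "2015-01-01T11:05:00"},
]

hourly_counts = aggregate_pageviews_hourly(sample_rows)
```

In the real job the same grouping would presumably be expressed as a `GROUP BY` over the tagged rows of the Hive table, with the hour taken from the table's time partitions rather than parsed from a timestamp string.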

Code for the pagecounts-all-sites Oozie jobs:
https://github.com/wikimedia/analytics-refinery/tree/master/oozie/pagecounts-all-sites
https://github.com/wikimedia/analytics-refinery/blob/master/oozie/pagecounts-all-sites/load/insert_hourly_pagecounts.hql