Migrate Oozie jobs for Pageview - Learning
This encompasses four Oozie jobs:
- oozie/learning/features/actor/hourly
- oozie/learning/features/actor/rollup/hourly
- oozie/learning/predictions/actor/hourly
- oozie/pageview/actor
Also, the terms learning, features and predictions were chosen when we expected new "ML"-style production jobs to land on the cluster. That has not happened so far. Do we want to rename them?
This task needs to be broken down into three subtasks:
- Make UDFs work with Spark (multithreading)
- Update the HQL files for the jobs to make them Spark-compliant
- Migrate the jobs to Airflow using the updated UDFs and HQL files
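
On the first subtask: Spark runs several tasks as threads inside one executor JVM, so a UDF that keeps mutable state in a shared static field can misbehave there even if it was fine under Hive. Whether our UDFs actually have this problem still needs checking. The sketch below only illustrates one common fix, a per-thread instance via ThreadLocal. The class name, the method and the date format are hypothetical and not taken from the existing UDF code.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.TimeZone;

// Hypothetical stand-in for the body of a UDF; not the actual UDF code.
final class ThreadSafeUdfSketch {

    // SimpleDateFormat is not thread-safe. A single shared static instance
    // could be corrupted when several Spark tasks call evaluate() at once
    // in the same executor JVM, so each thread gets its own instance.
    private static final ThreadLocal<SimpleDateFormat> HOUR_FORMAT =
        ThreadLocal.withInitial(() -> {
            SimpleDateFormat format = new SimpleDateFormat("yyyy-MM-dd'T'HH");
            format.setTimeZone(TimeZone.getTimeZone("UTC"));
            return format;
        });

    private ThreadSafeUdfSketch() {
        // Static helper only; no instances needed.
    }

    // Converts an hour string such as "2023-01-01T05" to epoch milliseconds.
    static long evaluate(String hour) {
        try {
            return HOUR_FORMAT.get().parse(hour).getTime();
        } catch (ParseException e) {
            throw new IllegalArgumentException("Unparseable hour: " + hour, e);
        }
    }
}
```

The same idea applies to any other per-call scratch state (buffers, parsers, caches): either make it local to the call, make it immutable, or keep one copy per thread as above.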