The output of this task should be a catalog of all existing scheduled jobs that Analytics is responsible for, along with other jobs whose owners are interested in migrating to Airflow. The final output doesn't have to be much more precise than the first pass; we just need a rough idea of what kinds of jobs we have so we can make good templates and implement them cleanly in the new system. A first pass at such a catalog is in this document:
https://docs.google.com/spreadsheets/d/1lfK5Idteh6zPSlCWyH34FJCl_Lcm8401Wm59Jgk-7wM/edit#gid=0
- Oozie
- Reportupdater
- timers/cron
- Refine