Airflow is a workflow orchestration system that the Search Platform team is trialing as a replacement for Oozie. After discussion with the Analytics team, we think a Ganeti VM in the analytics network is the right way to deploy this instance.
- set up and install the OS on airflow1001
- provision MariaDB (TBD, may reuse the an-coord1001 MariaDB instance)
- provision Spark and Hadoop client configuration
- deploy search/airflow and wikimedia/discovery/analytics repositories via scap
- initialize databases and verify proper operation
- deploy Airflow systemd units
- add Kerberos credentials
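
As a rough illustration of the systemd units step, here is a minimal Python sketch that renders unit files for the long-running Airflow daemons. The install path, AIRFLOW_HOME, service user, and unit names are all assumptions made for the example and are not decided in this task. Only the `airflow webserver`, `airflow scheduler`, and `airflow kerberos` subcommands come from Airflow itself. The real deployment would presumably manage these units through configuration management rather than a script like this.

```python
from textwrap import dedent

# Hypothetical values; this task does not specify any of them.
AIRFLOW_BIN = "/srv/airflow/venv/bin/airflow"  # assumed install path
AIRFLOW_HOME = "/srv/airflow"                  # assumed AIRFLOW_HOME
SERVICE_USER = "airflow"                       # assumed service account

# Unit name (hypothetical) -> Airflow CLI subcommand that runs as a daemon.
SERVICES = {
    "airflow-webserver": "webserver",
    "airflow-scheduler": "scheduler",
    "airflow-kerberos": "kerberos",  # Kerberos ticket renewer
}


def render_unit(name, subcommand):
    """Return the text of a simple systemd service unit for one daemon."""
    return dedent(f"""\
        [Unit]
        Description={name} (airflow {subcommand})
        After=network.target

        [Service]
        User={SERVICE_USER}
        Environment=AIRFLOW_HOME={AIRFLOW_HOME}
        ExecStart={AIRFLOW_BIN} {subcommand}
        Restart=on-failure

        [Install]
        WantedBy=multi-user.target
        """)


def render_all():
    """Map each unit file name to its rendered contents."""
    return {
        f"{name}.service": render_unit(name, subcommand)
        for name, subcommand in SERVICES.items()
    }


if __name__ == "__main__":
    # Print every unit so the output can be reviewed before installing it.
    for filename, text in render_all().items():
        print(f"# {filename}\n{text}")
```

Running the script prints one unit per daemon for review. Keeping the Kerberos renewer as its own unit lets it be restarted independently of the scheduler and webserver.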