Run a proof of concept in which a test dbt job runs in the test-k8s Airflow cluster using a customized SimpleSkeinOperator. We can supply the Skein operator with the conda-analytics Python environment, which comes with dbt installed.
Success criteria:
- A simple SELECT * FROM x dbt model in the dbt-jobs repository
- dbt-jobs repository files manually uploaded to a location on HDFS
- A test Airflow DAG consisting of a single SimpleSkeinOperator performing the following:
  - Download the dbt-jobs repository from HDFS to local disk
  - Activate the conda-analytics environment
  - Run the new dbt model
- Observe the Airflow logs of this test DAG and ensure that the operator performed correctly
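
The three operator steps could be chained into a single shell command along these lines. This is only a sketch under stated assumptions: the helper name `build_dbt_command`, all paths, and the model name are hypothetical, the environment is assumed to ship an activation script at `bin/activate`, and dbt is assumed to find a usable profile on its own. How the resulting string is handed to SimpleSkeinOperator is left out, since the operator's arguments are not described here.

```python
import shlex


def build_dbt_command(hdfs_repo_path, local_dir, conda_env_path, model_name):
    """Return one shell command string covering the three operator steps.

    All arguments are illustrative placeholders, not values from the real setup.
    """
    project_dir = f"{local_dir}/dbt-jobs"
    steps = [
        # 1. Copy the dbt-jobs repository from HDFS to local disk.
        f"hdfs dfs -get {shlex.quote(hdfs_repo_path)} {shlex.quote(project_dir)}",
        # 2. Activate the conda environment (assumes a bin/activate script exists).
        f"source {shlex.quote(conda_env_path + '/bin/activate')}",
        # 3. Run only the new model from the downloaded project.
        f"dbt run --project-dir {shlex.quote(project_dir)} --select {shlex.quote(model_name)}",
    ]
    # '&&' stops the chain at the first failing step, so a failed download
    # or activation surfaces in the Airflow logs instead of a confusing dbt error.
    return " && ".join(steps)


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    print(build_dbt_command("/tmp/dbt-jobs", "/tmp/work", "/opt/env", "select_all_from_x"))
```

Joining the steps with `&&` keeps the Airflow log easy to read for the final success criterion: the last command that appears in the log is the one that failed, or the dbt run itself if everything worked.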