Description
We should write and test a complete Airflow DAG that mirrors one of our production jobs.
It should include both a SparkSQL query (converted from Hive) and a Spark job.
If necessary, we can write two separate jobs, one to test each of these connectors.
Details
| Status | Assigned | Task |
|---|---|---|
| Resolved | odimitrijevic | T282033 Airflow collaborations |
| Resolved | odimitrijevic | T271429 Replace Oozie with better workflow scheduler |
| Duplicate | None | T284172 [SPIKE] analytics-airflow jobs development |
| Resolved | mforns | T285692 Write a job entirely in Airflow with spark and/or sparkSQL |
Event Timeline
Change 702668 had a related patch set uploaded (by Mforns; author: Mforns):
[analytics/refinery@master] Add airflow DAG for anomaly detection (POC)
Change 707489 had a related patch set uploaded (by Ottomata; author: Ottomata):
[operations/puppet@production] airflow - set default smtp settings
Change 707489 merged by Ottomata:
[operations/puppet@production] airflow - set default smtp settings
Change 707517 had a related patch set uploaded (by Mforns; author: Mforns):
[analytics/refinery/source@master] Simplify RSVD anomaly detection job for Airflow POC
Change 708314 had a related patch set uploaded (by Ottomata; author: Ottomata):
[operations/puppet@production] Move airflow-analytics-test instance to an-test-client1001
Change 708314 merged by Ottomata:
[operations/puppet@production] Move airflow-analytics-test instance to an-test-client1001
Change 707517 merged by jenkins-bot:
[analytics/refinery/source@master] Simplify RSVD anomaly detection job for Airflow POC
Change 702668 abandoned by Mforns:
[analytics/refinery@master] Add airflow DAG for anomaly detection (POC)
Reason:
This code has been migrated to https://gitlab.wikimedia.org/repos/data-engineering/airflow-dags