Goal:
Migrate the mediawiki-history-dumps job to Airflow (Spark job).
Job Details:
| Input | Processing | Output |
|---|---|---|
| Hive Table | Spark | Archive |
Success Criteria:
- The one monthly job is migrated (SLA: 35 days)
| Status | Assigned | Task |
|---|---|---|
| Resolved | odimitrijevic | T282033 Airflow collaborations |
| Resolved | odimitrijevic | T271429 Replace Oozie with better workflow scheduler |
| Resolved | odimitrijevic | T299074 Migrate Oozie jobs to Airflow |
| Resolved | xcollazo | T300344 Low Risk Oozie Migration: Mediawiki History Dumps |
Delivered via:
https://gitlab.wikimedia.org/repos/data-engineering/airflow-dags/-/merge_requests/127
and
https://gitlab.wikimedia.org/repos/data-engineering/airflow-dags/-/merge_requests/128
Opened follow-up task T316371 to migrate to Spark 3.
Verified working on prod. Closing.