[Airflow] Migrate Druid loading Oozie jobs - Parent task
Closed, Resolved · Public · 0 Estimated Story Points

Description

Migrate all Oozie Druid-loading jobs to Airflow.
This task assumes that at least one Druid-loading job already works in Airflow and can be used as a reference.
For each Oozie job to migrate, the workload should be the following:

  • Write the DAG (copy the existing reference DAG and adapt it to the specific dataset and its data sources) and its unit tests.
  • Test the DAG in a dev instance.
  • Code review.
  • Deploy: kill the old Oozie job and enable the new Airflow job.
  • Test that everything works in prod.

Event Timeline

@mforns / @JAllemandou can you please fill in the details of this one to make it easier for folks to size? Thank you!

EChetty set the point value for this task to 9. (Jan 16 2023, 4:18 PM)

Change 886114 had a related patch set uploaded (by Mforns; author: Mforns):

[analytics/refinery/source@master] Support snapshot partitioning in HiveToDruid and DataFrameToDruid

https://gerrit.wikimedia.org/r/886114

Change 886114 merged by jenkins-bot:

[analytics/refinery/source@master] Support snapshot partitioning in HiveToDruid and DataFrameToDruid

https://gerrit.wikimedia.org/r/886114

This is temporarily in review to get opinions on how I handled the interaction between the delayed daily timetable and our datasets idea. It's somewhat hard-coded, but I think it's simpler than a more flexible approach. Let me know what you think. (Assigned to Sandra, but anyone is welcome to comment.)

https://gitlab.wikimedia.org/repos/data-engineering/airflow-dags/-/merge_requests/217

Change 904609 had a related patch set uploaded (by Mforns; author: Mforns):

[operations/puppet@production] modules::profile::manifests::airflow.pp: add plugins_folder path

https://gerrit.wikimedia.org/r/904609

This task was underestimated, and we should split it into subtasks, one per Druid job.
I will do so and set all of those subtasks as children of this one.

mforns removed mforns as the assignee of this task. (Apr 5 2023, 3:07 PM)
mforns changed the point value for this task from 9 to 0.
mforns renamed this task from [Migration] Oozie jobs for Druid to [Airflow] Migrate Oozie jobs for Druid - Parent task. (Apr 5 2023, 3:18 PM)
mforns renamed this task from [Airflow] Migrate Oozie jobs for Druid - Parent task to [Airflow] Migrate Druid loading Oozie jobs - Parent task.

Change 904609 merged by Stevemunene:

[operations/puppet@production] modules::profile::manifests::airflow.pp: add plugins_folder path

https://gerrit.wikimedia.org/r/904609