[Airflow] Organize hackathon
Closed, Resolved, Public

Description

Organize a hackathon related to the Airflow project.

Ideas for goals that we could want to achieve:

  • Get everybody in the team (or other teams!) familiar with the Airflow code, documentation, and the way of writing DAGs.
  • Give a push to the Oozie->Airflow migration, so we can migrate as many jobs as possible in a couple of days.
  • Discover and document corner cases, gotchas, and in general improve documentation where it's lacking, and open Phab tickets for things to improve.

People that might want to participate:

  • The Data Engineering team (especially developers/data engineers).
  • Other WMF Airflow developers/data engineers that want to learn more about our Airflow setup and/or want to develop DAGs on it.
  • People that want to help accelerate the Oozie->Airflow migration?

Structure (tentative and not in order):

  • 4 days.
    • Day 1:
      • Introduction to the hackathon (goals, structure, communication channels, docs, sync meetings, questions, etc.)
      • Go over list of Oozie jobs to migrate or tasks to do (people will receive them in advance).
      • Make teams and assign tasks/jobs.
      • Introduction to Airflow DAGs and gotchas (optional).
      • Start implementation.
    • Day 2:
      • Continue implementation and testing.
      • Sync meeting(s?).
      • Introduction to testing (optional).
      • Troubleshooting/code-review room/channel.
    • Day 3:
      • Finish implementation, continue testing, and start deployments.
      • Sync meeting(s?).
      • Introduction to deployments (optional).
      • Troubleshooting/code-review room/channel.
    • Day 4:
      • Finish testing and deployments.
      • Troubleshooting/code-review room/channel.
      • Closing meeting: Celebrate progress and things learned. Showcase/demos. Share lessons learned, gotchas, and other comments (Oozie vs. Airflow thoughts?).

When?

  • May 16th to 19th

TODOs before the hackathon

  • Make sure documentation is comprehensive enough for people not familiar with Airflow to catch up.
  • Create a list of tasks/jobs to migrate.
  • Prepare an introduction to DAGs and gotchas.
  • Prepare an introduction to testing.
  • Prepare an introduction to deployments.

Event Timeline

mforns triaged this task as High priority. Nov 5 2021, 9:48 PM
mforns moved this task from Backlog to Discussed (Radar) on the Data Pipelines board.
mforns moved this task from Backlog to Estimated on the Data Pipelines board.
mforns moved this task from Next Up to In Progress on the Data-Engineering-Kanban board.