Airflow collaborations
Closed, Resolved (Public)

Description

We, Analytics and Platform Engineering, are tasked with building infrastructure to support data pipelines of the form "crunch some data and publish the results". We chose Airflow for the scheduling part of this infrastructure, and we will keep in touch as we:

  • catalog and categorize existing and potential future jobs
  • build a minimal set of clean modular templates that can handle these jobs
  • iterate implementing and deploying one job at a time, learning as we go
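
As a rough illustration of what one "clean modular template" could look like, here is a minimal sketch in plain Python. It is not Airflow code and not something we have built; the names CrunchAndPublishJob, sum_views_by_project, and publish_to_memory are all made up for the example. The only idea it tries to show is that the crunch step stays a pure function and the publish step is passed in, so a job can be tested without a scheduler.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Hypothetical row shape for the example: (project, views).
Row = Tuple[str, int]


@dataclass(frozen=True)
class CrunchAndPublishJob:
    """Hypothetical template for a "crunch some data and publish the results" job."""

    name: str
    crunch: Callable[[List[Row]], List[Row]]      # pure transformation, easy to unit test
    publish: Callable[[str, List[Row]], None]     # side effect, injected so tests can fake it

    def run(self, source: List[Row]) -> List[Row]:
        results = self.crunch(source)
        self.publish(self.name, results)
        return results


def sum_views_by_project(rows: List[Row]) -> List[Row]:
    """Example crunch step: total views per project, sorted by project name."""
    totals: Dict[str, int] = {}
    for project, views in rows:
        totals[project] = totals.get(project, 0) + views
    return sorted(totals.items())


# Example publish step: an in-memory stand-in for a real publishing target.
published: Dict[str, List[Row]] = {}


def publish_to_memory(job_name: str, results: List[Row]) -> None:
    published[job_name] = results


job = CrunchAndPublishJob(
    name="views_per_project",
    crunch=sum_views_by_project,
    publish=publish_to_memory,
)
results = job.run([("enwiki", 3), ("dewiki", 2), ("enwiki", 4)])
```

In a real deployment the scheduler would presumably be what calls something like job.run(...); how that maps onto Airflow is left open here.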

This can be a parent task to any related work, so we can keep in touch on progress and technical details.

Related Objects

Status | Assigned
Resolved | odimitrijevic
Resolved | None
Resolved | Clarakosi
Resolved | gmodena
Resolved | Ottomata
Resolved | odimitrijevic
Resolved | mforns
Resolved | mforns
Resolved | None
Resolved | Ottomata
Resolved | Ottomata
Duplicate | None
Resolved | mforns
Resolved | mforns
Resolved | odimitrijevic
Resolved | Antoine_Quhen
Resolved | Antoine_Quhen
Resolved | Snwachukwu
Duplicate | None
Resolved | Snwachukwu
Resolved | NOkafor-WMF
Resolved | Antoine_Quhen
Resolved | Snwachukwu
Resolved | Snwachukwu
Resolved | Antoine_Quhen
Resolved | ntsako
Resolved | xcollazo
Declined | EChetty
Open | None
Duplicate | mforns
Resolved | Ottomata
Resolved | Ottomata
Declined | None
Resolved | mforns

Event Timeline

You know there is another airflow-common-usage task around, here it is: T237361

@ArielGlenn oh yeah, I remember that. At this point we've pretty much decided to go forward with Airflow. I read T237361#5636979 and it seems to me you'll be able to do most of this on top of what we build here, but it would require some customization. Our requirements are a bit more basic, I think. Roughly:

  • testable
  • good UI for managing jobs (rerunning, checking status, etc.)
  • good set of templates that handle common jobs (we would be writing these based on the jobs that get migrated to the system)

So where are you at in the process? You can follow this task until we have something that's more ready for others to try, or you can jump in with us to build the infrastructure. Let me know and we can plan accordingly.

> <snip>
>
> So where are you at in the process? You can follow this task until we have something that's more ready for others to try, or you can jump in with us to build the infrastructure. Let me know and we can plan accordingly.

Well the problem is that I'm in the process of still being the only person on dumps, so it's going to be a good while before I can actually do work. I would definitely be interested in following along on the task though and maybe poking about in how things are configured and rolled out, as it happens.

Ottomata renamed this task from "AirFlow collaboration between PE and DE" to "Airflow collaborations". Jun 3 2021, 1:49 PM