Now that the FR Tech Airflow instance is set up, we need a way to move data between data lakes.
Related to T417213: Create FR Tech Airflow instance and T405360: Implement an Airflow operator for moving data from point A to B.
Here is an implementation idea that @AStein-WMF, @amastilovic and I discussed that can potentially solve this use case:
Use case details:
Note that this mechanism only covers moving data from HDFS to S3-compatible targets, so it is much more constrained than T405360.
@AStein-WMF, @amastilovic, did I miss something?
DPE SRE: Does the above idea sound reasonable? CC @BTullis.
This looks good to me! Thanks for writing it up! I assume the next step is for @BTullis to review and give his thoughts, but let me know if there's anything I can do to help!