The airflow-dags repository has drifted out of sync with the production environment in which Airflow runs on Kubernetes, so DAG developers risk hitting version discrepancies when their DAGs run in production.
The dev environment (in airflow-dags) manages dependencies with Conda, while the airflow repository uses standard pip. Dependency versions differ between the two, sometimes greatly.
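To make the discrepancy concrete, here is a minimal sketch of how pinned versions from the two environments could be compared. It assumes simple `name=version` pins on the Conda side and `name==version` pins on the pip side; `parse_pins` and `version_mismatches` are hypothetical helpers, not existing tooling in either repository.

```python
import re

# Matches "name=version" (Conda), "name==version" (pip), and an optional
# Conda build string such as "=py39_0". Anything else is ignored.
PIN = re.compile(r"([A-Za-z0-9_.\-]+)\s*={1,2}\s*([^\s=]+)(?:=\S+)?")


def parse_pins(lines):
    """Collect exact version pins from dependency lines into {name: version}."""
    pins = {}
    for raw in lines:
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if line.startswith("- "):            # Conda YAML list entry
            line = line[2:].strip()
        match = PIN.fullmatch(line)
        if match:
            pins[match.group(1).lower()] = match.group(2)
    return pins


def version_mismatches(dev_pins, prod_pins):
    """Return {name: (dev_version, prod_version)} for packages pinned differently."""
    shared = dev_pins.keys() & prod_pins.keys()
    return {
        name: (dev_pins[name], prod_pins[name])
        for name in sorted(shared)
        if dev_pins[name] != prod_pins[name]
    }
```

Feeding it the dependency lines of the Conda environment file and the pip requirements file would list every package whose pinned version differs. Unpinned packages and version ranges are deliberately ignored in this sketch.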
To bring the two environments to parity, we would like to:
- Use Poetry instead of Conda or pure pip for dependency management
- Synchronize or merge the two repositories so that we build layered Docker images on top of one another instead of maintaining separate images, for example: prod image -> test image -> lint image
- If possible, speed up the build of these Docker images so that testing and linting in GitLab CI are quick
- Update the README.md with instructions on how to build and run pytest and linters locally
- Ideally, implement client-side Git hooks that run tests and linters on developer machines before changes are committed to the airflow-dags repository
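
As a starting point for the hook idea in the last item, a pre-commit check runner could look roughly like the sketch below. The commands in `CHECKS` are assumptions: `flake8` stands in for whichever linters we settle on, and the `dags` and `tests` directory names are hypothetical.

```python
import subprocess
import sys

# Hypothetical checks; replace with the linters and test paths we agree on.
CHECKS = [
    ["flake8", "dags"],
    ["pytest", "-q", "tests"],
]


def run_checks(commands, runner=subprocess.call):
    """Run each command in order and stop at the first failure.

    Returns the exit code of the first failing command, or 0 if all pass.
    `runner` is injectable so the logic can be tested without real tools.
    """
    for command in commands:
        code = runner(command)
        if code != 0:
            print(f"check failed: {' '.join(command)}", file=sys.stderr)
            return code
    return 0
```

Saved as an executable `.git/hooks/pre-commit` script that ends with `sys.exit(run_checks(CHECKS))`, this would abort the commit whenever a check returns a non-zero exit code. Git does not version the `.git/hooks` directory, so the hook would still need to be distributed to developers in some way.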