As per the parent ticket, there are five DAGs running on the platform_eng Airflow instance that use WMF Data Workflow Utils to build a runtime conda environment based on miniconda:
- structured-data/image-suggestions
- structured-data/section-topics
- structured-data/section-image-recs # Also uses mambaforge
- structured-data/seal # Archived project; also uses mambaforge
- security/differential-privacy
Due to upcoming licence changes in the Anaconda project, we wish to ensure that all of these environments are using the latest version of the workflow utils conda pipeline, in which we switch from miniconda to miniforge.
Please would you take steps to upgrade these DAGs and deploy them, when convenient?
It should just be a case of ensuring that the .gitlab-ci.yml file in each repository references v0.19.0 of repos/data-engineering/workflow_utils, as described here:
https://wikitech.wikimedia.org/wiki/Data_Platform/Systems/Airflow/Developer_guide/Python_Job_Repos#GitLab_CI_setup
Some of the jobs also install mambaforge manually.
This should no longer be required and it would be best to remove this for the same reasons.
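As a rough aid for reviewing each repository, the two checks above (the pinned workflow_utils version and any leftover manual mambaforge install) could be sketched as a small Python script. This is only a sketch under assumptions: it assumes the CI file pulls in workflow_utils through a GitLab `include:` entry with `project:` and `ref:` keys, and the sample file contents, the template path, and the installer URL below are hypothetical, not taken from any of the real repositories.

```python
import re

# Values taken from this ticket.
PROJECT = "repos/data-engineering/workflow_utils"
REQUIRED_REF = "v0.19.0"


def find_refs(text: str) -> list[str]:
    """Return the `ref:` value of every include entry for PROJECT."""
    refs = []
    lines = text.splitlines()
    for i, line in enumerate(lines):
        if "project:" in line and PROJECT in line:
            # Scan the sibling keys of this include entry only.
            for follow in lines[i + 1:]:
                stripped = follow.strip()
                if not stripped or stripped.startswith("- ") or not follow.startswith(" "):
                    break  # end of this include entry
                match = re.match(r"ref:\s*['\"]?([^'\"\s#]+)", stripped)
                if match:
                    refs.append(match.group(1))
                    break
    return refs


def check_ci_config(text: str) -> list[str]:
    """Return a list of human-readable problems found in a .gitlab-ci.yml text."""
    problems = []
    refs = find_refs(text)
    if not refs:
        problems.append("no workflow_utils include with a ref found")
    for ref in refs:
        if ref != REQUIRED_REF:
            problems.append(f"workflow_utils ref is {ref}, expected {REQUIRED_REF}")
    for number, line in enumerate(text.splitlines(), start=1):
        if re.search(r"mamba[-_ ]?forge", line, re.IGNORECASE):
            problems.append(f"line {number}: manual mambaforge reference")
    return problems


# Hypothetical file contents, for illustration only.
SAMPLE = """\
include:
  - project: 'repos/data-engineering/workflow_utils'
    ref: v0.18.0
    file: '/hypothetical/path/to/template.yml'

before_script:
  - curl -L -o installer.sh https://example.org/Mambaforge-Linux-x86_64.sh
"""

if __name__ == "__main__":
    for problem in check_ci_config(SAMPLE):
        print(problem)
```

To use it against a real checkout, read the repository's .gitlab-ci.yml into a string and pass it to `check_ci_config`; an empty list means neither issue was detected by these simple text checks.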
There is a reference GitLab MR here, which you may find useful:
https://gitlab.wikimedia.org/repos/data-engineering/example-job-project/-/merge_requests/36/