
Update the product-analytics DAGs to use miniforge instead of condaforge
Closed, Resolved | Public

Description

As per the parent ticket, there are two DAGs running on the product-analytics team's Airflow instance which use WMF Data Workflow Utils to build a runtime conda environment, based on miniconda.

Due to upcoming licence changes in the Anaconda project, we wish to ensure that all of these environments are using the latest version of the workflow utils conda pipeline, in which we switch from miniconda to miniforge.

Please would you take steps to upgrade these DAGs and deploy them, when convenient?

It should just be a case of ensuring that the .gitlab-ci.yml file references v0.19.0 of repos/data-engineering/workflow_utils, as stated here:
https://wikitech.wikimedia.org/wiki/Data_Platform/Systems/Airflow/Developer_guide/Python_Job_Repos#GitLab_CI_setup
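
As a rough way to confirm the pin before merging, the check could be scripted along these lines. This is a minimal sketch, not part of workflow_utils: it assumes the CI file pulls in the templates through a GitLab `include:` entry with `project:` and `ref:` keys, and the function names, the sample text, and the `file:` path in it are hypothetical.

```python
import re

# Project path and target version, both taken from the task description.
PROJECT = "repos/data-engineering/workflow_utils"
TARGET_REF = "v0.19.0"

# Hypothetical CI file content; the `file:` path is a placeholder, not a real template.
SAMPLE = """\
include:
  - project: 'repos/data-engineering/workflow_utils'
    ref: v0.19.0
    file: '/path/to/template.yml'
"""


def workflow_utils_refs(ci_text: str) -> list[str]:
    """Return the `ref:` values of include entries that point at PROJECT.

    Line-based sketch: assumes each include entry starts with `- project:`
    and carries its `ref:` on a later line of the same entry.
    """
    refs = []
    in_target_entry = False
    for line in ci_text.splitlines():
        stripped = line.strip()
        project_match = re.match(r"-\s*project:\s*['\"]?([^'\"\s]+)['\"]?", stripped)
        if project_match:
            # A new include entry begins; remember whether it is the one we care about.
            in_target_entry = project_match.group(1) == PROJECT
            continue
        if stripped.startswith("- "):
            # Some other list item begins, so the previous entry has ended.
            in_target_entry = False
            continue
        ref_match = re.match(r"ref:\s*['\"]?([^'\"\s]+)['\"]?", stripped)
        if in_target_entry and ref_match:
            refs.append(ref_match.group(1))
    return refs


def uses_target_ref(ci_text: str) -> bool:
    """True only if PROJECT is included at least once and always at TARGET_REF."""
    refs = workflow_utils_refs(ci_text)
    return bool(refs) and all(ref == TARGET_REF for ref in refs)
```

Calling `uses_target_ref` on the text of a repository's CI file would flag any job repo still pinned to an older tag. A proper YAML parser would be more robust than this line-based scan, which is kept dependency-free for illustration.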

There is a reference GitLab MR here, which you may find useful:
https://gitlab.wikimedia.org/repos/data-engineering/example-job-project/-/merge_requests/36/

Details

Related Changes in GitLab:
Title | Reference | Author | Source Branch | Dest Branch
switch to miniforge | repos/product-analytics/automoderator-metrics-jobs!5 | kcvelaga | switch-to-miniforge | main
Switch to moderation job scripts miniforge | repos/product-analytics/moderation-mariadb-jobs!6 | kcvelaga | switch-to-miniforge | main

Event Timeline

mpopov triaged this task as High priority.

Mentioned in SAL (#wikimedia-operations) [2024-11-14T13:21:08Z] <kcvelaga@deploy2002> Started deploy [airflow-dags/analytics_product@c5ab766]: T379546

Mentioned in SAL (#wikimedia-operations) [2024-11-14T13:21:48Z] <kcvelaga@deploy2002> Finished deploy [airflow-dags/analytics_product@c5ab766]: T379546 (duration: 00m 54s)

KCVelaga_WMF changed the task status from Open to In Progress. Nov 14 2024, 1:28 PM

I deployed, paused, reparsed and unpaused all the DAGs - everything looks good. Also, deleted the old packages from the package registry.