
[M] Automate Airflow DAG release
Closed, Resolved · Public

Description

The current workflow to release a data pipeline is:

  1. run the trigger_release GitLab CI job in the data pipeline repo
  2. go to Packages and registries > Package Registry
  3. click on the first item, right-click the asset file name, and copy the URL
  4. branch out of airflow-dags
  5. update current_artifact in the DAG
  6. update the artifact file name and URL in the DAG config
  7. merge into main

Automate steps 2 to 7 through GitLab CI logic: ideally, it should live in trigger_release, so that we can perform a full release with one job.
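For reference, a rough sketch of what that CI logic could look like, written as the shell a trigger_release step (or a follow-up job) might run. Everything here is illustrative rather than the actual setup: the token variable, the airflow-dags project id, the jq dependency, and the assumption that the artifact is published as a generic package are all placeholders.

# Illustrative sketch only, not the real job. Assumes:
#   $GITLAB_API_TOKEN  - CI variable holding a token with api scope
#   $CI_PROJECT_ID     - the data pipeline repo (project A), provided by GitLab CI
#   $AIRFLOW_DAGS_ID   - numeric project id of repos/data-engineering/airflow-dags
set -euo pipefail
API="https://gitlab.wikimedia.org/api/v4"
AUTH="PRIVATE-TOKEN: ${GITLAB_API_TOKEN}"

# Steps 2-3: look up the newest package in the Package Registry and build the asset URL
pkg=$(curl -s -H "$AUTH" "${API}/projects/${CI_PROJECT_ID}/packages?order_by=created_at&sort=desc&per_page=1")
pkg_id=$(echo "$pkg" | jq -r '.[0].id')
pkg_name=$(echo "$pkg" | jq -r '.[0].name')
pkg_version=$(echo "$pkg" | jq -r '.[0].version')
file_name=$(curl -s -H "$AUTH" "${API}/projects/${CI_PROJECT_ID}/packages/${pkg_id}/package_files" | jq -r '.[0].file_name')
asset_url="${API}/projects/${CI_PROJECT_ID}/packages/generic/${pkg_name}/${pkg_version}/${file_name}"

# Step 4: create a release branch on airflow-dags
branch="bump-${pkg_name}-${pkg_version}"
curl -s -X POST -H "$AUTH" "${API}/projects/${AIRFLOW_DAGS_ID}/repository/branches?branch=${branch}&ref=main"

# Steps 5-6: edit artifacts.yaml and the DAG config on that branch,
# e.g. via the commits API with 'update' actions (omitted here)

# Step 7: open a merge request into main (an automatic merge would need the repo to allow it)
curl -s -X POST -H "$AUTH" \
  --data-urlencode "source_branch=${branch}" \
  --data-urlencode "target_branch=main" \
  --data-urlencode "title=Bump ${pkg_name} to ${pkg_version}" \
  "${API}/projects/${AIRFLOW_DAGS_ID}/merge_requests"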

Event Timeline

MarkTraceur renamed this task from Automate Airflow DAG release to [M] Automate Airflow DAG release. Dec 1 2022, 5:58 PM

I've been thinking about this. Basically we want the GitLab CI job of project A to know how to submit a merge request on project B.

It happens that on our project B, airflow-dags, the process to update a conda environment artifact for, say, image_suggestions_dag is:
(a) modify https://gitlab.wikimedia.org/repos/data-engineering/airflow-dags/-/blob/main/platform_eng/config/artifacts.yaml
(b) modify https://gitlab.wikimedia.org/repos/data-engineering/airflow-dags/-/blob/main/platform_eng/dags/image_suggestions_dag.py#L22

So yes, we can automate this, and perhaps parameterize the target DAG so we can reuse it for other conda environment updates (section-topics, etc.).
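To make that concrete, here is a minimal sketch of what such a parameterized step could do, assuming the only change needed is swapping the old artifact file name for the new one. The argument names, defaults, and clone URL are placeholders, not the real job:

# Minimal sketch: replace the old artifact name with the new one for a given team/DAG.
# All arguments are hypothetical, e.g. ./bump_artifact.sh old.tgz new.tgz platform_eng
OLD_ARTIFACT="$1"                          # file name currently listed in artifacts.yaml
NEW_ARTIFACT="$2"                          # file name of the newly published artifact
TEAM_DIR="${3:-platform_eng}"              # which team folder in airflow-dags to touch
DAG_FILE="${4:-image_suggestions_dag.py}"  # which DAG embeds the artifact name

git clone https://gitlab.wikimedia.org/repos/data-engineering/airflow-dags.git
cd airflow-dags
# naive literal replace; relies on the old name actually appearing in both files
sed -i "s|${OLD_ARTIFACT}|${NEW_ARTIFACT}|g" \
  "${TEAM_DIR}/config/artifacts.yaml" \
  "${TEAM_DIR}/dags/${DAG_FILE}"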

However, it seems to me that the automation would be quite limited:

  1. What if the update also requires changes in the parameters of the DAG tasks?
  2. What if the update happened after a manual hot fix, and whoever did the manual hot fix did not follow the naming convention of the artifact definition? Meaning that the next push would fail to update the target files properly.
  3. Merging into main cannot be automated because the airflow-dags repo requires at least one +1 from a Maintainer.

Since automation would be limited, I think this would yield little productivity gain, especially considering that in the last 5 months, we've only done 7 releases of image_suggestions.

@mfossati WDYT?

Hey @xcollazo,

  1. What if the update also requires changes in the parameters of the DAG tasks?

Fair enough. That would entail a manual merge request anyway and fall outside the scope of this automation.

  2. What if the update happened after a manual hot fix, and whoever did the manual hot fix did not follow the naming convention of the artifact definition? Meaning that the next push would fail to update the target files properly.

Not sure what you mean here. It's just the name of the artifact that has to be changed.

  3. Merging into main cannot be automated because the airflow-dags repo requires at least one +1 from a Maintainer.

This sounds like the major automation blocker to me.

My thoughts:

  • the automation is not worth it if a human airflow-dags maintainer still has to merge manually
  • automating releases that only change Spark processing (read: project A) logic would still be nice to have
  • what about adding specific GitLab CI jobs as airflow-dags maintainers? A job runs under a GitLab bot user; the section-topics one, for instance, is a maintainer.

the automation is not worth it if a human airflow-dags maintainer still has to merge manually

and

what about adding specific GitLab CI jobs as airflow-dags maintainers?

Having a human review changes to airflow-dags before they touch the main/production branch is by design, so we couldn't change that.

However, I do think that data-engineering folks, myself included, should not block you from merging changes to the platform_eng folder, so we should be able to give merge privileges to Structured Data folks.

What if the update happened after a manual hot fix, and whoever did the manual hot fix did not follow the naming convention of the artifact definition? Meaning that the next push would fail to update the target files properly.

Not sure what you mean here. It's just the name of the artifact that has to be changed.

Yes, and the script would have to know how to grep for the old one and change it. Thinking about it a bit more, this could be driven by a config file like we do with bumpversion: https://gitlab.wikimedia.org/repos/structured-data/image-suggestions/-/blob/main/.bumpversion.cfg. Actually, the script could just use bumpversion for this.
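As a hedged illustration of that idea, the job could let bump2version do the rewriting instead of grepping for the old name. The versions and file paths below are made up, and it assumes bump2version is installable in the CI image and that the artifact names embed a semver-style version:

# Illustrative only: rewrite the artifact version in both target files with bump2version,
# either driven by a .bumpversion.cfg in airflow-dags or by explicit file arguments as here.
pip install bump2version
bump2version \
  --current-version 0.9.0 --new-version 0.10.0 \
  --allow-dirty --no-commit --no-tag \
  patch \
  platform_eng/config/artifacts.yaml \
  platform_eng/dags/image_suggestions_dag.py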

xcollazo changed the task status from Open to In Progress.Jan 19 2023, 3:50 PM
xcollazo claimed this task.

Partial automation has been implemented for image-suggestions via MR https://gitlab.wikimedia.org/repos/structured-data/image-suggestions/-/merge_requests/9.

Details are available on the MR, but for completeness, the new deployment steps are as follows:

  1. On the left sidebar, go to CI/CD > Pipelines
  2. Click on the _play_ button and select trigger_release
  3. Wait until (2) is done. Then click on the _play_ button again, but this time select bump_on_airflow_dags. This will create a merge request over at the airflow-dags repo.
  4. Inspect the merge request over at airflow-dags, and then merge it.
  5. Deploy the DAGs:
me@my_box:~$ ssh deployment.eqiad.wmnet
me@deploy1002:~$ cd /srv/deployment/airflow-dags/platform_eng/
me@deploy1002:~$ git pull
me@deploy1002:~$ scap deploy

So a bit better than before.

I have left comments on the pipeline logic for possible future generalization of the solution so that other folks could benefit from it.

Since the pipeline does address the issues at hand, I've opted not to pursue the generalization for now so I can keep focusing on structured data work. Closing.