Move the dumps_v1 DAGs from the Airflow test_k8s instance to the main instance
Open, Medium, Public

Description

Whilst carrying out the work on T352650, we developed the DAGs on the test-k8s Airflow instance.

The reason for this initial configuration is that, at that time, we were using novel features, such as the KubernetesPodOperator, that were not available on all instances.

Now that the dumps_v1 processes are in production, we should migrate the DAGs from the test-k8s instance to another instance.
The test-k8s instance exists for running integration test DAGs and for testing new features, so it should not be running production pipelines.

I feel that the obvious choice is the main instance, which is where the majority of the DAGs that are the responsibility of the Data-Engineering team are running.

There will be certain required changes:

  • Ensure that the resource limits applied to the airflow-test-k8s namespace are replicated in airflow-main
  • Ensure that the RBAC policies will allow Airflow to launch tasks in the mediawiki-dumps-legacy namespace
  • Move the DAGs from /test-k8s to /main within the Airflow-DAGs repository
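
As a rough sketch of the first check, the limits of the two namespaces could be compared once they have been exported as simple key/value mappings. The helper name `diff_limits` and all of the quota keys and values below are hypothetical and for illustration only; the real limits would need to be read from the airflow-test-k8s and airflow-main namespaces.

```python
from __future__ import annotations


def diff_limits(
    source: dict[str, str], target: dict[str, str]
) -> dict[str, tuple[str, str | None]]:
    """Return every limit in `source` that is missing from, or different in, `target`.

    Each entry maps the limit name to (source value, target value or None).
    """
    return {
        key: (value, target.get(key))
        for key, value in source.items()
        if target.get(key) != value
    }


# Hypothetical values, not the real limits of either namespace.
test_k8s_limits = {"limits.cpu": "8", "limits.memory": "16Gi", "pods": "20"}
main_limits = {"limits.cpu": "8", "limits.memory": "8Gi"}

mismatches = diff_limits(test_k8s_limits, main_limits)
for key, (wanted, found) in sorted(mismatches.items()):
    print(f"{key}: airflow-test-k8s={wanted}, airflow-main={found}")
```

The comparison is a plain string match, so equivalent quantities written differently (for example `1` and `1000m`) would be reported as mismatches and would need a manual look.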

Event Timeline

Gehel triaged this task as Medium priority. Sep 23 2025, 1:20 PM
Gehel subscribed.

Do we want to have a dedicated Airflow instance for Dumps?