Now that we have organized our dumpsv1 DAG in a way we feel comfortable with (cf T390852), we can try to orchestrate the dumps of a larger number of wikis. We settled on the arbitrary number of 200.
Description
Details
Event Timeline
brouberol merged https://gitlab.wikimedia.org/repos/data-engineering/airflow-dags/-/merge_requests/1212
test_k8s/dumpsv1: introduce a way to exclude certain wikis from a dag run
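An exclusion mechanism like the one this MR introduces could look roughly like the sketch below. The `EXCLUDED_WIKIS` set, the `wikis_to_dump` helper, and the wiki names are all hypothetical illustrations, not taken from the actual merge request.

```python
# Hypothetical sketch: filter an exclusion list out of the wikis a DAG run processes.
# The set contents and the helper name are illustrative only.
EXCLUDED_WIKIS = {"enwiki", "wikidatawiki"}  # e.g. wikis handled by a dedicated DAG

def wikis_to_dump(all_wikis, excluded=EXCLUDED_WIKIS):
    """Return the wikis a DAG run should process, minus the excluded ones."""
    return [wiki for wiki in all_wikis if wiki not in excluded]

print(wikis_to_dump(["enwiki", "frwiki", "dewiki"]))  # → ['frwiki', 'dewiki']
```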
brouberol merged https://gitlab.wikimedia.org/repos/data-engineering/airflow-dags/-/merge_requests/1214
Increase the max active tasks from 6 to 16, to speed up the DAG execution
Change #1134985 had a related patch set uploaded (by Brouberol; author: Brouberol):
[operations/deployment-charts@master] airflow: set saner performance-related configs
Change #1134985 merged by Brouberol:
[operations/deployment-charts@master] airflow: set saner performance-related configs
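For context, Airflow exposes concurrency and scheduling knobs through `airflow.cfg` (or the corresponding `AIRFLOW__SECTION__KEY` environment variables, which a Helm chart can set through its values). The fragment below only illustrates the kind of settings involved; the actual values changed by the patch are not reproduced here.

```ini
# Illustrative airflow.cfg fragment -- example values, not the patch contents.
[core]
# Upper bound on concurrently running task instances per DAG
max_active_tasks_per_dag = 16

[scheduler]
# How many task instances the scheduler can schedule per query
max_tis_per_query = 16
```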
Change #1135001 had a related patch set uploaded (by Brouberol; author: Brouberol):
[operations/deployment-charts@master] airflow: scrape additional metrics
Change #1135419 had a related patch set uploaded (by Brouberol; author: Brouberol):
[operations/deployment-charts@master] airflow: increase pool metrics computation frequency
Change #1135001 merged by jenkins-bot:
[operations/deployment-charts@master] airflow: scrape additional metrics
Change #1135419 merged by jenkins-bot:
[operations/deployment-charts@master] airflow: increase pool metrics computation frequency
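Airflow emits its metrics over StatsD, and the frequency at which pool usage metrics are computed is governed by the scheduler's `pool_metrics_interval` setting. The fragment below is illustrative only; the host, port, and interval are example values, not those from the patch.

```ini
# Illustrative airflow.cfg fragment -- example values, not the patch contents.
[metrics]
# Emit Airflow metrics over StatsD (e.g. scraped via a statsd-exporter sidecar)
statsd_on = True
statsd_host = localhost
statsd_port = 9125

[scheduler]
# Seconds between computations of pool usage metrics
pool_metrics_interval = 5.0
```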
I think we can call this done.
We have now split our DAGs so that each one handles ~145 wikis, and two runs of different groups completed successfully yesterday.
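Splitting the full wiki list into fixed-size groups, each driven by its own DAG, can be sketched as follows. The helper name, the group size, and the generated wiki names are illustrative assumptions, not the actual implementation.

```python
# Hypothetical sketch: chunk a list of wikis into groups of at most `group_size`,
# each of which would be handled by its own DAG.
def split_into_groups(wikis, group_size):
    """Return successive chunks of at most `group_size` wikis."""
    return [wikis[i:i + group_size] for i in range(0, len(wikis), group_size)]

wikis = [f"wiki{i}" for i in range(290)]  # illustrative wiki names
groups = split_into_groups(wikis, 145)
print(len(groups), [len(g) for g in groups])  # → 2 [145, 145]
```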
