We need visibility into where Airflow spends its time so that we can tune its configuration and schedule the ~70,000 dumps tasks smoothly.
We can probably use https://github.com/databand-ai/airflow-dashboards as a reference.
| Subject | Repo | Branch | Lines +/- |
|---|---|---|---|
| airflow: scrape additional metrics | operations/deployment-charts | master | +253 -105 |

| Status | Assigned | Task |
|---|---|---|
| Open | None | T88728 Improve Wikimedia dumping infrastructure |
| Resolved | BTullis | T352650 WE 5.4 KR - Hypothesis 5.4.4 - Q3 FY24/25 - Migrate current-generation dumps to run on kubernetes |
| Resolved | brouberol | T388378 Orchestrate dumps v1 from an airflow instance |
| Resolved | brouberol | T390945 Run an experimental dump of 200 regular sized wikis |
| Resolved | brouberol | T391332 Create a dashboard displaying airflow pool, scheduler, executor and operator activity |
Change #1135001 had a related patch set uploaded (by Brouberol; author: Brouberol):
[operations/deployment-charts@master] airflow: scrape additional metrics
Change #1135001 merged by jenkins-bot:
[operations/deployment-charts@master] airflow: scrape additional metrics