
Investigate how to run more than 16 parallel dump task instances
Closed, Resolved · Public

Authored By: brouberol, Apr 11 2025, 10:15 AM

Description

We currently have core.parallelism set to 64, so we should theoretically be able to execute 64 dump tasks at the same time.

However, we will still be limited by:

  • the number of pool slots (adjustable at runtime)
  • the DAGs max_active_tasks parameter (currently set to 16)

The kubernetes_executor.worker_pods_creation_batch_size being set to 16 means that we won't be able to create the 64 pods in a single batch, but that's probably fine, as the pods will simply be delayed in the executor queue for a bit.

Same for the max_tis_per_query being set to 16. That will probably delay the scheduling of some tasks by a couple of scheduler loops. We should see whether we can increase it, though.
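A back-of-the-envelope sketch of the batching effect described above (the parameter names come from the Airflow config; the loop counts are an estimate of queueing rounds, not measured delays):

```python
import math

# With 64 runnable dump tasks but batch sizes of 16, both the executor's pod
# creation and the scheduler's task-instance queries need several rounds.
PARALLELISM = 64
WORKER_PODS_CREATION_BATCH_SIZE = 16
MAX_TIS_PER_QUERY = 16

def batches_needed(tasks: int, batch_size: int) -> int:
    """Number of rounds needed to process `tasks` items, `batch_size` at a time."""
    return math.ceil(tasks / batch_size)

# The executor creates pods 16 at a time, so 64 pods take 4 creation batches.
print(batches_needed(PARALLELISM, WORKER_PODS_CREATION_BATCH_SIZE))  # 4
# Likewise, the scheduler queues at most 16 task instances per scheduler loop.
print(batches_needed(PARALLELISM, MAX_TIS_PER_QUERY))  # 4
```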

Event Timeline

brouberol triaged this task as Medium priority.

This is our baseline: running dumps for 32 wikis with max_active_tasks=16 takes about 1h20min to get through 99% of the tasks.

NOTE: two wikis, afwiki and anwiki, are outliers in that batch, as their articlesdump task takes much longer than the 30 others.

Screenshot 2025-04-15 at 11.29.23.png (1×2 px, 503 KB)

{F59111465}

FWIW, the dump process puts a bit of CPU pressure on the k8s masters, but nothing dramatic.

Screenshot 2025-04-15 at 11.30.22.png (712×1 px, 157 KB)
Screenshot 2025-04-15 at 11.30.12.png (904×2 px, 294 KB)

We're interested in increasing our scheduling throughput, so the first step is to increase max_active_tasks from 16 to 32. This might collide with the scheduler.max_tis_per_query=16 Airflow config setting, especially as we increase the wiki count. We'll see. The same remark applies to kubernetes_executor.worker_pods_creation_batch_size: 16.

To prepare for 32 concurrent task executions, I'm going to also increase the mediawiki-dumps-legacy-regular pool slot count to 32.

I'm seeing a lot of

Forbidden: exceeded quota: quota-compute-resources, requested: limits.memory=4896Mi, used: limits.memory=150676Mi, limited: limits.memory=150Gi

If we want to run at least 32 parallel dump jobs, we need to either reduce the memory limits of the pods in the namespace, or increase the namespace quota.
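A quick sanity check with the numbers from the error above (assuming the 4896Mi limit is representative of every dump pod) shows why 32 parallel dumps cannot fit under the current quota:

```python
# Back-of-the-envelope check of the namespace memory quota, using the numbers
# from the "exceeded quota" error above.
QUOTA_MI = 150 * 1024        # limits.memory quota: 150Gi, expressed in Mi
DUMP_POD_LIMIT_MI = 4896     # limits.memory requested per dump pod

max_parallel_pods = QUOTA_MI // DUMP_POD_LIMIT_MI
print(max_parallel_pods)  # 31 -> even ignoring other pods, 32 dump pods can't fit
```

And this ignores every other pod in the namespace counting against the same quota, so the practical ceiling is lower still.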

Looking at the following graph, we seem to be requesting a lot of CPUs.

Screenshot 2025-04-15 at 12.25.18.png (1×1 px, 365 KB)

And indeed, each dump task actually triggers 2 pods:

  • the Airflow worker pod running the KubernetesPodOperator, in charge of creating and managing the actual dump pod
  • the actual dump pod

The first one is created with the same resources as any other task pod, i.e.

worker:
  resources:
    requests:
      cpu: 1000m
      memory: 1500Mi
    limits:
      cpu: 2000m
      memory: 3Gi

which is way too much for a pod that merely watches Kubernetes events and sends API requests.

Looking at the pod details dashboard, we see that the pod barely uses any CPU at all, and about 300MB of memory.

Screenshot 2025-04-15 at 12.29.52.png (1×2 px, 377 KB)

We need to adjust these requests/limits if we want to make better use of the available resources.
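A right-sized override for the operator's worker pod could look something like this; the values below are illustrative only (derived from the ~300MB memory / near-zero CPU usage observed above), not the values from the actual patch:

```yaml
# Hypothetical right-sized resources for the KubernetesPodOperator worker pod;
# the values are illustrative, based on the observed usage, with some headroom.
worker:
  resources:
    requests:
      cpu: 100m
      memory: 400Mi
    limits:
      cpu: 200m
      memory: 800Mi
```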

Change #1136706 had a related patch set uploaded (by Brouberol; author: Brouberol):

[operations/deployment-charts@master] airflow: ensure the pod running in the KubernetesPodOperator itself gets low resources

https://gerrit.wikimedia.org/r/1136706

Change #1136706 merged by Brouberol:

[operations/deployment-charts@master] airflow: ensure the pod running in the KubernetesPodOperator itself gets low resources

https://gerrit.wikimedia.org/r/1136706

Change #1136726 had a related patch set uploaded (by Brouberol; author: Brouberol):

[operations/deployment-charts@master] airflow: hotfix: only assign low resources to kubernetes pod operator pods

https://gerrit.wikimedia.org/r/1136726

Change #1136726 merged by Brouberol:

[operations/deployment-charts@master] airflow: hotfix: only assign low resources to kubernetes pod operator pods

https://gerrit.wikimedia.org/r/1136726

brouberol reopened this task as In Progress.

I've run a dump of 64 wikis, with 32 concurrent tasks. I aborted it as I was seeing many memory-quota errors, as shown in the following graph:

Screenshot 2025-04-18 at 11.34.32.png (1×2 px, 513 KB)

We'll probably need to increase the memory quota to 200GB if we want to be able to run 32 parallel dumps.

By the way, our effort to reduce resource requests/limits cluster-wide was successful! By downsizing 2 Flink apps from 20 to 2 (a change that should have been applied months ago but wasn't), as well as reducing the resources allotted to the canary events, we've been able to significantly increase the available resources: {F59288535}

Screenshot 2025-04-24 at 10.33.05.png (108×2 px, 60 KB)

We can get to 32 parallel dump tasks by setting kubernetes_executor.worker_pods_creation_batch_size: 32 and max_active_tasks: 32 on the DAG, as well as 32 pool slots.
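Sketched as configuration (assuming Airflow 2.x-style config section names; how these are actually deployed — airflow.cfg, environment variables, or the deployment chart — is not shown here):

```ini
[core]
parallelism = 64

[kubernetes_executor]
# Create up to 32 pods per executor batch instead of 16.
worker_pods_creation_batch_size = 32
```

max_active_tasks=32 is set on the DAG itself, and the mediawiki-dumps-legacy-regular pool is resized to 32 slots at runtime (pool slots are adjustable without a redeploy).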

We've been able to dump 64 wikis while running 32 concurrent jobs, pretty much saturating the associated pool, in about 3h.

Screenshot 2025-04-24 at 14.20.29.png (1×2 px, 798 KB)

Screenshot 2025-04-24 at 14.22.03.png (542×2 px, 440 KB)
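For a rough wall-clock comparison of the two runs described above (coarse figures, and the afwiki/anwiki outliers skew the baseline, so treat these as ballpark numbers):

```python
# Baseline run: 32 wikis in ~1h20 with max_active_tasks=16.
baseline_wikis, baseline_hours = 32, 1 + 20 / 60
# Scaled run: 64 wikis in ~3h with max_active_tasks=32.
scaled_wikis, scaled_hours = 64, 3.0

print(round(baseline_wikis / baseline_hours, 1))  # 24.0 wikis/hour
print(round(scaled_wikis / scaled_hours, 1))      # 21.3 wikis/hour
```

Doubling the concurrency roughly doubled the number of wikis handled per run, with wikis-per-hour throughput staying in the same ballpark.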

Change #1138739 had a related patch set uploaded (by Brouberol; author: Brouberol):

[operations/deployment-charts@master] airflow-test-k8s: allow 32 pods to be created in a single executor batch

https://gerrit.wikimedia.org/r/1138739

Thanks to the new quotas, we never even approached resource quota saturation.

Screenshot 2025-04-24 at 14.35.03.png (1×3 px, 810 KB)

Change #1138739 merged by Brouberol:

[operations/deployment-charts@master] airflow-test-k8s: allow 32 pods to be created in a single executor batch

https://gerrit.wikimedia.org/r/1138739

We've been able to increase parallelism to 32 while remaining quite efficient at saturating the pool. I'm considering this task done.