
Adjust dump tasks resources after observing real-life data
Closed, Resolved · Public

Description

We're currently defining two sets of pod resources:

# We allocate larger resources to pods in charge of dumping large wikis
LARGE_WIKI_POD_RESOURCES = k8s_client.V1ResourceRequirements(
    limits={"cpu": "4000m", "memory": "8Gi"}, requests={"cpu": "2000m", "memory": "4Gi"}
)
# Default resources for pods dumping all other wikis
DEFAULT_POD_RESOURCES = k8s_client.V1ResourceRequirements(
    limits={"cpu": "1000m", "memory": "2Gi"}, requests={"cpu": "500m", "memory": "1Gi"}
)

These values were chosen arbitrarily, and should be adjusted once we have been able to observe real dump runs in Grafana.
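A minimal sketch of how the two resource sets might be selected per wiki. The `HUGE_WIKIS` name is taken from the related GitLab changes below, but its members and the plain-dict representation are illustrative assumptions; the actual DAG code builds `k8s_client.V1ResourceRequirements` objects as shown above.

```python
# Illustrative sketch only: the real code uses k8s_client.V1ResourceRequirements
# objects; plain dicts are used here for brevity.
LARGE_WIKI_POD_RESOURCES = {
    "limits": {"cpu": "4000m", "memory": "8Gi"},
    "requests": {"cpu": "2000m", "memory": "4Gi"},
}
DEFAULT_POD_RESOURCES = {
    "limits": {"cpu": "1000m", "memory": "2Gi"},
    "requests": {"cpu": "500m", "memory": "1Gi"},
}

# Hypothetical membership; eswiki was added to the real list in MR !1901.
HUGE_WIKIS = {"enwiki", "wikidatawiki", "commonswiki", "eswiki"}


def pod_resources_for(wiki: str) -> dict:
    """Return the larger resource set for huge wikis, the default otherwise."""
    return LARGE_WIKI_POD_RESOURCES if wiki in HUGE_WIKIS else DEFAULT_POD_RESOURCES
```

Observed usage in Grafana (actual CPU and memory consumption per pod) would then feed back into tuning the two dicts, rather than adding per-wiki values.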

Details

Related Changes in GitLab:
Title | Reference | Author | Source Branch | Dest Branch
Add eswiki to the HUGE_WIKIS list | repos/data-engineering/airflow-dags!1901 | btullis | add_eswiki_to_huge_wikis | main
Dumps_v1: Apply increased KubernetesExecutor resources to sync pods | repos/data-engineering/airflow-dags!1512 | btullis | increase_sync_operator_pod_ram | main
Dumps_v1: Override the resources for all huge wiki dump jobs | repos/data-engineering/airflow-dags!1493 | btullis | bump_huge_dump_job_resources | main
Dumps_v1: Override the pod resources for huge wiki recombine jobs | repos/data-engineering/airflow-dags!1460 | btullis | dumps_override | main

Event Timeline

brouberol triaged this task as Medium priority.