Details
- Other Assignee
- brouberol
| Title | Reference | Author | Source Branch | Dest Branch |
|---|---|---|---|---|
| airflow-analytics: inject the CLASSPATH env variable into the environment | repos/data-engineering/airflow-dags!1040 | brouberol | T380619 | main |
| Status | Subtype | Assigned | Task |
|---|---|---|---|
| Resolved | | brouberol | T362788 Migrate Airflow to the dse-k8s cluster |
| Resolved | | brouberol | T364389 Migrate the airflow scheduler components to Kubernetes |
| Resolved | | brouberol | T380619 Migrate the airflow-analytics scheduler to Kubernetes |
| Resolved | | amastilovic | T386282 Migrate analytics Airflow DAGs to k8s Airflow deployment |
| Resolved | | brouberol | T389172 Decommission airflow-analytics |
Event Timeline
Migration notes here: https://etherpad.wikimedia.org/p/airflow-analytics-migration
We have created a list of all un-paused jobs with:
```
curl -X 'GET' 'http://localhost:8600/api/v1/dags?limit=200&only_active=true&paused=false' -H 'accept: application/json' | jq -r '.dags[].dag_id' > all_unpaused_dags_T380619.txt
curl -X 'GET' 'http://localhost:8600/api/v1/dags?offset=100&only_active=true&paused=false' -H 'accept: application/json' | jq -r '.dags[].dag_id' > all_unpaused_dags_T380619_2.txt
cat all_unpaused_dags_T380619.txt all_unpaused_dags_T380619_2.txt > all_unpaused_dags_T380619_combined.txt
```
For some reason the API wasn't honoring the limit=200 parameter, so I had to fetch the second page with offset=100 and then concatenate the two lists.
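The steps above can be sketched as a generic offset-pagination loop. A likely explanation for limit=200 being ignored is that the Airflow REST API caps page sizes server-side (the maximum_page_size setting defaults to 100), so any larger limit is silently truncated. This is a minimal sketch, assuming the localhost:8600 endpoint from the log; fetch_page stands in for the curl calls above and can be swapped for a fake in testing.

```python
import json
from urllib.request import Request, urlopen

# Assumed endpoint, taken from the curl commands above.
API = "http://localhost:8600/api/v1"

def fetch_page(offset: int, limit: int = 100) -> dict:
    """Fetch one page of active, un-paused DAGs from the Airflow REST API."""
    url = f"{API}/dags?limit={limit}&offset={offset}&only_active=true&paused=false"
    req = Request(url, headers={"accept": "application/json"})
    with urlopen(req) as resp:
        return json.load(resp)

def all_unpaused_dag_ids(fetch=fetch_page, limit: int = 100) -> list[str]:
    """Walk offsets until a short page, collecting every dag_id."""
    dag_ids, offset = [], 0
    while True:
        batch = [d["dag_id"] for d in fetch(offset, limit)["dags"]]
        dag_ids.extend(batch)
        if len(batch) < limit:  # short page means we've reached the end
            return dag_ids
        offset += limit
```

This avoids hard-coding a second request: the loop keeps paging until the server returns fewer rows than the page size.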
I also un-paused the canary events DAG, because we want to keep this going as much as possible.
Change #1113101 had a related patch set uploaded (by Btullis; author: Btullis):
[operations/puppet@production] Temporarily disable gobblin timers on an-launcher1002
Change #1113101 merged by Btullis:
[operations/puppet@production] Temporarily disable gobblin timers on an-launcher1002
Change #1113108 had a related patch set uploaded (by Brouberol; author: Brouberol):
[operations/deployment-charts@master] airflow-analytics: migrate scheduler and database to Kubernetes
Change #1113108 merged by Brouberol:
[operations/deployment-charts@master] airflow-analytics: migrate scheduler and database to Kubernetes
Icinga downtime and Alertmanager silence (ID=9f1d8f2e-4415-45fe-b65f-85692fbd29f5) set by btullis@cumin1002 for 2:00:00 on 1 host(s) and their services with reason: Migrating to kubernetes
an-launcher1002.eqiad.wmnet
Icinga downtime and Alertmanager silence (ID=758003f6-c030-40a2-8737-def8016b0655) set by btullis@cumin1002 for 4:00:00 on 1 host(s) and their services with reason: Migrating to kubernetes
an-launcher1002.eqiad.wmnet
Mentioned in SAL (#wikimedia-analytics) [2025-01-21T13:24:04Z] <btullis> stopped airflow services on an-launcher1002 for T380619
Change #1113136 had a related patch set uploaded (by Brouberol; author: Brouberol):
[operations/deployment-charts@master] airflow-analytics: fix DB cluster size
Change #1113136 merged by jenkins-bot:
[operations/deployment-charts@master] airflow-analytics: fix DB cluster size
Change #1113145 had a related patch set uploaded (by Brouberol; author: Brouberol):
[operations/deployment-charts@master] airflow-analytics: remove import configuration
Change #1113145 merged by Brouberol:
[operations/deployment-charts@master] airflow-analytics: remove import configuration
Change #1113149 had a related patch set uploaded (by Btullis; author: Btullis):
[operations/deployment-charts@master] airflow-analytics: Allow access to the mw-api via service mesh
Change #1113151 had a related patch set uploaded (by Brouberol; author: Brouberol):
[operations/puppet@production] global_config: add the IP of the dyna proxy
Change #1113151 merged by Brouberol:
[operations/puppet@production] global_config: add the IP of the dyna proxy
Change #1113159 had a related patch set uploaded (by Brouberol; author: Brouberol):
[operations/deployment-charts@master] airflow-analytics: allow the egress to ATS for task pods
Change #1113159 merged by Brouberol:
[operations/deployment-charts@master] airflow-analytics: allow the egress to ATS for task pods
brouberol opened https://gitlab.wikimedia.org/repos/data-engineering/airflow-dags/-/merge_requests/1040
Draft: airflow-analytics: inject the CLASSPATH env variable into the environment
Change #1113172 had a related patch set uploaded (by Brouberol; author: Brouberol):
[operations/deployment-charts@master] airflow: add missing airflow.worker.extra-config-volumes
Change #1113172 merged by Brouberol:
[operations/deployment-charts@master] airflow: add missing airflow.worker.extra-config-volumes
brouberol merged https://gitlab.wikimedia.org/repos/data-engineering/airflow-dags/-/merge_requests/1040
airflow-analytics: inject the CLASSPATH env variable into the environment
Change #1113176 had a related patch set uploaded (by Brouberol; author: Brouberol):
[operations/puppet@production] Revert "global_config: add the IP of the dyna proxy"
Change #1113176 merged by Brouberol:
[operations/puppet@production] Revert "global_config: add the IP of the dyna proxy"
Change #1113198 had a related patch set uploaded (by Brouberol; author: Brouberol):
[operations/deployment-charts@master] airflow: DRY extra volume mounts
Follow-up from today's migration conversation.
mw_content_reconcile_mw_content_history_daily and other mediawiki_content DAGs currently hit public endpoints like https://noc.wikimedia.org/conf/dblists/open.dblist to generate their dynamic tasks.
Do we know of an internal equivalent for https://noc.wikimedia.org?
Change #1113198 merged by Brouberol:
[operations/deployment-charts@master] airflow: DRY extra volume mounts
Change #1115855 had a related patch set uploaded (by Brouberol; author: Brouberol):
[operations/deployment-charts@master] airflow: mirror datahub configuration from airflow hosts
Change #1115855 merged by Brouberol:
[operations/deployment-charts@master] airflow: mirror datahub configuration from airflow hosts
I'm removing myself as the assignee of this ticket, as I'll be out on leave for a couple of weeks. Someone else may claim the ticket in the meantime.
Do we know of an internal equivalent for https://noc.wikimedia.org?
Responding for posterity's sake. We have deployed a service mesh envoy proxy pod running alongside airflow. To reach https://noc.wikimedia.org from within Kubernetes, send the request to http://envoy:6509 instead. @amastilovic has defined https://gitlab.wikimedia.org/repos/data-engineering/airflow-dags/-/blob/c2cc935e38759409ce9a87a77ad5ae25222af09f/wmf_airflow_common/util.py#L201 to help us with the URL mapping.
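The URL mapping described above can be sketched roughly as follows. This is an illustrative assumption, not the actual helper (which lives in wmf_airflow_common/util.py, linked above): the MESH_ENDPOINTS table and the choice to drop TLS locally are guesses based on the envoy:6509 listener mentioned in this comment.

```python
from urllib.parse import urlparse, urlunparse

# Assumed mapping from public hostname to the in-cluster envoy listener.
MESH_ENDPOINTS = {
    "noc.wikimedia.org": "envoy:6509",
}

def to_mesh_url(url: str) -> str:
    """Rewrite a public URL to its service-mesh equivalent, if one is known."""
    parts = urlparse(url)
    mesh_host = MESH_ENDPOINTS.get(parts.hostname)
    if mesh_host is None:
        return url  # no mesh listener known; keep the public URL
    # The envoy sidecar handles TLS towards the upstream, so the
    # local hop is plain HTTP.
    return urlunparse(("http", mesh_host, parts.path, parts.params,
                       parts.query, parts.fragment))
```

With this in place, DAGs such as mw_content_reconcile_mw_content_history_daily could wrap their dblist fetches in to_mesh_url() and transparently stay inside the cluster.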
Change #1113149 abandoned by Btullis:
[operations/deployment-charts@master] airflow-analytics: Allow access to the mw-api via service mesh
Reason:
No longer required.