
Migrate the airflow-analytics-product scheduler to Kubernetes
Closed, Resolved (Public)

Event Timeline

Gehel triaged this task as High priority. Nov 25 2024, 1:33 PM
brouberol changed the task status from Open to In Progress. Feb 25 2025, 1:55 PM
brouberol claimed this task.
brouberol@stat1008:~$ s3cmd --access_key=$access_key --secret_key=$secret_key --host=rgw.eqiad.dpe.anycast.wmnet --region=dpe --host-bucket=no mb s3://logs.airflow-analytics-product.dse-k8s-eqiad
Bucket 's3://logs.airflow-analytics-product.dse-k8s-eqiad/' created

Change #1122591 had a related patch set uploaded (by Brouberol; author: Brouberol):

[operations/deployment-charts@master] airflow-analytics-product: migrate the scheduler and the DB to Kubernetes

https://gerrit.wikimedia.org/r/1122591

Change #1122592 had a related patch set uploaded (by Brouberol; author: Brouberol):

[operations/puppet@production] airflow-analytics-product: disable and remove the airflow systemd services

https://gerrit.wikimedia.org/r/1122592

Change #1122591 merged by Brouberol:

[operations/deployment-charts@master] airflow-analytics-product: migrate the scheduler and the DB to Kubernetes

https://gerrit.wikimedia.org/r/1122591

Change #1122944 had a related patch set uploaded (by Brouberol; author: Brouberol):

[operations/deployment-charts@master] airflow-analytics-product: add missing database value

https://gerrit.wikimedia.org/r/1122944

Change #1122944 merged by Brouberol:

[operations/deployment-charts@master] airflow-analytics-product: add missing database value

https://gerrit.wikimedia.org/r/1122944

Change #1122592 merged by Brouberol:

[operations/puppet@production] airflow-analytics-product: disable and remove the airflow systemd services

https://gerrit.wikimedia.org/r/1122592

The flagged_revisions_pending_hourly DAG executed on Kubernetes from start to end without any issue. Given that this is the only non-daily/monthly DAG, I'm going to move this ticket to our Blocked/Waiting column until we get more results tomorrow. Onwards!

Change #1123289 had a related patch set uploaded (by Brouberol; author: Brouberol):

[operations/puppet@production] Define the analytics-web service

https://gerrit.wikimedia.org/r/1123289

Change #1123290 had a related patch set uploaded (by Brouberol; author: Brouberol):

[operations/puppet@production] envoy: add the analytics-web service to the mesh

https://gerrit.wikimedia.org/r/1123290

Change #1123300 had a related patch set uploaded (by Brouberol; author: Brouberol):

[operations/puppet@production] an-web: enable traffic to port 8443 from the dse-k8s kubernetes cluster

https://gerrit.wikimedia.org/r/1123300

Change #1123300 merged by Brouberol:

[operations/puppet@production] an-web: enable traffic to port 8443 from the dse-k8s kubernetes cluster

https://gerrit.wikimedia.org/r/1123300

brouberol opened https://gitlab.wikimedia.org/repos/data-engineering/airflow-dags/-/merge_requests/1092

analytics_product: connect to analytics.wikimedia.org via the service mesh

Change #1123308 had a related patch set uploaded (by Brouberol; author: Brouberol):

[operations/deployment-charts@master] analytics-product: enable traffic to analytics-web listener

https://gerrit.wikimedia.org/r/1123308

Change #1123289 merged by Brouberol:

[operations/puppet@production] Define the analytics-web service

https://gerrit.wikimedia.org/r/1123289

Change #1123290 merged by Brouberol:

[operations/puppet@production] envoy: add the analytics-web service to the mesh

https://gerrit.wikimedia.org/r/1123290

Change #1123308 merged by Brouberol:

[operations/deployment-charts@master] analytics-product: enable traffic to analytics-web listener

https://gerrit.wikimedia.org/r/1123308

The automoderator_monitoring_snapshot_daily DAG failed last night because https://analytics.wikimedia.org was not reachable from the airflow instance. We're solving this issue by making the service reachable through the service mesh.
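For illustration only, here is a minimal sketch of what "connecting via the service mesh" could look like from a task's point of view. It assumes the mesh exposes a local listener that forwards to the analytics-web service and routes on the original hostname. The listener address `http://localhost:6500`, the helper name `via_mesh`, and the `/example/path` URL are all hypothetical; the real values live in the deployment-charts and airflow-dags changes and are not stated in this task.

```python
from urllib.parse import urlsplit, urlunsplit
from urllib.request import Request

# Hypothetical local mesh listener; the actual port is not given in this task.
MESH_LISTENER = "http://localhost:6500"


def via_mesh(url: str, listener: str = MESH_LISTENER) -> Request:
    """Build a request aimed at the local mesh listener instead of the
    public endpoint, keeping the original hostname in the Host header."""
    target = urlsplit(url)
    local = urlsplit(listener)
    # Swap scheme and host for the listener's, keep path and query untouched.
    rewritten = urlunsplit(
        (local.scheme, local.netloc, target.path or "/", target.query, "")
    )
    return Request(rewritten, headers={"Host": target.netloc})


# Only builds the request object; nothing is sent over the network here.
req = via_mesh("https://analytics.wikimedia.org/example/path")
print(req.full_url)            # http://localhost:6500/example/path
print(req.get_header("Host"))  # analytics.wikimedia.org
```

The point of the sketch is that the DAG code keeps addressing the same logical service, while the traffic leaves the pod through a listener that the cluster's network policy allows.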

brouberol merged https://gitlab.wikimedia.org/repos/data-engineering/airflow-dags/-/merge_requests/1092

analytics_product: connect to analytics.wikimedia.org via the service mesh

All daily/weekly/monthly DAGs have run successfully over the last few days (the monthly DAGs ran on March 1st). Let's close!