Create a helm chart for airflow that is appropriate to our needs
Open, High, Public

Description

We are planning to migrate Airflow to the dse-k8s cluster.

As such, we will need a Helm chart for this purpose.

There is an official upstream chart for Airflow available here:
https://airflow.apache.org/docs/helm-chart/stable/index.html

We need to decide whether to use this chart as it is, or to treat it merely as inspiration for a chart of our own.
There is a process to follow if we wish to use the chart as it is: https://wikitech.wikimedia.org/wiki/Kubernetes/Upstream_Helm_charts_policy

Event Timeline

Gehel triaged this task as High priority. May 3 2024, 3:50 PM
Gehel moved this task from Incoming to Scratch on the Data-Platform-SRE board.
Gehel moved this task from Scratch to Quarterly Goals on the Data-Platform-SRE board.

Change #1034951 had a related patch set uploaded (by Bking; author: Bking):

[operations/deployment-charts@master] data-engineering: add scaffolding for airflow service

https://gerrit.wikimedia.org/r/1034951

Change #1034961 had a related patch set uploaded (by Bking; author: Bking):

[operations/puppet@production] dse-k8s: add new airflow service to k8s cluster

https://gerrit.wikimedia.org/r/1034961

Change #1034951 abandoned by Bking:

[operations/deployment-charts@master] data-engineering: add scaffolding for airflow service

Reason:

forgot to add prometheus statsd support

https://gerrit.wikimedia.org/r/1034951

Change #1035013 had a related patch set uploaded (by Bking; author: Bking):

[operations/deployment-charts@master] dse-k8s: Add net-new service scaffolding for airflow

https://gerrit.wikimedia.org/r/1035013

Change #1035015 had a related patch set uploaded (by Bking; author: Bking):

[operations/deployment-charts@master] dse-k8s: add airflow namespace

https://gerrit.wikimedia.org/r/1035015

Change #1034961 merged by Bking:

[operations/puppet@production] dse-k8s: add new airflow service to k8s cluster

https://gerrit.wikimedia.org/r/1034961

Change #1037077 had a related patch set uploaded (by Bking; author: Bking):

[operations/puppet@production] dse-k8s: add new airflow service to k8s cluster

https://gerrit.wikimedia.org/r/1037077

Change #1037077 merged by Bking:

[operations/puppet@production] dse-k8s: add new airflow service to k8s cluster

https://gerrit.wikimedia.org/r/1037077

Change #1035015 merged by Bking:

[operations/deployment-charts@master] dse-k8s: add airflow-analytics-test namespace

https://gerrit.wikimedia.org/r/1035015

Mentioned in SAL (#wikimedia-operations) [2024-06-05T13:45:31Z] <inflatador> bking@an-db1001 install acl pkg T363001

Mentioned in SAL (#wikimedia-operations) [2024-06-05T13:48:56Z] <inflatador> bking@an-db1001 install python3-psycopg2 pkg T363001

Filed a helm chart review request, although we aren't quite ready for review yet.

Change #1039260 had a related patch set uploaded (by Bking; author: Bking):

[operations/puppet@production] an-db1001: add `airflow_test_k8s` user and db

https://gerrit.wikimedia.org/r/1039260

Change #1039260 merged by Bking:

[operations/puppet@production] an-db1001: add `airflow_test_k8s` user and db

https://gerrit.wikimedia.org/r/1039260

Change #1039838 had a related patch set uploaded (by Bking; author: Bking):

[operations/deployment-charts@master] dse-k8s: replace 'airflow-analytics-test' ns with 'airflow'

https://gerrit.wikimedia.org/r/1039838

Change #1039838 merged by jenkins-bot:

[operations/deployment-charts@master] dse-k8s: replace 'airflow-analytics-test' ns with 'airflow'

https://gerrit.wikimedia.org/r/1039838

Change #1035013 abandoned by Bking:

[operations/deployment-charts@master] dse-k8s: Add net-new service scaffolding for airflow

Reason:

starting over and using WMF scaffold

https://gerrit.wikimedia.org/r/1035013

Change #1041759 had a related patch set uploaded (by Bking; author: Bking):

[operations/deployment-charts@master] dse-k8s-services: Add net-new chart for Airflow

https://gerrit.wikimedia.org/r/1041759

bking opened https://gitlab.wikimedia.org/repos/data-engineering/airflow/-/merge_requests/7

blubber: add wmf-certificates deb pkg and work around image build failures

bking merged https://gitlab.wikimedia.org/repos/data-engineering/airflow/-/merge_requests/7

blubber: add wmf-certificates deb pkg and work around image build failures

Change #1043275 had a related patch set uploaded (by Bking; author: Bking):

[operations/deployment-charts@master] dse-k8s: harmonize airflow user/namespace/db names

https://gerrit.wikimedia.org/r/1043275

Change #1043277 had a related patch set uploaded (by Bking; author: Bking):

[operations/puppet@production] dse-k8s: harmonize airflow user/namespace/db names

https://gerrit.wikimedia.org/r/1043277

Change #1043277 merged by Bking:

[operations/puppet@production] dse-k8s: harmonize airflow user/namespace/db names

https://gerrit.wikimedia.org/r/1043277

Change #1043275 merged by jenkins-bot:

[operations/deployment-charts@master] dse-k8s: harmonize airflow user/namespace/db names

https://gerrit.wikimedia.org/r/1043275

Change #1047189 had a related patch set uploaded (by Bking; author: Bking):

[operations/puppet@production] analytics: allow dse-k8s pod network to reach an-db1001

https://gerrit.wikimedia.org/r/1047189

Change #1047189 merged by Bking:

[operations/puppet@production] analytics: allow dse-k8s pod network to reach an-db1001

https://gerrit.wikimedia.org/r/1047189

Just as a quick progress update, we have now succeeded in getting the airflow webserver running for a test instance, using our own airflow image.

image.png (822×1 px, 76 KB)

It's a bit fiddly because the application expects to have write access within its containers, whereas we prefer to have privilege separation so that the airflow user owns the files and the runuser executes the processes.

I'm mounting an emptyDir for the logs directory and I will do the same for the gunicorn pid file, unless I can find a way to disable that.
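
For illustration, here is a rough sketch of the emptyDir approach described above, written as a small Python script that builds the relevant part of a pod spec and prints it as JSON. The mount paths, volume names, container name and image are all assumptions made up for the example; they are not taken from the actual chart or image.

```python
import json

# Hypothetical paths: the real log and pid locations in our airflow image may differ.
LOGS_DIR = "/opt/airflow/logs"
PID_DIR = "/opt/airflow/run"


def writable_scratch(name, mount_path):
    """Return an (emptyDir volume, volumeMount) pair for one writable directory."""
    volume = {"name": name, "emptyDir": {}}
    mount = {"name": name, "mountPath": mount_path}
    return volume, mount


def build_pod_spec(image):
    """Build a minimal pod spec whose only writable paths are emptyDir mounts."""
    pairs = [
        writable_scratch("airflow-logs", LOGS_DIR),  # webserver log output
        writable_scratch("airflow-pid", PID_DIR),    # gunicorn pid file
    ]
    return {
        "containers": [
            {
                "name": "airflow-webserver",  # hypothetical container name
                "image": image,
                "volumeMounts": [mount for _, mount in pairs],
            }
        ],
        "volumes": [volume for volume, _ in pairs],
    }


if __name__ == "__main__":
    # The image name is a placeholder, not our real image reference.
    print(json.dumps(build_pod_spec("example/airflow:latest"), indent=2))
```

The idea is that the files shipped in the image stay owned by the airflow user and are not writable by the user running the processes, while each directory the application insists on writing to gets its own emptyDir. In the chart itself this would presumably be expressed as `volumes` and `volumeMounts` entries in the deployment template rather than generated from Python.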