The goal is:
- fix breaking changes
- avoid deprecation warnings
- add the DataHub Kafka test connection config to Puppet
Notice: When switching between versions, we need to run `airflow db upgrade`.
Status | Subtype | Assigned | Task
---|---|---|---
Open | | Antoine_Quhen | T326193 Airflow upgrade (refactor deb creation + version bump + switch to PostgreSQL)
Resolved | | Stevemunene | T315580 Upgrade Puppet code to make Airflow configuration files compatible with version 2.5.0
Change 827526 had a related patch set uploaded (by Snwachukwu; author: Snwachukwu):
[operations/puppet@production] Update Puppet files for Airflow Upgrade to 2.3.2
We should make sure the latest version of the Airflow deb does not ship this zlib build: zlib 1.2.12 h7f8727e_1. The ..._1 upload is no longer available from the conda repository.
Currently, when cloning the env, we get:
CondaHTTPError: HTTP 404 NOT FOUND for url <https://repo.anaconda.com/pkgs/main/linux-64/zlib-1.2.12-h7f8727e_1.conda>
A temporary fix is to copy the missing file:
    mkdir -p /tmp/aqu2/.conda/envs
    cp -R ~/.conda/envs/airflow_development /tmp/aqu2/.conda/envs/
    chmod -R 777 /tmp/aqu2
    sudo -u analytics-privatedata ./run_dev_instance.sh -m /tmp/aqu2 analytics-test
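To check whether an environment still pins the withdrawn zlib build, one option is to scan the output of `conda list --export`, whose lines have the form `name=version=build`. The sketch below is only an illustration: the function name `find_pinned_build` and the sample text (including `example-package`) are hypothetical, and only the zlib line comes from the report above.

```python
def find_pinned_build(export_text, name, version, build):
    """Return True if a `name=version=build` line is present in the export text."""
    target = (name, version, build)
    for line in export_text.splitlines():
        line = line.strip()
        # Skip blank lines and the comment header at the top of the export.
        if not line or line.startswith("#"):
            continue
        if tuple(line.split("=")) == target:
            return True
    return False


# Hypothetical sample shaped like `conda list --export` output.
sample = """\
# platform: linux-64
example-package=1.0=0
zlib=1.2.12=h7f8727e_1
"""

has_bad_zlib = find_pinned_build(sample, "zlib", "1.2.12", "h7f8727e_1")
```

A check like this could run against the environment that gets packaged into the deb, so that a pinned build that can no longer be downloaded is caught before someone tries to clone the env.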
Hey all - Just a note from the Security-Team and @MoritzMuehlenhoff - Airflow should be bumped to 2.3.4 so as to avoid introducing https://nvd.nist.gov/vuln/detail/CVE-2022-38054.
In addition 2.3.4 will also address CVE-2022-38170: https://www.openwall.com/lists/oss-security/2022/09/02/3
Change 867668 had a related patch set uploaded (by Aqu; author: Aqu):
[operations/puppet@production] Use Airflow 2.4.3 + Postgres in test-cluster
@Stevemunene This should be doable with minimal puppet code changes, I believe only hiera data changes are needed.
Change 878128 had a related patch set uploaded (by Stevemunene; author: Stevemunene):
[operations/puppet@production] Update analytics_text conf compatibility with airflow2.3.4 connect postgresql
Here are the last modifications to add to the airflow configuration in the puppet code.
Configuration changes to airflow.cfg:
    # Rename dag_concurrency to max_active_tasks_per_dag
    # And remove sql_alchemy_conn + load_default_connections
    [core]
    # sql_alchemy_conn = mysql://airflow_data_engineering_dev:password@an-db1001.eqiad.wmnet:5432/airflow_data_engineering_dev
    # load_default_connections = False
    # dag_concurrency = 6
    max_active_tasks_per_dag = 6

    # Move 2 parameters from [core] to [database]
    [database]
    sql_alchemy_conn = postgresql://airflow_data_engineering_dev:password@an-db1001.eqiad.wmnet:5432/airflow_data_engineering_dev
    load_default_connections = False

    # Rename auth_backend to auth_backends with an `s`
    [api]
    #auth_backend = airflow.api.auth.backend.default
    auth_backends = airflow.api.auth.backend.default

    # New block to add
    [datahub]
    enabled = False
    conn_id = datahub_kafka_test
    cluster = test
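The renames and moves listed above can be sketched with Python's standard `configparser`. This is a rough illustration, not the Puppet implementation: `migrate_config` and the sample config (including the `db.example.org` URL) are hypothetical, and the sketch neither changes the database URL scheme nor adds the `[datahub]` block.

```python
import configparser

# (old_section, old_key) -> (new_section, new_key), taken from the notes above.
RENAMES = {
    ("core", "dag_concurrency"): ("core", "max_active_tasks_per_dag"),
    ("core", "sql_alchemy_conn"): ("database", "sql_alchemy_conn"),
    ("core", "load_default_connections"): ("database", "load_default_connections"),
    ("api", "auth_backend"): ("api", "auth_backends"),
}


def migrate_config(text):
    """Return a ConfigParser with the renames above applied to `text`."""
    # interpolation=None keeps any '%' characters in values literal.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    for (old_sec, old_key), (new_sec, new_key) in RENAMES.items():
        if parser.has_option(old_sec, old_key):
            value = parser.get(old_sec, old_key)
            parser.remove_option(old_sec, old_key)
            if not parser.has_section(new_sec):
                parser.add_section(new_sec)
            parser.set(new_sec, new_key, value)
    return parser


# Hypothetical old-style config used only to exercise the function.
old_cfg = """
[core]
dag_concurrency = 6
sql_alchemy_conn = postgresql://user:password@db.example.org:5432/airflow
load_default_connections = False

[api]
auth_backend = airflow.api.auth.backend.default
"""

new_cfg = migrate_config(old_cfg)
```

Keeping the mapping in one table makes it easy to compare against the list of renames when reviewing the change.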
Configuration changes in connections.yml:
    analytics-test-hive:
      conn_type: hive_metastore
      host: analytics-test-hive.eqiad.wmnet
      port: 9083
      extra_dejson:
        # Rename authMechanism to auth_mechanism
        auth_mechanism: GSSAPI
    # Add the following connection
    datahub_kafka_test:
      conn_type: datahub_kafka
      host: kafka-test1006.eqiad.wmnet:9092
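The same two changes can be expressed on the parsed connections data. The sketch below works on a plain dictionary, as one would get after loading connections.yml with a YAML parser; `rename_auth_mechanism` is a hypothetical helper, not part of Airflow or the Puppet code.

```python
def rename_auth_mechanism(connections):
    """Return a copy of `connections` with extra_dejson.authMechanism renamed."""
    migrated = {}
    for conn_id, conn in connections.items():
        conn = dict(conn)  # shallow copy so the input is left untouched
        extra = conn.get("extra_dejson")
        if isinstance(extra, dict) and "authMechanism" in extra:
            extra = dict(extra)
            extra["auth_mechanism"] = extra.pop("authMechanism")
            conn["extra_dejson"] = extra
        migrated[conn_id] = conn
    return migrated


# Old-style entry, using the pre-rename key described above.
old_connections = {
    "analytics-test-hive": {
        "conn_type": "hive_metastore",
        "host": "analytics-test-hive.eqiad.wmnet",
        "port": 9083,
        "extra_dejson": {"authMechanism": "GSSAPI"},
    },
}

new_connections = rename_auth_mechanism(old_connections)
# Add the new DataHub connection described above.
new_connections["datahub_kafka_test"] = {
    "conn_type": "datahub_kafka",
    "host": "kafka-test1006.eqiad.wmnet:9092",
}
```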
Change 887735 had a related patch set uploaded (by Btullis; author: Btullis):
[labs/private@master] Add some dummy tokens for the airflow_test database
Change 887735 merged by Btullis:
[labs/private@master] Add some dummy tokens for the airflow_test database
We were able to recreate the airflow.cfg and connections.yml as described in the comment above. We also got help with password/secrets management for the PostgreSQL connection and resolved it. We are now refactoring the Puppet code to provide this more cleanly, as requested in review.
The Puppet code has been updated to render a configuration compatible with whichever Airflow version is specified. This will be updated again once all instances are on the same upgraded Airflow version.
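The idea of choosing the config layout from the Airflow version can be illustrated in Python. This is not the Puppet code: `config_layout`, the dictionary keys, the `2.1.4` sample version, and the cut-over version `2.3.0` are all assumptions made for the sketch.

```python
def parse_version(version):
    """Turn '2.5.0' into (2, 5, 0) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


# Assumed cut-over version for this sketch; not taken from the Puppet code.
NEW_LAYOUT_FROM = (2, 3, 0)


def config_layout(airflow_version):
    """Return the section and key names to render for the given Airflow version."""
    if parse_version(airflow_version) >= NEW_LAYOUT_FROM:
        return {
            "db_section": "database",
            "concurrency_key": "max_active_tasks_per_dag",
            "auth_key": "auth_backends",
        }
    return {
        "db_section": "core",
        "concurrency_key": "dag_concurrency",
        "auth_key": "auth_backend",
    }


legacy = config_layout("2.1.4")    # hypothetical older instance
upgraded = config_layout("2.5.0")  # target version named in this task
```

Comparing versions as integer tuples avoids the string-comparison trap where "2.10.0" would sort before "2.9.1". Once every instance runs the same version, the legacy branch can be dropped.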
Change 867668 abandoned by Ottomata:
[operations/puppet@production] Use Airflow 2.4.3 + Postgres in test-cluster
Reason:
Work being done in https://gerrit.wikimedia.org/r/c/operations/puppet/+/878128/36..48
Change 878128 merged by Nicolas Fraison:
[operations/puppet@production] Update airflow conf compatibility with airflow 2.5.0 connect postgresql