
Re-enable drop_old_data_daily in airflow-search
Closed, Resolved · Public · 3 Estimated Story Points

Description

The DAG drop_old_data_daily was disabled while migrating the Airflow scheduler to Kubernetes. We should adapt it (probably by updating the --execute hash) and re-enable it.
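The --execute hash mentioned above is presumably a checksum guard: the DAG refuses to run a drop script whose content no longer matches an approved hash, so any change to the script (such as the k8s migration) forces a review. A minimal sketch of that pattern, assuming a SHA-256 guard (the function names here are illustrative, not the actual airflow-dags API):

```python
import hashlib


def script_checksum(script_text: str) -> str:
    """Hypothetical guard: hash the script body so the DAG only executes
    a version of the drop script that was explicitly approved."""
    return hashlib.sha256(script_text.encode("utf-8")).hexdigest()


def is_approved(script_text: str, approved_hash: str) -> bool:
    # If the script changed since approval, the checksums differ and
    # the DAG should refuse to execute it until the hash is updated.
    return script_checksum(script_text) == approved_hash
```

Under this assumption, "updating the --execute hash" simply means recomputing the checksum of the adapted script and recording it in the DAG configuration, which is what the linked merge request appears to do.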

AC:

  • drop_old_data_daily is activated and running
  • improve the script output so that the need to drop external data is not missed when it applies

Details

Related Changes in GitLab:
Title: search: update drop_old_data checksums
Reference: repos/data-engineering/airflow-dags!1160
Author: dcausse
Source Branch: T386097-search-update-drop-old-data-checksums
Dest Branch: main

Event Timeline

Gehel set the point value for this task to 3.

@dcausse I had a look at the tables managed by this DAG, and none of them is declared as EXTERNAL. The DAG (correctly) drops both the partitions from the Hive metastore and the data from HDFS.
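The EXTERNAL check above matters because of how Hive treats the two table types: dropping a partition of a managed table also deletes the underlying files, while for an EXTERNAL table the drop only removes metastore entries, leaving the HDFS data in place. A small sketch of the decision the improved script output should surface (helper name and table-type strings are illustrative; Hive's metastore reports types such as MANAGED_TABLE and EXTERNAL_TABLE):

```python
def needs_separate_hdfs_cleanup(table_type: str) -> bool:
    """Decide whether dropping partitions leaves data behind on HDFS.

    For EXTERNAL Hive tables, DROP PARTITION removes only the metastore
    entry, so the files must be deleted separately. For managed tables,
    the metastore drop also removes the data.
    """
    return table_type.upper().startswith("EXTERNAL")
```

Since none of the tables handled by this DAG is EXTERNAL, no separate HDFS cleanup is needed here, but the script should flag it loudly whenever such a table is added.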