[L] Verify all Airflow DAGs after migration to Kubernetes
Closed, Resolved, Public

Description

T380618: Migrate the airflow-platform-eng database to Kubernetes and T380624: Migrate the airflow-platform-eng scheduler to Kubernetes moved the scheduler and all tasks to Kubernetes.

Tasks

  • replace BashOperators that run hive (e.g., this one) with SparkSqlOperators or SparkSubmitOperators. See also here
  • check every DAG task that sends API requests (action API, EventStreams, etc.), because we might need to add an Envoy listener for the relevant DAGs:
    • cassandra > check_data_and_push_metrics
    • section_topics > fetch_qids_for_all_points_in_time & fetch_qids_for_media_outlets
    • section_titles_denylist > gather_section_titles_denylist
    • check_bad_parsing > check_bad_parsing: verify the task's output, since action API request failures are handled and so would not make the task itself fail
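
One way to locate the BashOperators mentioned in the first item is to scan the DAG sources statically. The snippet below is a minimal sketch, not the approach used in this task: the function name find_hive_bash_operators is hypothetical, and it assumes that bash_command is passed as a plain string literal keyword argument. Commands built with f-strings, variables, or templates loaded from elsewhere are not detected.

```python
import ast
import re


def find_hive_bash_operators(source: str) -> list[tuple[int, str]]:
    """Return (line number, task_id) for every BashOperator(...) call in
    `source` whose literal bash_command mentions the hive CLI.

    Hypothetical helper: only string-literal bash_command values are inspected.
    """
    hits = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Accept both `BashOperator(...)` and `module.BashOperator(...)`.
        name = func.id if isinstance(func, ast.Name) else getattr(func, "attr", None)
        if name != "BashOperator":
            continue
        kwargs = {kw.arg: kw.value for kw in node.keywords if kw.arg}
        command = kwargs.get("bash_command")
        if not (isinstance(command, ast.Constant) and isinstance(command.value, str)):
            continue
        # Match `hive` as a whole word, so e.g. "archive" is not a hit.
        if re.search(r"\bhive\b", command.value):
            task_id = kwargs.get("task_id")
            if isinstance(task_id, ast.Constant) and isinstance(task_id.value, str):
                label = task_id.value
            else:
                label = "<unknown>"
            hits.append((node.lineno, label))
    return sorted(hits)
```

Running it over the text of each DAG file (for example, pathlib.Path(path).read_text()) gives a list of candidate tasks to convert. Each hit still needs a manual check before replacing it with a SparkSqlOperator or SparkSubmitOperator.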

Details

Related Changes in GitLab:
Title | Reference | Author | Source Branch | Dest Branch
Update image suggestions DAGs after migration to Kubernetes | repos/data-engineering/airflow-dags!1215 | mfossati | T390880 | main

Event Timeline

MarkTraceur renamed this task from Verify all Airflow DAGs after migration to Kubernetes to [L] Verify all Airflow DAGs after migration to Kubernetes. (Apr 2 2025, 4:34 PM)
mfossati changed the task status from Open to In Progress. (Apr 3 2025, 9:08 AM)
mfossati claimed this task.
  • ALIS succeeded
  • waiting for a full SLIS run, currently blocked - T385865#10720495.

All successful DAGs seem to have run fine, closing.