
Refine to Hive with Airflow – Post-Migration Cleanup
Closed, Resolved (Public)

Description

After the migration, once the transition period has elapsed (~3 weeks, including the first day of a month), perform these cleanup tasks:

  • Stop the old Refine systemd jobs (refine_event and refine_eventlogging_legacy) and the refine_monitor jobs.
  • Drop the event_systemd database, including its HDFS data.
  • Remove the additional generation of Canary events, now made redundant by the new pipeline. This is left unchanged for now.
  • Drop the diff task in refine_to_hive_hourly (and its HDFS dependencies).
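
The transition-period rule above can be sketched as a small date calculation. This is only an illustration under stated assumptions: the function name, the 21-day default (taken from "~3 weeks"), and the reading that the window must contain a first-of-month falling strictly after the migration day are all hypothetical interpretations, not something the task defines.

```python
from datetime import date, timedelta


def earliest_cleanup_date(migration_day: date, min_days: int = 21) -> date:
    """Return the earliest day on which cleanup could start.

    Assumed interpretation (not stated in the task):
      * at least `min_days` must have passed since the migration, and
      * the window must fully contain a first day of a month that falls
        strictly after the migration day, so cleanup starts the day after it.
    """
    # First day of the month following the migration month.
    if migration_day.month == 12:
        next_first = date(migration_day.year + 1, 1, 1)
    else:
        next_first = date(migration_day.year, migration_day.month + 1, 1)

    after_month_start = next_first + timedelta(days=1)
    after_min_period = migration_day + timedelta(days=min_days)

    # Both conditions must hold, so take the later of the two dates.
    return max(after_month_start, after_min_period)
```

For a migration early in a month, the first-of-month condition is the binding one; for a migration late in a month, the three-week minimum is.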

Details

Related Changes in Gerrit:
Related Changes in GitLab:
Title | Reference | Author | Source Branch | Dest Branch
analytics-test: Refine remove diff | repos/data-engineering/airflow-dags!1568 | aqu | T369845_analytics_test_refine_remove_diff | main
analytics-test: Refine - Remove diff with legacy partition | repos/data-engineering/airflow-dags!1559 | aqu | T369845_refine_analytics_test_post_migration_fixes | main

Event Timeline

Ahoelzl removed Antoine_Quhen as the assignee of this task.
Antoine_Quhen changed the task status from Open to In Progress. Jul 18 2025, 1:01 PM
Antoine_Quhen updated the task description.

Change #1180149 had a related patch set uploaded (by Aqu; author: Aqu):

[operations/puppet@production] analytics: Refine remove systemd job

https://gerrit.wikimedia.org/r/1180149

Change #1180149 merged by Stevemunene:

[operations/puppet@production] analytics: Refine remove systemd job

https://gerrit.wikimedia.org/r/1180149