
[Airflow] Migrate pageview-related Druid loading Oozie jobs
Closed, Resolved | Public | 5 Estimated Story Points

Description

We have 3 Oozie jobs that load pageview data to Druid:

  1. Pageview druid hourly
  2. Pageview druid daily
  3. Pageview druid monthly

We can group them in 1 DAG file with:

  • hourly DAG
  • daily DAG
  • monthly DAG
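
A minimal sketch of how one file could describe all three DAGs. The `DruidLoadSpec` class, the `build_specs` helper, and the `pageview_druid_<granularity>` naming scheme are hypothetical and do not come from the airflow-dags repository; only the `@hourly`, `@daily`, and `@monthly` schedule presets are standard Airflow values. The sketch stops short of constructing Airflow `DAG` objects, so it runs without Airflow installed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DruidLoadSpec:
    """Describes one Druid loading DAG (hypothetical helper type)."""
    granularity: str  # "hourly", "daily" or "monthly"
    schedule: str     # Airflow cron preset for this granularity
    dag_id: str       # assumed naming scheme, not taken from the real repo


def build_specs(prefix: str = "pageview_druid") -> list[DruidLoadSpec]:
    """Return one spec per granularity, in hourly/daily/monthly order."""
    presets = {
        "hourly": "@hourly",
        "daily": "@daily",
        "monthly": "@monthly",
    }
    return [
        DruidLoadSpec(granularity=g, schedule=s, dag_id=f"{prefix}_{g}")
        for g, s in presets.items()
    ]


if __name__ == "__main__":
    for spec in build_specs():
        print(spec.dag_id, spec.schedule)
```

In a real DAG file, each spec would presumably be used to build one Airflow DAG at module level, so that the scheduler discovers all three from the same file.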

Details

Title: Migrate pageview druid loading jobs to airflow
Reference: repos/data-engineering/airflow-dags!365
Author: ebysans
Source Branch: T334104_pageview_druid
Dest Branch: main

Event Timeline

mforns set the point value for this task to 5. Apr 5 2023, 3:22 PM

We should add the referer information to the Druid datasources, as requested in https://phabricator.wikimedia.org/T331028.

Change 910520 had a related patch set uploaded (by Snwachukwu; author: Snwachukwu):

[analytics/refinery@master] Migrate pageview druid load hql queries to Airflow

https://gerrit.wikimedia.org/r/910520

Change 910520 merged by Snwachukwu:

[analytics/refinery@master] Migrate pageview druid load hql queries to Airflow

https://gerrit.wikimedia.org/r/910520