[Migration] Oozie Migration jobs for Pageviews
Closed, Resolved | Public | 5 Estimated Story Points

Description

Migrate Oozie pageview hourly job to Airflow

I think we can leave the historical jobs behind, as they hopefully should not be needed. Can you confirm, @Milimetric and @mforns? -- Confirmed ~~~~

Event Timeline

JArguello-WMF renamed this task from [Migration] Ozzie Migration jobs for Pageviews to [Migration] Oozie Migration jobs for Pageviews. Dec 5 2022, 4:56 PM
JArguello-WMF set the point value for this task to 5.

Let's make sure we triple-check when vetting the data :]

And take time to validate that the UDFs we use work well in Spark's multi-threaded context!
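One rough way to approach that kind of validation is to compare a function's single-threaded results against results produced while many threads call it at once. The sketch below is only an illustration in Python: `check_thread_safety` and `normalize_project` are hypothetical names, not refinery code, and the real UDFs run on the JVM, where the same idea would need an equivalent harness with real threads hitting a shared UDF instance.

```python
from concurrent.futures import ThreadPoolExecutor


def check_thread_safety(udf, inputs, workers=8, repeats=50):
    """Return the inputs whose concurrent result differs from the
    single-threaded baseline. An empty list means no mismatch was seen,
    which is evidence of thread safety, not proof."""
    # Baseline: call the function once per input on a single thread.
    expected = [udf(x) for x in inputs]
    mismatches = set()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for _ in range(repeats):
            # pool.map preserves input order, so results line up with `expected`.
            actual = list(pool.map(udf, inputs))
            for x, want, got in zip(inputs, expected, actual):
                if want != got:
                    mismatches.add(x)
    return sorted(mismatches)


def normalize_project(hostname):
    # Hypothetical stand-in for a real UDF: a pure function with no shared state.
    return hostname.strip().lower()


if __name__ == "__main__":
    sample = ["EN.wikipedia.org ", "de.wikipedia.org", " Commons.wikimedia.org"]
    print(check_thread_safety(normalize_project, sample))
```

A function that keeps mutable state between calls (for example a shared parser or cache) is the kind of thing this check is meant to surface; a clean run does not rule out rarer races.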

I'm working on a merge request for this, testing the jobs (it's going slow 'cause I'm on ops week)

Change 887869 had a related patch set uploaded (by Milimetric; author: Milimetric):

[analytics/refinery@master] Migrate pageview_hourly and related jobs

https://gerrit.wikimedia.org/r/887869

I'm putting this in review, but there are three jobs being migrated so I'll send them in separate patches.

Change 889532 had a related patch set uploaded (by Milimetric; author: Milimetric):

[analytics/refinery@master] Migrate pageview dumps jobs

https://gerrit.wikimedia.org/r/889532

Change 887869 merged by Milimetric:

[analytics/refinery@master] Migrate pageview_hourly and related jobs

https://gerrit.wikimedia.org/r/887869