
Update the new_editors table with an Airflow job
Closed, Resolved | Public

Description

Among the movement_metric job queries, one inserts data into wmf_product.new_editors. We should migrate it to an Airflow job.

  • Write updated SQL queries for table creation and updating
  • Confirm new query output matches old
  • Write Airflow job
  • Test job on analytics client
  • Write tests
  • Get job merged
  • Deploy job
  • Create new table (wmf_contributors.new_editor)
  • Unpause job
  • Backfill table using a duplicate-free snapshot of mediawiki_history (T369851)
  • Update metric queries to use new table
  • Document new table in DataHub
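
The "Confirm new query output matches old" and "duplicate-free snapshot" steps above could be checked with something like the following. This is only a minimal sketch: it assumes both query results have already been fetched into lists of rows, and the function names, column layout (wiki, month, count) and sample values are hypothetical, not taken from the actual queries.

```python
from collections import Counter


def compare_query_outputs(old_rows, new_rows):
    """Compare two query results as multisets of rows, ignoring row order.

    Returns (only_in_old, only_in_new). Each is a Counter mapping a row
    tuple to how many more times it appears on that side.
    """
    old_counts = Counter(map(tuple, old_rows))
    new_counts = Counter(map(tuple, new_rows))
    return old_counts - new_counts, new_counts - old_counts


def find_duplicates(rows):
    """Return the rows that appear more than once, with their counts."""
    counts = Counter(map(tuple, rows))
    return {row: n for row, n in counts.items() if n > 1}


# Hypothetical sample rows: (wiki, month, new_editor_count)
old_output = [("enwiki", "2024-06", 120), ("dewiki", "2024-06", 45)]
new_output = [("dewiki", "2024-06", 45), ("enwiki", "2024-06", 121)]

only_in_old, only_in_new = compare_query_outputs(old_output, new_output)
duplicates = find_duplicates(new_output)
```

Two empty counters mean the outputs match exactly. Comparing as multisets rather than sets means a row duplicated on one side only is still reported as a difference.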

Details

Related Changes in GitLab:
Title | Reference | Author | Source Branch | Dest Branch
analytics_product: Add new_editor job | repos/data-engineering/airflow-dags!764 | nshahquinn-wmf | work/nshahquinn-wmf/new_editor | main
new_editor: Add create and update commands | repos/movement-insights/sql!5 | nshahquinn-wmf | work/nshahquinn-wmf/new_editor | main

Event Timeline

nshahquinn-wmf triaged this task as High priority.
nshahquinn-wmf moved this task from Incoming to Backlog on the Movement-Insights board.
nshahquinn-wmf lowered the priority of this task from High to Medium. (Jul 16 2024, 6:45 PM)

We will update the metric queries as part of T371651.