ETL pipeline for unpatrolled recentchanges daily activity
Closed, Resolved (Public)

Description

Baseline: https://nbviewer.org/urls/gitlab.wikimedia.org/kcvelaga/automoderator-measurement/-/raw/main/baselines/T348863_content_moderation_backlogs_rchanges.ipynb

Draft table schema

`wiki_db` string COMMENT 'wiki db name',
`date` date COMMENT 'The partition date for which the metric is computed.',
`rc_date` date COMMENT 'The date for which the recentchanges status is aggregated, usually 15 days prior to the partition date.',
`is_ns0` boolean COMMENT 'Indicates whether the change is in namespace zero or not.',
`is_page_creation` boolean COMMENT 'Indicates whether the revision resulted in page creation.',
`patrol_status` string COMMENT 'Patrol status of the change, possible values: autopatrolled, patrolled, unpatrolled.',
`n_revisions` int COMMENT 'Number of revisions for the given dimensions.'
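
A minimal sketch of how rows matching this draft schema could be derived, using Python's built-in sqlite3 and made-up data. The source columns (rc_timestamp, rc_namespace, rc_source, rc_patrolled) follow MediaWiki's recentchanges table. The mapping of rc_patrolled values to status labels, the use of rc_source = 'mw.new' for page creation, the restriction to edits and page creations, and the `aggregate` helper itself are assumptions for illustration, not the pipeline's actual code.

```python
import sqlite3
from datetime import date, timedelta

# Toy stand-in for MediaWiki's recentchanges table; all rows are made up.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE recentchanges ("
    "rc_timestamp TEXT, rc_namespace INTEGER, rc_source TEXT, rc_patrolled INTEGER)"
)
conn.executemany(
    "INSERT INTO recentchanges VALUES (?, ?, ?, ?)",
    [
        ("20240601101500", 0, "mw.edit", 0),
        ("20240601113000", 0, "mw.edit", 0),
        ("20240601120000", 0, "mw.new", 2),
        ("20240601130000", 4, "mw.edit", 1),
        ("20240602090000", 0, "mw.edit", 0),  # other day, filtered out below
    ],
)

# Assumed mapping: rc_patrolled 0 = unpatrolled, 1 = patrolled, 2 = autopatrolled.
AGGREGATE_SQL = """
SELECT
    :wiki_db AS wiki_db,
    :partition_date AS date,
    :rc_date AS rc_date,
    rc_namespace = 0 AS is_ns0,
    rc_source = 'mw.new' AS is_page_creation,
    CASE rc_patrolled
        WHEN 2 THEN 'autopatrolled'
        WHEN 1 THEN 'patrolled'
        ELSE 'unpatrolled'
    END AS patrol_status,
    COUNT(*) AS n_revisions
FROM recentchanges
WHERE substr(rc_timestamp, 1, 8) = replace(:rc_date, '-', '')
  AND rc_source IN ('mw.edit', 'mw.new')
GROUP BY is_ns0, is_page_creation, patrol_status
ORDER BY is_ns0, is_page_creation, patrol_status
"""


def aggregate(conn, wiki_db, partition_date, lag_days=15):
    """Aggregate changes made lag_days before partition_date (hypothetical helper)."""
    rc_date = partition_date - timedelta(days=lag_days)
    params = {
        "wiki_db": wiki_db,
        "partition_date": partition_date.isoformat(),
        "rc_date": rc_date.isoformat(),
    }
    return conn.execute(AGGREGATE_SQL, params).fetchall()


rows = aggregate(conn, "testwiki", date(2024, 6, 16))
for row in rows:
    print(row)
```

SQLite represents booleans as 0/1, so `is_ns0` and `is_page_creation` come back as integers here; in Hive or Spark the same expressions would yield true booleans.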

Event Timeline

KCVelaga_WMF triaged this task as Medium priority.
KCVelaga_WMF renamed this task from ETL pipeline for recentchanges daily activity to ETL pipeline for upatrolled recentchanges daily activity. Jun 25 2024, 5:08 AM
KCVelaga_WMF updated the task description.

The pipeline is running, and the data is available at `wmf_product.moderation_unpatrolled_recentchanges_daily`.
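
As a rough illustration of how the table might be queried, the sketch below computes the unpatrolled share of namespace-zero revisions for one partition. It runs against an in-memory SQLite database attached under the name `wmf_product` so that the query text can reference the real table name. The rows are invented, and the column layout is assumed to match the draft schema in the description. On Hive or Spark the `date` column may need backtick quoting.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Attach a second in-memory database so the schema-qualified name resolves.
conn.execute("ATTACH DATABASE ':memory:' AS wmf_product")
conn.execute(
    "CREATE TABLE wmf_product.moderation_unpatrolled_recentchanges_daily ("
    "wiki_db TEXT, date TEXT, rc_date TEXT, is_ns0 INTEGER, "
    "is_page_creation INTEGER, patrol_status TEXT, n_revisions INTEGER)"
)
# Invented sample rows, for illustration only.
conn.executemany(
    "INSERT INTO wmf_product.moderation_unpatrolled_recentchanges_daily "
    "VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        ("testwiki", "2024-06-16", "2024-06-01", 1, 0, "unpatrolled", 30),
        ("testwiki", "2024-06-16", "2024-06-01", 1, 0, "patrolled", 50),
        ("testwiki", "2024-06-16", "2024-06-01", 1, 1, "autopatrolled", 20),
        ("testwiki", "2024-06-16", "2024-06-01", 0, 0, "unpatrolled", 7),
    ],
)

SHARE_SQL = """
SELECT
    wiki_db,
    rc_date,
    SUM(CASE WHEN patrol_status = 'unpatrolled' THEN n_revisions ELSE 0 END)
        AS n_unpatrolled,
    SUM(n_revisions) AS n_total
FROM wmf_product.moderation_unpatrolled_recentchanges_daily
WHERE date = :partition_date
  AND is_ns0 = 1
GROUP BY wiki_db, rc_date
ORDER BY wiki_db, rc_date
"""

result = conn.execute(SHARE_SQL, {"partition_date": "2024-06-16"}).fetchall()
for wiki_db, rc_date, n_unpatrolled, n_total in result:
    share = n_unpatrolled / n_total
    print(f"{wiki_db} {rc_date}: {n_unpatrolled}/{n_total} unpatrolled ({share:.0%})")
```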

KCVelaga_WMF renamed this task from ETL pipeline for upatrolled recentchanges daily activity to ETL pipeline for unpatrolled recentchanges daily activity. Sep 13 2024, 7:00 AM