To compute per-editor pageview metrics, we need a Data Lake table, updated daily, that lets us look up the pages a user edited on or before a given date.
See the description of parent task T405039 (Global Editor Metrics - Data Pipeline) for more detail and options.
This table can be backfilled from mediawiki_history (after T365648) and then updated daily from mediawiki_content_history_v1 (via T406515).
Done is:
- A Hive table exists that, given a user_central_id and a date, can look up the list of (wiki_id, page_id) pairs that the user has edited on or before that date.
- Hive table is updated daily.
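
To make the lookup semantics above concrete, here is a minimal in-memory sketch in Python. It is not the pipeline itself and does not use Hive or Spark. It assumes one possible layout, which this task does not specify: one row per (user_central_id, wiki_id, page_id) storing the first date the user edited that page. The names merge_daily_edits and pages_edited_on_or_before are hypothetical.

```python
from datetime import date

# Hypothetical layout: (user_central_id, wiki_id, page_id) -> first edit date.
# The real table schema is not defined in this task; this is only an assumption.


def merge_daily_edits(table, daily_edits):
    """Fold a batch of (user_central_id, wiki_id, page_id, day) edits into the table.

    Only the earliest edit date per key is kept, so re-running a day
    or backfilling older history gives the same result.
    """
    for user, wiki, page, day in daily_edits:
        key = (user, wiki, page)
        if key not in table or day < table[key]:
            table[key] = day
    return table


def pages_edited_on_or_before(table, user_central_id, as_of):
    """Return sorted (wiki_id, page_id) pairs the user first edited on or before as_of."""
    return sorted(
        (wiki, page)
        for (user, wiki, page), first_day in table.items()
        if user == user_central_id and first_day <= as_of
    )


table = {}
# Backfill-style batch.
merge_daily_edits(table, [
    (1, "enwiki", 10, date(2025, 1, 1)),
    (1, "dewiki", 20, date(2025, 1, 3)),
    (2, "enwiki", 10, date(2025, 1, 2)),
])
# A later daily batch: a repeat edit does not move the first edit date.
merge_daily_edits(table, [(1, "enwiki", 10, date(2025, 1, 4))])

print(pages_edited_on_or_before(table, 1, date(2025, 1, 2)))
print(pages_edited_on_or_before(table, 1, date(2025, 1, 3)))
```

Storing only the first edit date per page keeps the table to one row per user and page, and "on or before" becomes a simple date filter. Whether the real table should use this layout or daily snapshots is one of the options left to the parent task.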