
Productionize item_page_link table
Closed, Resolved, Public

Description

The item_page_link table is currently built by hand from three sources: Wikidata sitelinks (JSON dumps), the internationalized namespace reference (wmf_raw.project_namespace_map), and page history (wmf.mediawiki_page_history).
Let's oozify that job (schedule it with Oozie), as the data has proven useful to the Research team and others.
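
To illustrate how the three sources could fit together, here is a minimal Python sketch on toy data. It is not the actual Spark job: the field names (`item_id`, `wiki_db`, `namespace_id`, `page_id`), the helper functions, and the sample rows are all hypothetical and do not reflect the real table schemas. It assumes a sitelink title carries a localized namespace prefix, that the prefix is resolved to a namespace id through the namespace map, and that the result is then matched against page records.

```python
# Toy sketch only: field names, helpers, and sample rows are hypothetical,
# not the real schemas of the Wikidata dumps or the wmf tables.

def split_namespace(title, localized_namespaces):
    """Split 'Prefix:Rest' into (namespace_id, rest).

    Falls back to the main namespace (id 0) when the title has no known prefix.
    """
    prefix, sep, rest = title.partition(":")
    if sep and prefix in localized_namespaces:
        return localized_namespaces[prefix], rest
    return 0, title


def build_item_page_links(sitelinks, namespace_map, pages):
    """Join sitelinks to page ids on (wiki, namespace id, title)."""
    page_index = {
        (p["wiki_db"], p["namespace_id"], p["title"]): p["page_id"] for p in pages
    }
    links = []
    for link in sitelinks:
        wiki = link["wiki_db"]
        ns_id, title = split_namespace(link["title"], namespace_map.get(wiki, {}))
        page_id = page_index.get((wiki, ns_id, title))
        if page_id is not None:  # sitelinks without a matching page are dropped
            links.append(
                {"item_id": link["item_id"], "wiki_db": wiki, "page_id": page_id}
            )
    return links


# Made-up sample rows standing in for the three sources.
sitelinks = [
    {"item_id": "Q1001", "wiki_db": "enwiki", "title": "Example"},
    {"item_id": "Q1002", "wiki_db": "frwiki", "title": "Catégorie:Exemple"},
    {"item_id": "Q1003", "wiki_db": "dewiki", "title": "Fehlt"},
]
namespace_map = {"enwiki": {"Category": 14}, "frwiki": {"Catégorie": 14}}
pages = [
    {"wiki_db": "enwiki", "namespace_id": 0, "title": "Example", "page_id": 11},
    {"wiki_db": "frwiki", "namespace_id": 14, "title": "Exemple", "page_id": 22},
]

links = build_item_page_links(sitelinks, namespace_map, pages)
for row in links:
    print(row)
```

The sketch ignores details a real job would have to handle, such as title normalization and choosing which snapshot of the page history to match against.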

Event Timeline

Change 572834 had a related patch set uploaded (by Joal; owner: Joal):
[analytics/refinery@master] Add wikidata item_page_link oozie job

https://gerrit.wikimedia.org/r/572834

Change 572746 had a related patch set uploaded (by Joal; owner: Joal):
[analytics/refinery/source@master] Add wikidata item_page_link spark job

https://gerrit.wikimedia.org/r/572746

Change 572834 merged by Milimetric:
[analytics/refinery@master] Add wikidata item_page_link oozie job

https://gerrit.wikimedia.org/r/572834

Change 572746 merged by jenkins-bot:
[analytics/refinery/source@master] Add wikidata item_page_link spark job

https://gerrit.wikimedia.org/r/572746

I think we need docs that point to all the Wikidata info available on the cluster. Let's at least create the docs for this table. cc @JAllemandou