
Make history and current wikitext available in hadoop
Closed, Resolved | Public | 5 Estimated Story Points

Description

  • Add an Oozie job for pages_meta_current (the dump is imported but not yet converted to Avro; the job is a copy of the pages_meta_history one)
  • Update the docs (add the pages_meta_current and siteinfo datasets, plus examples of how to use them)

Event Timeline

mforns triaged this task as Medium priority. Nov 25 2019, 5:12 PM
mforns moved this task from Incoming to Smart Tools for Better Data on the Analytics board.

Change 565558 had a related patch set uploaded (by Joal; owner: Joal):
[analytics/refinery@master] Update wikitext oozie job adding current

https://gerrit.wikimedia.org/r/565558

Change 565554 had a related patch set uploaded (by Joal; owner: Joal):
[analytics/refinery/source@master] Enforce distinct revision in xml-dumps converter

https://gerrit.wikimedia.org/r/565554

Change 565554 merged by jenkins-bot:
[analytics/refinery/source@master] Enforce distinct revision in xml-dumps converter

https://gerrit.wikimedia.org/r/565554

JAllemandou updated the task description.
JAllemandou set the point value for this task to 5.

Change 565558 merged by Joal:
[analytics/refinery@master] Update wikitext oozie job adding current

https://gerrit.wikimedia.org/r/565558

JAllemandou renamed this task from "Update wikitext-processing on hadoop various aspects" to "Make history and current wikitext available in hadoop". Feb 5 2020, 8:34 PM