- Add oozie job for pages_meta_current (dump imported but not converted to avro - copy of the job from pages_meta_history)
- Bump docs (add pages_meta_current and siteinfo datasets, and add examples on how to use)
Description
Details
Status | Subtype | Assigned | Task
---|---|---|---
Resolved | | Milimetric | T186559 Provide data dumps in the Analytics Data Lake
Resolved | | JAllemandou | T238858 Make history and current wikitext available in hadoop
Event Timeline
Change 565558 had a related patch set uploaded (by Joal; owner: Joal):
[analytics/refinery@master] Update wikitext oozie job adding current
Change 565554 had a related patch set uploaded (by Joal; owner: Joal):
[analytics/refinery/source@master] Enforce distinct revision in xml-dumps converter
Change 565554 merged by jenkins-bot:
[analytics/refinery/source@master] Enforce distinct revision in xml-dumps converter
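The merged change above makes the converter emit each revision only once. The real converter lives in analytics/refinery-source (Spark/Scala); as an illustration only, here is a minimal Python sketch of the idea, assuming revision records are dicts with an `id` field (hypothetical names, not the actual refinery API):

```python
# Hypothetical sketch: keep one record per revision id when converting
# XML-dump records, so duplicates (e.g. from overlapping dump files)
# do not reach the avro output. Illustrative only, not the real code.
from typing import Dict, Iterable, List


def distinct_revisions(revisions: Iterable[Dict]) -> List[Dict]:
    """Return the input records with only the first occurrence of each id."""
    seen = set()
    out = []
    for rev in revisions:
        if rev["id"] not in seen:
            seen.add(rev["id"])
            out.append(rev)
    return out


# Revision 2 appears twice in the input; only one copy survives.
records = [
    {"id": 1, "text": "first revision"},
    {"id": 2, "text": "second revision"},
    {"id": 2, "text": "second revision"},
]
deduped = distinct_revisions(records)
print(len(deduped))  # → 2
```

In the actual Spark job this would be a distinct/deduplication step over the revision dataset keyed by revision id, but the effect is the same: one row per revision.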
Change 565558 merged by Joal:
[analytics/refinery@master] Update wikitext oozie job adding current
It was done a few days ago :)
https://wikitech.wikimedia.org/wiki/Analytics/Data_Lake/Content/XMLDumps/Mediawiki_wikitext_current exists as well.