Siteinfo dumps contain, for each wiki, its magic-word aliases, i.e. the localized keywords of wikitext (for instance, REDIRECTION is the French alias of REDIRECT). Importing these files, and possibly transforming them into an easier-to-query structure (Parquet-based), is a necessary step toward productionizing historical redirect extraction.
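As an illustration of the "easier-to-query structure" idea, here is a minimal sketch that flattens magic-word aliases into one row per (wiki, magic word, alias). It assumes the siteinfo file follows the MediaWiki siteinfo API layout (`query.magicwords`, each entry with `name` and `aliases`); the sample data, the function names and the `wiki_db` column are hypothetical, and this is not the refinery importer's actual code.

```python
import json

# Hypothetical sample shaped like MediaWiki siteinfo API output
# (siprop=magicwords). The real dump layout is an assumption here.
SAMPLE_SITEINFO = json.dumps({
    "query": {
        "magicwords": [
            {"name": "redirect", "aliases": ["#REDIRECTION", "#REDIRECT"]},
            {"name": "notoc", "aliases": ["__NOTOC__"]},
        ]
    }
})


def flatten_magic_words(wiki_db, siteinfo_json):
    """Return one flat row per (wiki, magic word, alias)."""
    data = json.loads(siteinfo_json)
    magic_words = data.get("query", {}).get("magicwords", [])
    rows = []
    for word in magic_words:
        for alias in word.get("aliases", []):
            rows.append({
                "wiki_db": wiki_db,          # hypothetical column name
                "magic_word": word["name"],
                "alias": alias,
            })
    return rows


def redirect_aliases(rows, wiki_db):
    """Return the aliases of the 'redirect' magic word for one wiki."""
    return [
        r["alias"]
        for r in rows
        if r["wiki_db"] == wiki_db and r["magic_word"] == "redirect"
    ]


rows = flatten_magic_words("frwiki", SAMPLE_SITEINFO)
print(redirect_aliases(rows, "frwiki"))
```

Rows in this flat shape could then be written to a columnar format such as Parquet, so that looking up the redirect keywords of a wiki becomes a simple filter rather than a nested JSON traversal.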
Event Timeline
Change 540124 had a related patch set uploaded (by Joal; owner: Joal):
[analytics/refinery@master] Add site-info dump type to importer
Change 540124 merged by Nuria:
[analytics/refinery@master] Add site-info dump type to importer
Change 546966 had a related patch set uploaded (by Joal; owner: Joal):
[operations/puppet@production] [WIP] Refactor profile::analytics::refinery::job::import_mediawiki_dumps
Change 547169 had a related patch set uploaded (by Joal; owner: Joal):
[analytics/refinery@master] Update oozie datasets to match dumps import change
Change 546966 merged by Elukey:
[operations/puppet@production] Refactor profile::analytics::refinery::job::import_mediawiki_dumps
Change 547169 merged by Joal:
[analytics/refinery@master] Update oozie datasets to match dumps import change