
Clickstream dataset documentation should be extracted from Research page
Open, Needs Triage, Public

Description

Right now, the only documentation for the analytics clickstream dataset is https://meta.wikimedia.org/wiki/Research:Wikipedia_clickstream. The page at https://dumps.wikimedia.org/other/clickstream/readme.html points to that Meta page and to a Figshare page that seems outdated: https://figshare.com/articles/dataset/Wikipedia_Clickstream/1305770.

The Meta page links to https://github.com/ewulczyn/wiki-clickstream, which has no additional documentation and also appears outdated.

To make it clear that this is a maintained and current data source, the dataset documentation should be extracted from the historical record of the Research project that created it. It should be discoverable alongside the documentation for similar analytics datasets, which is currently on Wikitech.