Right now, it's hard to figure out how to access the various sources of raw data (public and private) about Wikimedia projects. We should change that.
= Current documentation =
* [meta:Research:Data](https://meta.wikimedia.org/wiki/Research:Data)
  * Main entry point for public data, but very out of date.
* [wikitech:Analytics/Data access](https://wikitech.wikimedia.org/wiki/Analytics/Data_access)
  * Main entry point for internal/private/production cluster data.
* Guides written by particular teams:
  * [meta:Discovery/Analytics](https://meta.wikimedia.org/wiki/Discovery/Analytics)
  * [Wikimedia DE analyst document](https://docs.google.com/document/d/1lSP4aamtkv1XI5euC1NAGWeci9M39ZiXkkmB9vCwmUE/edit)
  * Readers team docs about data access (focusing on [Hive queries](https://docs.google.com/document/d/1vjiobQ9kPP2Pez021GG3TYurkSYiJR6p5JZVmB-ABlE/edit) and [EventLogging queries](https://docs.google.com/document/d/1RbTZONK3Pf8e9tUmdGd0iTXX9-U9Kh7tqenAR0xbQdY/edit#heading=h.hzqi44p7vgl4))
* Some more inspiration: [mw:Wikimedia Discovery/Team/Analyst onboarding](https://www.mediawiki.org/wiki/Wikimedia_Discovery/Team/Analyst_onboarding)
* //others?//
= Proposed structure =
* [meta:Research:Data](https://meta.wikimedia.org/wiki/Research:Data)
  * Continues as the main entry point for public data, with a pointer to the private data entry point.
* [meta:Research:Private data](https://meta.wikimedia.org/wiki/Research:Private_data)
  * New main entry point for private data. Content moved here from [wikitech:Analytics/Data access](https://wikitech.wikimedia.org/wiki/Analytics/Data_access). Explains what you might use private data for, how you would get access, and why that's hard to do.
* //Should the main organizing principle be the topic of the data (e.g. editing patterns or article content) or the access method (e.g. the API or the dumps)?//