Write a document, or revise https://meta.wikimedia.org/wiki/Research:Data, to cover the following concepts:
- Introduce the major concepts data users should understand when getting started with Wikimedia open data. Define and disambiguate "content" vs "data". Explain the difference between wiki dumps and wiki replicas.
- Explain the different types of data we publish:
  - Event data (edits…)
  - Analytics / "Traffic" data
    - Pageviews
    - Unique Devices
    - Clickstream data
  - Revision and user history
  - Data by country
  - Wikidata QRank
  - Other data
    - ORES scores? Lift Wing ML outputs?
- Map the types of content and data that people often care about to the datasets where they live (example-based disambiguation of content vs data, plus a decision tree for next steps)
- Provide (or link to a page that provides) clear navigation to data sources and access methods. https://meta.wikimedia.org/wiki/Research:Data currently does this part pretty well.
Creating this content page will enable further edits and streamlining/simplification (reduce content duplication!) on pages like:
https://meta.wikimedia.org/wiki/Data_dumps and https://dumps.wikimedia.org/.