Write a document, or revise https://meta.wikimedia.org/wiki/Research:Data, to cover the following concepts:
1. Introduce the major concepts data users should understand when getting started with Wikimedia open data. Define and disambiguate "content" vs "data". Explain the difference between wiki dumps and wiki replicas.
2. Explain the different types of data we publish:
- Event data (edits…)
- Analytics / "Traffic" data
-- Pageviews
-- Unique Devices
-- Clickstream data
-- Revision and user history
-- Data by country
-- Wikidata QRank
- [[ https://dumps.wikimedia.org/other/ | Other data ]]
- ORES scores? Lift Wing ML outputs?
3. Map the types of content and data that people often care about to the datasets where they live (example-based disambiguation of content vs data, plus a decision tree for next steps).
4. Provide (or link to a page that provides) clear navigation to data sources and access methods. https://meta.wikimedia.org/wiki/Research:Data currently does this part pretty well.
Creating this content page will enable further edits to streamline and simplify pages (and reduce content duplication!) such as:
https://meta.wikimedia.org/wiki/Data_dumps and https://dumps.wikimedia.org/.