It's common and useful to organize Wikimedia data into high-level categories. Here are some examples.
@TBurmeister's public data introduction uses:
- Traffic and readership
- Content
- Contributions and contributors
The database breakdown for Iceberg tables, which started with a clear plan but has also seen some organic expansion, uses:
- Contributors
- MediaWiki
- Readership
- Traffic
- Wikidata
- Product
- Dumps (see T347611)
- Data Ops
The RDS data glossary uses:
- Content
- Reader
- Contributor
- Diversity
The dataset documentation pages on Wikitech use:
- Content
- Edits
- Events
- Traffic
There's a lot of similarity, but also a surprising amount of diversity, both in the choice of domains and in the terminology for the same domain (for example, is it "contributions", "editing", or "edits"?). Clearly, things would be a lot simpler if we sat down and agreed on a standard set.
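
To make the overlap concrete, here is a rough sketch that counts how many of the four schemes above use each label verbatim. The scheme names and the `usage` and `shared` variables are just illustrative, and the comparison is deliberately naive: it only ignores case, so "Contributor" and "Contributors" count as different labels.

```python
from collections import Counter

# The four category schemes listed above, copied as written.
# The dictionary keys are informal names chosen for this sketch.
schemes = {
    "public data introduction": [
        "Traffic and readership", "Content", "Contributions and contributors",
    ],
    "Iceberg databases": [
        "Contributors", "MediaWiki", "Readership", "Traffic",
        "Wikidata", "Product", "Dumps", "Data Ops",
    ],
    "RDS data glossary": ["Content", "Reader", "Contributor", "Diversity"],
    "Wikitech dataset documentation": ["Content", "Edits", "Events", "Traffic"],
}

# Count how many schemes use each label (case-insensitive exact match).
usage = Counter(
    label.lower() for labels in schemes.values() for label in set(labels)
)

# Labels that appear in more than one scheme.
shared = {label: n for label, n in usage.items() if n > 1}

for label, n in sorted(shared.items()):
    print(f"{label}: used by {n} of {len(schemes)} schemes")
```

With exact matching, only a couple of labels are shared at all, and none is used by every scheme. Most of the apparent similarity is between near-synonyms, which is the terminology problem described above.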