https://dumps.wikimedia.org/ and https://meta.wikimedia.org/wiki/Data_dumps seem to be the primary landing pages for Dumps as a product/service. These pages could be improved to be more effective and user-friendly.
Without having done an in-depth review, the major opportunities I see here are:
- Choose one page/platform to serve as the landing page for dumps, and make that landing page follow best practices like providing task-based navigation, organizing content into easily consumable units, and using progressive disclosure to guide users to more detailed information. https://meta.wikimedia.org/wiki/Data_dumps links to a lot of great information, but it is overwhelming and should be audited to make sure it links to reliable and relevant resources.
- Consolidate duplicate information that is scattered across multiple pages/wikis/platforms.
- Restructure pages and add navigation based on audience and critical user journeys / data tasks. Figure out a way to provide clear and consistent navigation between all the different places where there's dump-related content, such as:
  - https://www.mediawiki.org/wiki/SQL/XML_Dumps
  - Pages linked from https://meta.wikimedia.org/wiki/Data_dumps/More_resources
  - https://gitlab.wikimedia.org/repos/research/html-dumps/-/blob/main/README.md
  - Pages at https://meta.wikimedia.org/wiki/Category:Data_dumps
  - https://wikitech.wikimedia.org/wiki/Dumps
- Standardize the language we use to refer to the various data sources and content of the dumps. For example, I have a hard time aligning what I see at https://meta.wikimedia.org/wiki/Data_dumps/What%27s_available_for_download with what I see at https://dumps.wikimedia.org/
Related docs tasks (that I'm aware of; there are likely more that I haven't found yet):
- T193296: Consolidate and improve data usage documentation for WMF-generated data
- T343146: Create an Introduction to Wikimedia open data