
Dumps documentation: revise and improve landing pages and navigation
Open, Stalled, Low, Public

Description

https://dumps.wikimedia.org/ and https://meta.wikimedia.org/wiki/Data_dumps seem to be the primary landing pages for Dumps as a product/service. These pages could be improved to be more effective and user-friendly.

Without having done an in-depth review, the major opportunities I see here are:

  • Choose one page/platform to serve as the landing page for dumps, and make that landing page follow best practices such as providing task-based navigation, organizing content into easily consumable units, and using progressive disclosure to guide users to more detailed information. https://meta.wikimedia.org/wiki/Data_dumps links to a lot of great information, but it is overwhelming and should be audited to make sure it links to reliable and relevant resources.
  • Consolidate duplicate information scattered across multiple pages/wikis/platforms.
  • Restructure pages and add navigation based on audience and critical user journeys / data tasks. Figure out a way to provide clear and consistent navigation between all the different places where there's dump-related content, such as:
      • https://www.mediawiki.org/wiki/SQL/XML_Dumps
      • Pages linked from https://meta.wikimedia.org/wiki/Data_dumps/More_resources
      • https://gitlab.wikimedia.org/repos/research/html-dumps/-/blob/main/README.md
      • Pages at https://meta.wikimedia.org/wiki/Category:Data_dumps
      • https://wikitech.wikimedia.org/wiki/Dumps

Related docs tasks (that I'm aware of; there are likely more that I didn't yet find):
T193296: Consolidate and improve data usage documentation for WMF-generated data
T343146: Create an Introduction to Wikimedia open data

Event Timeline

TBurmeister changed the task status from Open to Stalled. Nov 15 2023, 6:12 PM
TBurmeister lowered the priority of this task from Medium to Low.

Status note: The tech docs team started working on data documentation in Q2 2023/24, and that work may intersect with this. However, we need to align with the project timeline and goals for Dumps 2.0, so prioritizing docs work may be blocked by that larger project. We don't want to allocate time/resources to improve documentation if we know there are significant upcoming changes.

@TBurmeister Thanks for gathering all this great info. We are gearing up to resume work on Dumps 2.0 next sprint (starting next week). It looks like a good time for you and me to talk about possible timelines for the docs side.