Restructure and improve content for: https://wikitech.wikimedia.org/wiki/Help:Toolforge/Database
Open, Medium, Public

Description

This doc, https://wikitech.wikimedia.org/wiki/Help:Toolforge/Database, is an important page with too much information in one place. It includes how-to instructions for accessing wiki replicas via Toolforge, conceptual content about database structure, and code examples.

Main issues with this doc:

  • It's confusing that this page combines info about accessing the wiki replica databases with info about user-created "toolsdb" databases.
  • It's too long! It combines how-to steps and reference information about database schema. Information about database layout also exists in https://wikitech.wikimedia.org/wiki/Help:MySQL_queries.
  • The language-specific code examples section mixes examples that connect to the toolsdb host with examples that connect to the hosting cluster for the replica dbs. It would be better to have examples for each host in each language, or to include examples only in the sections for wiki replicas or toolsdb instead of in one big list at the end. Changing the host address is relatively simple when copying and pasting, but the mix still adds to the confusion on this page.
  • Check all code examples to verify they're still correct and runnable.

To fix these issues, I suggest breaking the Help:Toolforge/Database page into separate pages for:

  1. How to create, connect to, and query tool databases (repurpose this existing page)
  2. What the wiki replica databases are and how they are organized, including layout and available/redacted tables. (repurpose this page) | (move in some content from https://wikitech.wikimedia.org/wiki/Help:MySQL_queries)
  3. How to access and query the wiki replica databases; connection example code and sample queries; things to be aware of when writing/running queries. (move all of this how-to type content out of the aforementioned docs and into https://wikitech.wikimedia.org/wiki/Help:MySQL_queries; give the page a more accurate name).

Additional suggestions for specific improvements may be noted in the thread below.

Event Timeline

Notes for future revision:

TBurmeister added a project: Data-Services.
TBurmeister updated the task description.

Thanks @taavi, this task has been on my radar forever, but I never got around to starting it.

I'm tempted to reduce https://wikitech.wikimedia.org/wiki/Help:Toolforge/Database even further, keeping just a few paragraphs with a high-level explanation of Wiki Replicas vs User Databases, then linking to https://wikitech.wikimedia.org/wiki/Help:Wiki_Replicas and to a new page https://wikitech.wikimedia.org/wiki/Help:User_Databases (where we could mention ToolsDB but also Trove). WDYT?

I'd rather not mix ToolsDB and Trove things into the same doc, but I'd be fine with moving ToolsDB-specific things to a new page and then explaining the difference for all the different offerings on the current Help:Toolforge/Database page.

Agreed, linking to all offerings from Help:Toolforge/Database seems cleaner.

+1, hopefully with a short description of when to use each one (if you want a small DB for your tool -> ToolsDB, if you want a big DB -> Trove, kinda thing)

OK, I did a very basic split there. I'm sure there are still many places to update, but it's a start.