Page MenuHomePhabricator

Support queries against Quarry's own database and ToolsDB
Closed, ResolvedPublic

Description

Add a way to specify the database server to attach to so that queries can be written against:

  • T348407 The tool owned databases in ToolsDB
  • T367415 The "local" database that is used to track Quarry's saved queries. This way you can for example find scripts that query specific tables with LIKE "%tablename%"

Event Timeline

Framawiki renamed this task from Query Quarry's own database to Query Quarry's own database and tools one.Jun 8 2018, 5:30 PM
bd808 renamed this task from Query Quarry's own database and tools one to Support queries against Quarry's own database and ToolsDB.Aug 2 2018, 11:02 PM
bd808 updated the task description. (Show Details)
Bstorm subscribed.

This would be trivial to add to the work on the parent task I just added. Therefore, adding it.

I dug around a little bit and realized it isn't as simple as what I stated above. The reason is that those are read-write databases, which dramatically changes the scope of what Quarry could do. In order to provide such services, this would require a read-only permission to most tables on ToolsDB. It would also require a review of and possible view layer/read-only role for the Quarry database.

This cannot be fitted into the scope of the replica changes. It also may be surprising for developers if their ToolsDB information became dramatically more public than it currently is, but that's more of a topic for discussion.

aborrero triaged this task as Low priority.

There's a clear path to adding database patterns to talk to toolsdb. The quarry database is how quarry works in dev mode, so it clearly works. However, I think it should be done with the read-only user used elsewhere. That's all pretty easy. It should also probably be done in two stages (one patch for toolsdb and one for self-query) so that we can roll back something that is found to be problematic.

I think letting users query public toolsdb databases has clear value (but not sure of the utility of querying quarry's internal db).

As the database selector field exists now, we could just check if the dbname has double underscores and ends in _p – that should indicate a tools database. Credentials used for connecting must not have write access to any database.

With PR-26 closed, should this task be closed as resolved?

fnegri changed the task status from Open to In Progress.Jun 13 2024, 10:02 AM
fnegri claimed this task.
fnegri raised the priority of this task from Low to Medium.

We are finally close to resolving this task! After the work in T348407 I can successfully query ToolsDB from Quarry. Access at the moment is limited to one database for testing (s55771__wsstats_p). As discussed in T348407 I will send an email to cloud-announce to inform everyone that we're opening this type of access to the ToolsDB databases (only for the databases ending with _p).

I will give people a 2-week notice for this change, and enable access to all _p databases on Monday, July 1st. If you would like to have access to a specific database that you are managing and would like to have it enabled before that date, just leave a comment below.

I also created T367393: Allow Superset to query ToolsDB public databases to add a similar type of access from Superset.

This task originally mentioned both access to ToolsDB and access to "The 'local' database that is used to track Quarry's saved queries". Is access to Quarry's own database something that anyone is still interested in? Edit: I split this requirement into its own task T367415: Allow Quarry to query its own database

All the subtasks are now completed and you can use Quarry to query ToolsDB's public dbs and Quarry's internal db.

I will keep this task open until I update the related documentation pages.

fnegri moved this task from In progress to Done on the cloud-services-team (FY2024/2025-Q1-Q2) board.

I have updated all the documentation I could find, let me know if I missed anything:

I am marking this task as Resolved, after 2,873 days from its creation. 😺

Hello @fnegri I think this improvement is worth announcing in Tech News (unless you believe otherwise).
If it is worth announcing, what wording would you suggest as the content, and When should it be included? Thanks!

@UOzurumba apologies for the delay. Yes, it's a good idea to add it to the next Tech News! Here's a summary, feel free to edit it:

ToolsDB's public databases are now accessible from both Quarry and Superset.
Those databases have always been accessible to every Toolforge user, but they are now more broadly accessible, as Quarry can be accessed by anyone with a Wikimedia account.
In addition, Quarry's internal database can now be queried from Quarry itself. This database contains information about all queries that are being run and starred by users in Quarry. This information was already public through the web interface, but you can now query it using SQL.