
Establish a process for increasing a toolforge tool's connections to the wiki replicas
Open, Normal, Public

Description

We need to establish some kind of process or standard for increasing a tool's database connection limit when it is warranted and necessary. This was done for the Quarry application on T180141: Raise concurrent mysql connection limit for Quarry (or throttle application concurrency), and right now, PetScan is a candidate for the same treatment.

This task is intended to build agreement on that process and to encourage discussion.

Event Timeline

Bstorm triaged this task as Normal priority. Thu, Oct 31, 12:39 AM
Bstorm created this task.
Restricted Application added a subscriber: Aklapper. Thu, Oct 31, 12:39 AM
Bstorm moved this task from Backlog to Wiki replicas on the Data-Services board. Thu, Oct 31, 12:42 AM

It seems clear that a Phabricator task requesting the increase and describing the need, followed by a review, would be a sensible part of the process. I imagine the review should include WMCS and the DBA team, perhaps on a workboard like the one we have at https://phabricator.wikimedia.org/project/view/2880/


Agreed that a well-documented process and a transparent workflow for triaging and responding to requests is needed. Following the model of the project milestones used by Cloud-VPS (Project-requests) and Cloud-VPS (Quota-requests) for tracking the requests and their outcomes seems like a reasonable approach. Let the bike shedding commence on naming! Data-Services (Quota-requests) seems reasonable to me, but other ideas are welcome.

To add another use case (and to ping the issue):

In addition to its API scripts, Mix'n'match uses a lot of background scripts, many of which can be triggered by users. This can easily lead to situations where, temporarily, more DB connections are required than the default of 10. Most of these scripts run for only a few minutes, so increasing the number of DB connections does not mean "permanent saturation".

There could also be a mechanism that, for all tools, allows more than 10 connections for a limited time (say, 10 minutes), if such a thing is technically possible.

Bursting the connection limit like that isn't directly possible within the database system, as I understand it. I can poke around to see where something like that might be possible.
Overall, the Data-Services (Quota-requests) name sounds good to me.

I suggest we make sure to record any approved limit increases, similar to the record in Puppet for Quarry, along with the task number, so that we can find them later. That way, if the account must be reconstructed (because it was deleted) or has an issue, there is an easy way to verify what the approved quota was. This is especially important if we needed to recycle the auth credentials for an account on the replicas: the script will set the connection limit back to 10 for the account.
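To make the record-keeping idea concrete, here is a minimal sketch (not the actual maintenance script) of how a credential-handling script could consult a recorded quota and emit the MariaDB statement that enforces it, falling back to the default of 10 when no increase was approved. The account names, limits, and helper name below are illustrative assumptions, not real approved quotas.

```python
# Default per-account connection limit on the wiki replicas.
DEFAULT_LIMIT = 10

# Hypothetical record of approved increases, keyed by replica account name,
# analogous to the per-tool record kept in Puppet for Quarry. The entry
# below is a made-up example, not a real approved quota.
APPROVED_LIMITS = {
    "s52788": 50,
}

def connection_limit_sql(account: str) -> str:
    """Return the GRANT statement enforcing the recorded limit for an account.

    Accounts without a recorded increase fall back to DEFAULT_LIMIT, which
    is also what a credential-recycling script would reset them to.
    """
    limit = APPROVED_LIMITS.get(account, DEFAULT_LIMIT)
    return (
        f"GRANT USAGE ON *.* TO '{account}'@'%' "
        f"WITH MAX_USER_CONNECTIONS {limit};"
    )
```

`WITH MAX_USER_CONNECTIONS` is the standard MariaDB/MySQL mechanism for a per-account connection cap, so keeping the approved number next to the task reference in Puppet would let the reset script and any audit read from the same source of truth.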