
Allow DB group used by ChangeDispatcher to be configured
Open, Low, Public

Description

To avoid hitting web-facing database servers with long-running queries, ChangeDispatcher should use a configurable DB server group. To do this, we need to:

  1. move ConsistentReadConnectionManager to lib (or to core)
  2. allow the DB server group(s) to be set in ConsistentReadConnectionManager's constructor
  3. make EntityChangeLookup and SqlChangeDispatchCoordinator use ConsistentReadConnectionManager
  4. allow the DB server group for the ConsistentReadConnectionManager used by ChangeDispatcher to be defined from the command line
  5. add a command line argument to dispatchChanges.php for specifying the DB server group.

The same should probably be done for DumpGenerator.
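The steps above can be sketched roughly as follows. This is a language-neutral illustration in Python, not Wikibase's PHP code: the class only borrows the name ConsistentReadConnectionManager, and the `get_connection` callback, the `--db-group` flag name, and `build_manager` are all assumptions made for the example.

```python
import argparse
from typing import Callable, List, Optional, Sequence


class ConsistentReadConnectionManager:
    """Toy stand-in that hands out read connections from a configurable group."""

    def __init__(
        self,
        get_connection: Callable[[Sequence[str]], str],
        read_groups: Sequence[str] = (),
    ) -> None:
        # get_connection is a hypothetical load-balancer callback that
        # receives the list of server groups to pick a replica from.
        self._get_connection = get_connection
        self._read_groups: List[str] = list(read_groups)

    def get_read_connection(self) -> str:
        # An empty group list means "use the default replica pool".
        return self._get_connection(self._read_groups)


def parse_args(argv: Optional[Sequence[str]] = None) -> argparse.Namespace:
    parser = argparse.ArgumentParser(description="Dispatch changes (sketch).")
    # Hypothetical flag name; the task does not specify one.
    parser.add_argument(
        "--db-group",
        default=None,
        help="DB server group to read from; omit to use the default pool",
    )
    return parser.parse_args(argv)


def build_manager(
    argv: Optional[Sequence[str]],
    get_connection: Callable[[Sequence[str]], str],
) -> ConsistentReadConnectionManager:
    # Wire the command-line value into the manager's constructor.
    args = parse_args(argv)
    groups = [args.db_group] if args.db_group else []
    return ConsistentReadConnectionManager(get_connection, groups)
```

Passing the group through the constructor keeps the dispatcher itself unaware of server topology, so the same wiring could serve DumpGenerator.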

Event Timeline

daniel created this task. Jun 22 2016, 10:38 AM
Restricted Application added a subscriber: Aklapper. Jun 22 2016, 10:39 AM
daniel updated the task description. Jun 22 2016, 10:41 AM
Lydia_Pintscher triaged this task as Low priority. Jun 11 2017, 6:01 PM

https://gerrit.wikimedia.org/r/c/mediawiki/core/+/440022 would solve this too. Which DB group should we use for dispatching? vslow? @jcrespo, your input would be very valuable here.

@Ladsgroup, check the other existing groups, but unless it has a huge overlap with one of them, we should agree on a new one, e.g. "dispatch". For that (or even if we reuse an existing one), we should add more servers to that group, assuming the load is considerable. Do you have stats in QPS or other metrics?

No existing group overlaps, so I think a group called dispatch (with db1087 only) makes sense. Regarding QPS, I can't give you any numbers right now, and they are rather hard to get, but judging from the graphs there are four concurrent processes running all the time, spending 53% of their run time selecting from the database.

Let me ask another way: what percentage, even with a large margin of error, of the work on s8 (wikidata) does dispatch create, if we include jobqueue, web requests and other traffic?

I would put at least 2 servers in the group for redundancy reasons.
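As a rough sketch of the figures quoted above: four concurrent processes that each spend 53% of their run time selecting keep about 4 × 0.53 ≈ 2.12 connections busy on average. This does not answer what share of total s8 load that is, since no total is given in the thread. The helper names below are hypothetical, and the redundancy check only encodes the "at least 2 servers" suggestion.

```python
from typing import Iterable


def average_busy_connections(processes: int, db_time_fraction: float) -> float:
    # Average number of connections running a query at any moment,
    # assuming each process is independently in the database for the
    # given fraction of its run time.
    return processes * db_time_fraction


def has_redundancy(servers: Iterable[str], minimum: int = 2) -> bool:
    # True when the group contains at least `minimum` distinct servers.
    return len(set(servers)) >= minimum


# Figures from the thread: four processes, 53% of run time in SELECTs.
dispatch_load = average_busy_connections(4, 0.53)

# The proposed group with db1087 only would not meet the redundancy bar.
single_server_ok = has_redundancy(["db1087"])
```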