
Abstract database connections between Client and Repository
Closed, Resolved · Public · 13 Estimated Story Points

Description

This task looks at tackling 2 issues currently found in Wikibase, mainly exposed on Wikidata.

  1. It is easy to wire connections and load balancers incorrectly when different db clusters are used. See T281457.
  2. We want a way to easily give all db connections from client or repo a "group" indicating where they come from (client / repo). See T262924#6498193.

An abstraction should allow us to solve these 2 problems in a single place.
It would also be a first step towards an abstraction between MediaWiki and Wikibase, although the interfaces exposed through such an abstraction would still come from the RDBMS lib package in core.

1)
Wikibase often needs to connect to multiple databases at once, most commonly the repository database and a client database.
These databases have different domains (db names) but can also live on different clusters, resulting in different LoadBalancer objects being needed.

The resulting pattern is that LBFactory objects, LoadBalancer objects and string dbnames are passed around between our Wikibase services.
We want to avoid this, as the pattern leads to mistakes.

On at least 2 occasions in the past 12 months, code has been merged that did not use the correct LoadBalancer objects.
This most recently happened with T281457: Several Wikibase services try to read from local domain, when they mean to access the repo.
TBA find the other ticket

See https://gerrit.wikimedia.org/r/c/mediawiki/extensions/Wikibase/+/602823 for an initial draft in this direction.
The idea is to come up with a single service for db interaction (or two services, one for repo and one for client) that the rest of our Wikibase services use.
An added benefit is that an abstraction for waiting for replication on an LBFactory would also live here.
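
To make the idea concrete, here is a minimal sketch in Python rather than the PHP used by Wikibase. It is not taken from the draft patch: `FakeLoadBalancer`, `FakeLBFactory`, `DomainDb`, `RepoDb`, `ClientDb` and the domain and cluster names are all hypothetical stand-ins. It only shows that callers receive one object per database and never handle a load balancer or a dbname string themselves.

```python
from dataclasses import dataclass, field


@dataclass
class FakeLoadBalancer:
    """Hypothetical stand-in for a load balancer bound to one cluster."""
    cluster: str

    def get_connection(self, domain: str) -> str:
        # A real load balancer would return a database handle; a string
        # naming the target is enough to illustrate the wiring.
        return f"{self.cluster}/{domain}"


@dataclass
class FakeLBFactory:
    """Hypothetical stand-in for a factory that maps domains to clusters."""
    clusters: dict  # domain -> cluster name
    waited: list = field(default_factory=list)

    def get_load_balancer(self, domain: str) -> FakeLoadBalancer:
        return FakeLoadBalancer(self.clusters[domain])

    def wait_for_replication(self, domain: str) -> None:
        # Only records the call; no real replication is involved.
        self.waited.append(domain)


class DomainDb:
    """One database domain; the matching load balancer is resolved inside."""

    def __init__(self, lb_factory: FakeLBFactory, domain: str) -> None:
        self._lb_factory = lb_factory
        self._domain = domain

    def connection(self) -> str:
        lb = self._lb_factory.get_load_balancer(self._domain)
        return lb.get_connection(self._domain)

    def wait_for_replication(self) -> None:
        self._lb_factory.wait_for_replication(self._domain)


class RepoDb(DomainDb):
    """Distinct type so services can type hint for the repo database."""


class ClientDb(DomainDb):
    """Distinct type so services can type hint for the client database."""


# Illustrative wiring; the domain and cluster names are made up.
factory = FakeLBFactory({"repowiki": "cluster-repo", "clientwiki": "cluster-client"})
repo_db = RepoDb(factory, "repowiki")
client_db = ClientDb(factory, "clientwiki")

print(repo_db.connection())    # cluster-repo/repowiki
print(client_db.connection())  # cluster-client/clientwiki
repo_db.wait_for_replication()
```

Because the domain and its load balancer are paired once at construction time, a service that asks for a `RepoDb` cannot accidentally read from the local domain.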

2)
In order for DBAs to be able to segment database traffic more effectively, we want to provide additional information when we request connections about the type of site / code the connection is being requested for.
This would allow DBAs to allocate separate hardware dedicated to keeping client (Wikipedia) functionality online, even if database servers dedicated to repo functionality are overloaded for some unrelated reason.

Proposed initial database groups:

  • When client code gets any Wikibase-related database connection, a from-client group is used.
  • When repo code gets any Wikibase-related database connection, a from-repo group is used.
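
The two proposed groups could be attached in one place, roughly as in the sketch below. It is written in Python for illustration only; `ConnectionGroup`, `RecordingLoadBalancer` and `GroupedConnectionProvider` are hypothetical names, and the stand-in load balancer is not MediaWiki's API. The point is that calling code never supplies the group itself.

```python
from enum import Enum


class ConnectionGroup(Enum):
    """The two groups proposed in this task."""
    FROM_CLIENT = "from-client"
    FROM_REPO = "from-repo"


class RecordingLoadBalancer:
    """Hypothetical stand-in that records which groups were requested."""

    def __init__(self) -> None:
        self.requests = []

    def get_connection(self, domain: str, groups: list) -> str:
        self.requests.append((domain, tuple(groups)))
        return f"connection to {domain}"


class GroupedConnectionProvider:
    """Adds a fixed group to every connection request made through it."""

    def __init__(self, load_balancer: RecordingLoadBalancer, group: ConnectionGroup) -> None:
        self._load_balancer = load_balancer
        self._group = group

    def get_connection(self, domain: str) -> str:
        # The group is decided by which code base owns this provider,
        # not by the individual call site.
        return self._load_balancer.get_connection(domain, [self._group.value])


# Illustrative wiring; "repowiki" is a made-up domain name.
lb = RecordingLoadBalancer()
from_client = GroupedConnectionProvider(lb, ConnectionGroup.FROM_CLIENT)
from_repo = GroupedConnectionProvider(lb, ConnectionGroup.FROM_REPO)

from_client.get_connection("repowiki")
from_repo.get_connection("repowiki")
print(lb.requests)
```

Both providers here target the same domain; only the group differs, which is the information DBAs would use to segment traffic.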

Note: the draft Gerrit patch above does not include any of this grouping code.

Acceptance Criteria:🏕️🌟

  • (1) A Wikibase abstraction exists for database connections to repo and client databases (hiding the troubling patterns above)
  • (1.1) This abstraction is the only way database connections are acquired in Wikibase code
  • (2) All Wikibase-related database connections have a "from" group that is passed to MediaWiki

Notes from storytime:

  • Jakob: I also still think we should ponder splitting the abstraction into concrete repo db connections and client db connections, so we can type hint for them
  • Tom: Perhaps this should live not in lib, but in our packages (see ADR 14)

Related Objects

Status    Assigned
Resolved  Addshore
Resolved  Jakob_WMDE
Resolved  Addshore
Resolved  Michael
Resolved  toan
Resolved  Michael
Resolved  Lucas_Werkmeister_WMDE
Resolved  Ladsgroup
Resolved  Lucas_Werkmeister_WMDE
Resolved  ItamarWMDE
Resolved  Jakob_WMDE
Resolved  Michael
Resolved  Ladsgroup
Resolved  Jakob_WMDE
Resolved  Jakob_WMDE
Resolved  Michael
Resolved  Jakob_WMDE
Resolved  Ladsgroup
Resolved  Ladsgroup
Resolved  ItamarWMDE
Resolved  Ladsgroup
Resolved  ItamarWMDE
Resolved  dang
Resolved  Lucas_Werkmeister_WMDE
Resolved  Ladsgroup
Resolved  Lucas_Werkmeister_WMDE
Resolved  Lucas_Werkmeister_WMDE
Resolved  Lucas_Werkmeister_WMDE
Resolved  toan
Resolved  toan
Resolved  Lucas_Werkmeister_WMDE
Resolved  Ladsgroup
Resolved  Lucas_Werkmeister_WMDE
Resolved  Ladsgroup
Resolved  Ladsgroup
Resolved  Ladsgroup
Resolved  Ladsgroup
Resolved  Ladsgroup
Resolved  Ladsgroup
Resolved  Ladsgroup
Resolved  Ladsgroup
Resolved  Ladsgroup
Resolved  Ladsgroup
Resolved  Ladsgroup
Stalled   None
Resolved  Lucas_Werkmeister_WMDE

Event Timeline

Addshore renamed this task from "It should not be so easy to mixup domains and dbs and connections in Wikibase context" to "Abstract database connections between Client and Repository". May 3 2021, 12:36 PM
Addshore updated the task description.
Addshore moved this task from Inbox to To Prioritize on the [DEPRECATED] wdwb-tech board.
Addshore set the point value for this task to 13.
Addshore claimed this task.

All subtasks are closed. I'll leave the final ADR-related item open and continue the discussion there, outside of this task.