Wikibase: Introduce separate database configuration for term store
Open, Needs Triage, Public, 8 Estimated Story Points

Description

To accommodate the storage usage of the term store, Wikidata will serve terms-related tables from a different server/cluster than the rest of the MediaWiki/Wikibase tables.

To make use of this, the Wikibase persistence logic needs to use the right database/cluster for the relevant queries.

Currently, database details and connections in Wikibase are modeled through DomainDb classes.
Wikibases other than Wikidata are not expected to need separate database tables, so the two "pointers" could still lead to the same database. The "term-database" connection setting should be optional and, unless specified otherwise, default to the "general" MediaWiki/Wikibase database.
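
The optional-setting behaviour described above could look roughly like the sketch below. It is a language-neutral illustration in Python, not Wikibase's actual PHP code. `DatabaseSettings`, `terms_db`, `resolve_terms_db` and the database names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DatabaseSettings:
    """Hypothetical settings holder; not a real Wikibase class."""
    general_db: str                 # the "general" MediaWiki/Wikibase database
    terms_db: Optional[str] = None  # optional dedicated term store database


def resolve_terms_db(settings: DatabaseSettings) -> str:
    """Return the database for term queries, falling back to the general one."""
    if settings.terms_db is not None:
        return settings.terms_db
    return settings.general_db


# A typical Wikibase: nothing configured, so both "pointers" lead to one database.
single = DatabaseSettings(general_db="wikibase")

# A Wikidata-like setup with a dedicated term store (names are made up).
split = DatabaseSettings(general_db="wikidatawiki", terms_db="wikidatawiki_terms")
```

With this shape, existing installations keep working without any configuration change, and only a site that sets the optional value sees term queries routed elsewhere.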

Original description for history/context below:


The term store is currently reaching 340 GB on Wikidata and slowly heading back towards the wb_terms era. To allow splitting s8 into a core cluster and a dedicated cluster for the term store (tentatively called x3), the term store needs to use a different domain db than Wikidata's core db (this should be configurable and initially point to the Wikidata db). This would free up space for Wikidata, reduce write pressure, and allow for more horizontal scaling.

As an example, see how RepoDomainDb is injected in TermStoreWriterFactory or DatabaseTermStoreWriterBase.
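
To illustrate why injection makes this change tractable, here is a minimal sketch, again in Python rather than Wikibase's PHP. `DomainDb`, `FixedDomainDb` and `TermStoreWriter` are hypothetical stand-ins, not the real RepoDomainDb or DatabaseTermStoreWriterBase. The point is only that a writer which receives its database handle from outside can be pointed at a different domain without changing the writer itself.

```python
from typing import Protocol


class DomainDb(Protocol):
    """Hypothetical stand-in for a DomainDb-style class."""

    def domain(self) -> str: ...


class FixedDomainDb:
    """Trivial DomainDb that always reports the domain it was built with."""

    def __init__(self, domain: str) -> None:
        self._domain = domain

    def domain(self) -> str:
        return self._domain


class TermStoreWriter:
    """Hypothetical writer that only knows the DomainDb it was given."""

    def __init__(self, db: DomainDb) -> None:
        self._db = db

    def target_domain(self) -> str:
        # A real writer would run its term queries against this domain.
        return self._db.domain()


repo_db = FixedDomainDb("wikidatawiki")  # core database (name assumed)
terms_db = FixedDomainDb("terms")        # dedicated term store (name made up)

# Switching the term store is a wiring change at construction time only.
writer = TermStoreWriter(terms_db)
```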

Event Timeline

Created T351820: Move Wikidata term store to separate database cluster as the more general task for the project suggested in the task description, since this task is limited to the code changes in Wikibase IIUC.

Does WPP in the title stand for Wikibase Product Platform Team WPP or something else? (Asking because the current subscribers are closer to Wikidata Dev Team / Wikidata Dev Team (Wikidata.org Slice), and I’m also not sure this task belongs with product platform.)

I’ll also be very interested in how moving the term store to a separate cluster, with (I assume) separate transactions, will affect the deadlocks we’re currently seeing in the term store (T283198).

> Created T351820: Move Wikidata term store to separate database cluster as the more general task for the project suggested in the task description, since this task is limited to the code changes in Wikibase IIUC.

Thanks

> Does WPP in the title stand for Wikibase Product Platform Team WPP or something else? (Asking because the current subscribers are closer to Wikidata Dev Team / Wikidata Dev Team (Wikidata.org Slice), and I’m also not sure this task belongs with product platform.)

I was asked by Itamar, maybe I misunderstood them :D

No misunderstanding at all :) I just meant that the following project tags should be added: wmde-wikidata-tech and Wikibase Product Platform Team WPP.

Also, I think I was thrown off a bit by the parent task; let's keep all discussion here, I'd say.

More information from the parent task T351820#9352829:

Clarification, we will not do anything until at least start of next US FY (as we need to budget a couple more dbs for extra headroom) so we have at least six months.

Part of my confusion was because to me this feels more like our responsibility, but I’m also happy for the product platform team to take it over ^^

Using virtual domains should make this quite a bit easier (famous last words).

Hi, we are currently buying the hardware for this. Any updates?

So who’s responsible for this task at WMDE? The current project columns kind of sound like neither the Wikidata team nor the Wikibase Product Platform Team consider themselves responsible (“Radar” / “outside WPP”), which would be unfortunate.

@Ladsgroup hallochen, when would you ideally like an answer by?

> @Ladsgroup hallochen, when would you ideally like an answer by?

As soon as possible. We had time a year ago.

WMDE-leszek renamed this task from "WPP: Make term store database configurable" to "Wikibase: Introduce separate database configuration for term store". Mon, Nov 25, 1:54 PM
WMDE-leszek updated the task description.

Change #1102897 had a related patch set uploaded (by Jakob; author: Jakob):

[mediawiki/extensions/Wikibase@master] Introduce TermsDomainDb

https://gerrit.wikimedia.org/r/1102897

Change #1104692 had a related patch set uploaded (by Jakob; author: Jakob):

[mediawiki/extensions/Wikibase@master] TermsDomainDb: Avoid ConnectionManager & ReplicationWaiter

https://gerrit.wikimedia.org/r/1104692

Change #1104964 had a related patch set uploaded (by Jakob; author: Jakob):

[mediawiki/extensions/Wikibase@master] Make TermsDomainDb an interface

https://gerrit.wikimedia.org/r/1104964

Change #1102897 merged by jenkins-bot:

[mediawiki/extensions/Wikibase@master] Introduce TermsDomainDb

https://gerrit.wikimedia.org/r/1102897

Change #1104692 merged by jenkins-bot:

[mediawiki/extensions/Wikibase@master] TermsDomainDb: Avoid ConnectionManager & ReplicationWaiter

https://gerrit.wikimedia.org/r/1104692

Change #1104964 merged by jenkins-bot:

[mediawiki/extensions/Wikibase@master] Make TermsDomainDb an interface

https://gerrit.wikimedia.org/r/1104964

Change #1105383 had a related patch set uploaded (by Jakob; author: Jakob):

[mediawiki/extensions/Wikibase@master] Add virtual-wikibase-terms virtual domain

https://gerrit.wikimedia.org/r/1105383