WPP: Make term store database configurable
Open, Needs Triage, Public

Description

The term store is currently approaching 340 GB on Wikidata and is slowly heading back to the wb_terms era. To allow splitting s8 into a core cluster and a dedicated cluster for the term store (tentatively called x3), the term store needs to use a domain DB separate from Wikidata's core DB. This should be configurable and should initially point to the Wikidata DB. The split would free up space for Wikidata, reduce write pressure, and allow more horizontal scaling.

As an example, see how RepoDomainDb is injected in TermStoreWriterFactory or DatabaseTermStoreWriterBase.
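
To illustrate the requested behavior (a configurable term store DB that initially points to the core DB), here is a minimal, language-neutral sketch in Python. It is not Wikibase code: `Settings`, `term_store_db`, `DomainDb`, `term_store_domain_db`, and `TermStoreWriter` are hypothetical names, and the database names in the usage note are placeholders.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DomainDb:
    """Hypothetical stand-in for a handle on one database domain."""
    name: str


@dataclass(frozen=True)
class Settings:
    """Hypothetical settings; term_store_db stays None until the split happens."""
    repo_db: str
    term_store_db: Optional[str] = None


def term_store_domain_db(settings: Settings) -> DomainDb:
    """Return the term store's domain DB, falling back to the repo DB."""
    return DomainDb(settings.term_store_db or settings.repo_db)


class TermStoreWriter:
    """Hypothetical writer that receives its DB by injection
    instead of assuming the repo DB."""

    def __init__(self, db: DomainDb) -> None:
        self.db = db

    def target(self) -> str:
        # Name of the database this writer would write terms to.
        return self.db.name
```

With no override configured, `TermStoreWriter(term_store_domain_db(Settings(repo_db="repo_db_placeholder")))` targets the repo DB. Setting `term_store_db` redirects only the term store, which is the property that would let the data move to a separate cluster without touching callers.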

Event Timeline

Created T351820: Move Wikidata term store to separate database cluster as the more general task for the project suggested in the task description, since this task is limited to the code changes in Wikibase IIUC.

Does WPP in the title stand for Wikibase Product Platform Team WPP or something else? (Asking because the current subscribers are closer to Wikidata Dev Team / Wikidata Dev Team (Wikidata.org Slice), and I’m also not sure this task belongs with product platform.)

I’ll also be very interested in how moving the term store to a separate cluster, with (I assume) separate transactions, will affect the deadlocks we’re currently seeing in the term store (T283198).

> Created T351820: Move Wikidata term store to separate database cluster as the more general task for the project suggested in the task description, since this task is limited to the code changes in Wikibase IIUC.

Thanks

> Does WPP in the title stand for Wikibase Product Platform Team WPP or something else? (Asking because the current subscribers are closer to Wikidata Dev Team / Wikidata Dev Team (Wikidata.org Slice), and I’m also not sure this task belongs with product platform.)

I was asked by Itamar, maybe I misunderstood them :D

No misunderstanding at all :) I just meant that the following project tags should be added: wmde-wikidata-tech and Wikibase Product Platform Team WPP. I think I was also thrown off a bit by the parent task; let's keep all discussion here, I'd say.

More information from the parent task T351820#9352829:

Clarification: we will not do anything until at least the start of the next US fiscal year (we need to budget a couple more DBs for extra headroom), so we have at least six months.

Part of my confusion was because to me this feels more like our responsibility, but I’m also happy for the product platform team to take it over ^^

Using virtual domains should make this quite a bit easier (famous last words).