
Investigate coupling in data access layer
Closed, Resolved (Public)

Description

Bits of the data access layer to consider are:

  • Code that reads and writes to mw db on our behalf
  • Secondary storage (terms tables)
  • Data Model services lib
  • ElasticSearch (should this be included?)
  • WikibaseCirrusSearch, WikibaseLexemeCirrusSearch

Event Timeline

Elastic
ElasticSearch - Seems we only use TermLookupSearcher

Only used for ElasticTermLookup

It seems to search only for titles. It also appears to be unused, so I created a follow-up task and a patch to remove it.

DB
Database tables and schema. Investigated by looking up specific known table names and by grepping for ILoadBalancer, $dbr and $dbw.
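The table-name lookup described above can be roughly reproduced with a small script. This is only a sketch, not the method actually used for this investigation: the function name `find_table_references` and the grouping by top-level directory (client, repo, lib) are assumptions made for illustration. The table names are the ones discussed in this task.

```python
import os
import re
from collections import defaultdict

# Table names taken from the notes in this task.
TABLES = [
    "wb_items_per_site",
    "wb_id_counters",
    "wb_changes_subscription",
    "wb_property_info",
    "wb_changes",
    "wb_changes_dispatch",
    "wbc_entity_usage",
]


def find_table_references(root, tables=TABLES):
    """Map each table name to the set of top-level directories under
    `root` whose PHP files mention it (hypothetical helper)."""
    # Match whole identifiers only, so that "wb_changes" does not also
    # match inside "wb_changes_dispatch" or "wb_changes_subscription".
    patterns = {
        table: re.compile(
            r"(?<![A-Za-z0-9_])" + re.escape(table) + r"(?![A-Za-z0-9_])"
        )
        for table in tables
    }
    hits = defaultdict(set)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(".php"):
                continue
            path = os.path.join(dirpath, name)
            # First path segment below root, e.g. "client", "repo" or "lib".
            component = os.path.relpath(path, root).split(os.sep)[0]
            with open(path, encoding="utf-8", errors="replace") as handle:
                text = handle.read()
            for table, pattern in patterns.items():
                if pattern.search(text):
                    hits[table].add(component)
    return dict(hits)
```

Calling `find_table_references("extensions/Wikibase")` on a local checkout would return, for each table that is mentioned at all, the components that reference it by name. It only finds literal mentions, so access that goes through a shared class without naming the table will not show up.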
Database Schema Updater
mediawiki/extensions/Wikibase/repo/includes/Store/Sql/DatabaseSchemaUpdater.php

This warrants a closer look, since so many tables are mentioned here even though they may well be used by client.

wb_items_per_site

Name of the table is defined in multiple places:

extensions/Wikibase/client/includes/Store/Sql/DirectSqlStore.php:307

extensions/Wikibase/repo/includes/Store/Sql/ChangesSubscriptionTableBuilder.php:205

extensions/Wikibase/repo/maintenance/pruneItemsPerSite.php:66

extensions/Wikibase/repo/includes/Store/Sql/SqlSiteLinkConflictLookup.php:72

extensions/Wikibase/repo/includes/Store/Sql/SqlStore.php:326

extensions/Wikibase/repo/maintenance/rebuildItemsPerSite.php:65

Note the split across repo and client. The actual logic and structure appear to be mostly shared in Lib in SiteLinkTable, but the table is clearly also accessed directly, e.g. in SqlSiteLinkConflictLookup.

wb_id_counters

Pretty nicely isolated to repo

wb_changes_subscription

Mostly client. Small coupling in repo with:

  • ListSubscribers.php (API)
  • SqlSubscriptionLookup
  • populateChangesSubscription
  • ChangesSubscriptionSchemaUpdater

wb_property_info

Used in Repo; table in Lib

wb_changes

Classes bind to the literal table name and read the table directly.

Some intermediate paste:

In Repo:

  • ChangePruner
  • DispatchStats
  • SqlChangeStore
  • SqlChangeDispatchCoordinator

EntityChangeLookup in Lib

wb_changes_dispatch

  • ChangePruner
  • DispatchStats
  • SqlChangeDispatchCoordinator

wbc_entity_usage

All in client!

page_props

Some possible relation here that I'm not quite able to discern

\Wikibase\Client\Store\Sql\PagePropsEntityIdLookup

\Wikibase\Client\Usage\Sql\EntityUsageTableBuilder

\Wikibase\Repo\Store\Sql\SqlItemsWithoutSitelinksFinder

I also did some more digging in MW after our morning meeting today. There is also coupling on the key names of auto comments.
See: EntityChange in Lib and \Wikibase\Client\RecentChanges\ChangeLineFormatter

Summary
Quite a lot of coupling on things like plain db table names. Most of the coupling, unsurprisingly, comes from changes/dispatching.

Risks, threats, challenges identified
DatabaseSchemaUpdater touches almost all tables used in client. Client accesses entity data without EntityContent (a Repo class), bypassing a MediaWiki layer and directly using the blob store.

Opportunities noticed
Elasticsearch was almost nicely encapsulated in Cirrus. A patch is up to remove the one exception (unused anyway).

Other remarks
There could be more MediaWiki-internal things that were missed (cf. page_props).