Page MenuHomePhabricator

All Wikimedia extensions that store their data outside the main database should use a virtual database domain
Open, Needs TriagePublic

Description

Per T330590: External LBs should not be exposed to developers, there is now a proper way to store data in a non-standard location without breaking automated DB schema updates or unit tests. The relevant Wikimedia extensions should use it, to be a little easier to set up.


A search for LBFactory finds the following extensions (didn't try to verify if they are all relevant; the extensions with dedicated subtasks of this task are marked as resolved):

  • Extension:AbuseFilter
  • Extension:ArticleFeedbackv5
  • Extension:Babel
  • Extension:BounceHandler
  • Extension:CampaignEvents (T348281)
  • Extension:CentralAuth
  • Extension:CheckUser (not relevant)
  • Extension:CirrusSearch
  • Extension:Cognate
  • Extension:ContentTranslation
  • Extension:DiscussionTools (not relevant)
  • Extension:Echo (via T348573)
  • Extension:EntitySchema
  • Extension:EventBus
  • Extension:Flow
  • Extension:GlobalBlocking (via T348486)
  • Extension:GlobalUsage
  • Extension:GrowthExperiments
  • Extension:LoginNotify
  • Extension:MachineVision (undeployed)
  • Extension:Math
  • Extension:MathSearch
  • Extension:OATHAuth
  • Extension:OAuthRateLimiter
  • Extension:PageTriage
  • Extension:Phonos
  • Extension:PropertySuggester
  • Extension:ReadingLists
  • Extension:SecurePoll
  • Extension:SplitPrivateWiki
  • Extension:UrlShortener
  • Extension:Wikibase
  • Extension:WikibaseQualityConstraints
  • Extension:WikimediaEditorTasks
  • Extension:WikimediaMaintenance

This doesn't find e.g. OAuth so it isn't a perfect search.


Example fix: https://gerrit.wikimedia.org/r/c/mediawiki/extensions/UrlShortener/+/963293 (T330590)
and https://gerrit.wikimedia.org/r/c/operations/mediawiki-config/+/963837

Related Objects

Event Timeline

Anyone got a link to a Gerrit patch showing the suggested fix?

ArticleFeedbackv5 isn't currently WMF-deployed, though who knows, perhaps it will once again be someday. :-) In the meanwhile, I definitely could use a helping hand with this...I wished to implement this change in AFTv5 in a non-BC-breaking way but the code turned out to be somewhat convoluted and I don't have a 1.41+ test setup to test out this part. What I can definitely say is that everything I tried out ended up breaking things like Special:Contributions (!) and such, so, yeah. :-/

Is there a patch I can take a look at?

I think simplest way to keep BC would be to use the MediaWikiServices hook to look at the configuration, and if $wgVirtualDomainsMapping['articlefeedback'] is not set but the legacy cluster/DB variables are set, copy that latter to the former.

Thanks for the swift replies, @Ladsgroup and @Tgr; much appreciated! :) I didn't submit a patch to gerrit as there was essentially nothing submission-worthy; quite the opposite, I had to revert my futile attempts to unbreak Special:Contributions so that I could test out some different extensions which hook into Special:Contributions as well.

So in ArticleFeedbackv5 there are three classes which define (and also use) a getLB method: DataModelBackendLBFactory (in /data/DataModelBackend.LBFactory.php), ArticleFeedbackv5BackendLBFactory (extends the previously mentioned class; in /includes/ArticleFeedbackv5BackendLBFactory.php) and finally ArticleFeedbackv5Utils (in /includes/ArticleFeedbackv5Utils.php).
Additionally the to-be-deprecated global variable $wgArticleFeedbackv5Cluster is used in three maintenance scripts in addition ot the ArticleFeedbackv5BackendLBFactory.php file, and I wouldn't have the slightest clue on how to update the waitForReplication calls in these: /maintenance/archiveFeedback.php, /maintenance/legacyToShard.php, and /maintenance/setArchiveDate.php.

One main issue I had was that the DataModelBackendLBFactory class has an allowCache method, which calls the class' getLB method and calls laggedReplicaUsed on whatever getLB returns. AIUI (please do correct me if I'm wrong) this would be problematic in the future if and when extensions shouldn't be using LBFactory the way e.g. AFTv5 currently does. When I was trying to refactor getLB and friends, the laggedReplicaUsed call ended up fataling and breaking Special:Contributions IIRC. I'd guess that all the other getLB uses would be easier to refactor, but that one really ended up confusing me.

@ashley sorry for the very slow response. waitForReplication can be replaced with IConnectionProvider::commitAndWaitForReplication. laggedReplicaUsed has no straightforward replacement; probably just ignore it and cache unconditionally? AFTv5 is not used on Wikimedia and it probably won't matter for smaller websites.

Change #1032800 had a related patch set uploaded (by Jack Phoenix; author: Jack Phoenix):

[mediawiki/extensions/ArticleFeedbackv5@master] Baby step(s) towards using virtual database domains - swap out laggedReplicaUsed usage

https://gerrit.wikimedia.org/r/1032800

I added a virtual domain parameter to Template:Extension on mediawiki.org. In theory it could be added automatically to the relevant pages via Tool-extjsonuploader, that's not done for now.

Change #1032800 merged by Jack Phoenix:

[mediawiki/extensions/ArticleFeedbackv5@master] Baby step(s) towards using virtual database domains - swap out laggedReplicaUsed usage

https://gerrit.wikimedia.org/r/1032800

Change #1063257 had a related patch set uploaded (by Jack Phoenix; author: Jack Phoenix):

[mediawiki/extensions/ArticleFeedbackv5@master] Use virtual database domain

https://gerrit.wikimedia.org/r/1063257