Handling of external databases (vs. core databases) in mw needs rethinking and redesign:
- What does "external" mean here? The local db of another wiki could also be considered external. In the code it means x1/x2 and esX only, which is confusing to devs.
- It's an infrastructure-specific design (and a good one for scaling wikis) that developers shouldn't need to care about (besides avoiding joins with core tables).
- This has led to many extensions dealing with external storage through configuration (each extension has a dedicated configuration for it) and "creative solutions": T314908
- Its testability is questionable at best: T314908
- It can't be properly wired in the updater/installer
- We are planning to migrate tables out of the core databases (like globalimagelinks) and, for new usages, to push devs to use extension clusters as much as possible (to allow better scalability). Making this easier is important for reaching that goal.
Proposed solution:
- For tables that are in x1 but in the wiki's own db (e.g. in the "arzwiki" database in x1), introduce the concept of a "virtual domain" (h/t Tgr in T314908), declared through a new attribute in extension.json (like "DatabaseVirtualDomains").
- Add a new core config, something like:
$wgVirtualDomainsMapping = [
    'urlshortener' => [ 'cluster' => 'extension1', 'db' => 'wikishared' ],
    'growthexperiments' => [ 'cluster' => 'extension1' ],
    'echo-shared' => [ 'cluster' => 'extension1', 'db' => 'wikishared' ],
    'echo-local' => [ 'cluster' => 'extension1' ],
    // ...
];
Which would mean:
- LBF::getPrimaryDatabase( 'urlshortener' ) would internally translate into $this->getExternalLB( 'extension1' )->getConnection( DB_PRIMARY, [], 'wikishared' )
- LBF::getPrimaryDatabase( 'growthexperiments' ) would internally translate into $this->getExternalLB( 'extension1' )->getConnection( DB_PRIMARY, [], false )
- If the virtual domain is not in $wgVirtualDomainsMapping keys, it goes to MainLB. Think of localhost.
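The lookup rules above can be modeled in a few lines. This is a rough sketch in Python rather than MediaWiki code: `resolve_virtual_domain`, the `Target` tuple, and the "main" label are hypothetical names made up for illustration, and `None` stands in for PHP's `false` (the local wiki's database). Passing an unmapped name through to MainLB as a plain database name is an assumption.

```python
from typing import NamedTuple, Optional

# Mirrors the proposed $wgVirtualDomainsMapping; values copied from the example above.
VIRTUAL_DOMAINS_MAPPING = {
    "urlshortener": {"cluster": "extension1", "db": "wikishared"},
    "growthexperiments": {"cluster": "extension1"},
    "echo-shared": {"cluster": "extension1", "db": "wikishared"},
    "echo-local": {"cluster": "extension1"},
}


class Target(NamedTuple):
    load_balancer: str   # "main" or the name of the external cluster
    db: Optional[str]    # None means "the local wiki's database" (PHP false)


def resolve_virtual_domain(domain: str, mapping: dict = VIRTUAL_DOMAINS_MAPPING) -> Target:
    """Hypothetical helper: decide which load balancer and db a domain maps to."""
    entry = mapping.get(domain)
    if entry is None:
        # Not a configured virtual domain: fall back to MainLB and treat the
        # name as an ordinary database domain (assumption).
        return Target("main", domain)
    # Configured virtual domain: use its cluster; "db" is optional.
    return Target(entry["cluster"], entry.get("db"))
```

With this model, `resolve_virtual_domain("urlshortener")` picks the `extension1` cluster with the `wikishared` database, while a name that is not a key in the mapping stays on the main load balancer.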
This solution, at least on paper, allows us to get rid of a lot of code duplication (just search for "getExternalLB" in extensions) and a lot of configuration. It also enables better testability (possibly by marking some domains as external in tests and giving them a different db, which would break any attempt to join), and possibly makes it feasible to wire this up properly in DatabaseUpdater.
Open questions:
- The default behavior would be that if the domain doesn't exist in externalDomains, it is treated as a database name (e.g. commonswiki) in mainLB. That is generally fine, except for "virtual domains" like growthexperiments and echo when they are not set up in config (local setups, tests, third parties, etc.).
- Suggested solution: a configuration of 'externalDomains' => [ 'growthexperiments' => [] ] would mean LBF::getPrimaryDatabase( 'growthexperiments' ) uses mainLB but without $domain being passed.
- Or have a dedicated parameter like 'virtualDomains' => [ 'growthexperiments', 'echo_per_wiki' ] that hints at how they need to be handled. I like this one better.
- Regardless of the direction, extensions need to have a way to alter that default value (in prod it's not important; this is for tests, local setups, third parties, etc.). I guess a hook would work here? A new attribute in extension.json seems like overkill.
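To make the second option concrete, here is a rough Python model (not MediaWiki code) of how a dedicated 'virtualDomains' list could change the fallback. The function name `resolve_with_hint` and the "main" label are hypothetical, `None` stands for "no $domain passed" (the local wiki's database), and the precedence of externalDomains over virtualDomains is an assumption.

```python
from typing import Dict, List, Optional, Tuple


def resolve_with_hint(
    domain: str,
    external_domains: Dict[str, dict],
    virtual_domains: List[str],
) -> Tuple[str, Optional[str]]:
    """Return (load balancer, db) for a domain under the 'virtualDomains' option."""
    entry = external_domains.get(domain)
    if entry is not None:
        # Configured as external: use its cluster, with an optional shared db.
        return entry["cluster"], entry.get("db")
    if domain in virtual_domains:
        # Known virtual domain without external config (local setups, tests,
        # third parties): use mainLB without passing $domain.
        return "main", None
    # Anything else is treated as a real database name in mainLB.
    return "main", domain
```

In a local setup with no external config, `resolve_with_hint("growthexperiments", {}, ["growthexperiments", "echo_per_wiki"])` stays on the main load balancer and the local database, instead of looking for a database literally named "growthexperiments".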