Page MenuHomePhabricator

LoadBalancer::resolveDomainId should support virtual domains
Open, Needs TriagePublicBUG REPORT

Description

Steps to replicate the issue (include links if applicable):

There are no doubt easier ways to trigger the issue, but this is the one I stumbled on:

What happens?:

Wikimedia\Rdbms\DBConnectionError from line 1123 of /vagrant/mediawiki/includes/libs/rdbms/loadbalancer/LoadBalancer.php: Cannot access the database: Unknown database 'virtual' (127.0.0.1)

Apparently what's happening is that the virtual DB domain virtual-centralauth is treated as a non-virtual DB domain and gets split into a DB and schema part. This is because LBFactory::getMappedDatabase() is only called in getPrimaryDatabase()/getReplicaDatabase() and not e.g. newMainLB(). This breaks / makes unaccessible various things for extensions using virtual domains, e.g. calls to $lbfactory->getMainLB(...)->hasOrMadeRecentPrimaryChanges() in CentralAuth, or ReadOnlyMode::getReason(). It's also just very confusing in general.

Event Timeline

For the record, the exact stack trace is

Wikimedia\Rdbms\DBConnectionError from line 1123 of /vagrant/mediawiki/includes/libs/rdbms/loadbalancer/LoadBalancer.php: Cannot access the database: Unknown database 'virtual' (127.0.0.1)
#0 /vagrant/mediawiki/includes/libs/rdbms/loadbalancer/LoadBalancer.php(779): Wikimedia\Rdbms\LoadBalancer->reportConnectionError()
#1 /vagrant/mediawiki/includes/libs/rdbms/loadbalancer/LoadBalancer.php(767): Wikimedia\Rdbms\LoadBalancer->getServerConnection()
#2 /vagrant/mediawiki/includes/libs/rdbms/database/DBConnRef.php(103): Wikimedia\Rdbms\LoadBalancer->getConnectionInternal()
#3 /vagrant/mediawiki/includes/libs/rdbms/database/DBConnRef.php(117): Wikimedia\Rdbms\DBConnRef->ensureConnection()
#4 /vagrant/mediawiki/includes/libs/rdbms/database/DBConnRef.php(351): Wikimedia\Rdbms\DBConnRef->__call()
#5 /vagrant/mediawiki/includes/utils/BatchRowIterator.php(248): Wikimedia\Rdbms\DBConnRef->select()
#6 /vagrant/mediawiki/includes/utils/BatchRowIterator.php(206): BatchRowIterator->next()
#7 /vagrant/mediawiki/extensions/AntiSpoof/maintenance/BatchAntiSpoofClass.php(94): BatchRowIterator->rewind()
#8 /vagrant/mediawiki/maintenance/includes/MaintenanceRunner.php(698): BatchAntiSpoof->execute()
#9 /vagrant/mediawiki/maintenance/run.php(51): MediaWiki\Maintenance\MaintenanceRunner->run()
#10 /var/www/w/MWScript.php(99): require_once('/vagrant/mediaw...')
#11 {main}

virtual domain is inherently different level than LB. It works on LBF not LB. Also on top of that, we want to get rid of using LB outside of rdbms altogether. To fix this, I think we need a different solution. Something out of the box. e.g. what is the use of resolveDomain here? Is it really needed? What can we replace it with?

We encountered it while trying to use a virtual domain in a maintenance script (via Maintenance::getDB()). But there are plenty of other instances of code doing $services->getDBLoadBalancerFactory()->getMainLB( $domain )->doSomethingWith( $domain ) and usually that something involves resolveDomain (see the examples in the task description).

IMO the reasonably thing would be to handle virtual domains in resolveDomain, and throw if the virtual domain is for a different load balancer. The LoadBalancer instance should know what cluster it is for, and what databases it has, so it can compare itself against the virtual domain specification, right?

The thing is that LB should not be used by outside of rdbms at all. It shouldn't care what a load balancer is. What outside of core should care is that there should be a service that would provide database objects (currently ICP) and database objects itself. How LBF manages its different connection pools should be completely hidden and encapsulated to the outside of rdbms. As such, I'm saying if you see LB is being referenced/used somewhere, you should think "how can this be moved up to ICP/LBF or moved down to DB?" Lots of times even use of LB is not needed and can be replaced with something simpler.

From an end user POV the LB/LBF differentation is pretty much never useful, if you need an LB you'll just do $lbf->getMainLB( $domain ). But the LB/LBF interfaces have all kinds of things that are not present on ICP but sometimes needed. Those could be moved to ICP but there is also benefit in keeping ICP simple and limited to functionality that's often used.