Page MenuHomePhabricator

Wikimedia\Rdbms\LoadBalancer::reuseConnection: got DBConnRef instance.
Closed, ResolvedPublicPRODUCTION ERROR

Description

This is getting logged at a rate of 92,000 per hour, currently, and that's just group 0+1 wikis.

Stack:

#0 /srv/mediawiki/php-1.30.0-wmf.1/includes/libs/rdbms/connectionmanager/ConnectionManager.php(109): Wikimedia\Rdbms\LoadBalancer->reuseConnection() 
#1 /srv/mediawiki/php-1.30.0-wmf.1/extensions/Cognate/src/CognateStore.php(177): Wikimedia\Rdbms\ConnectionManager->releaseConnection() 
#2 /srv/mediawiki/php-1.30.0-wmf.1/extensions/Cognate/src/CognateRepo.php(138): Cognate\CognateStore->selectLinkDetailsForPage() 
#3 /srv/mediawiki/php-1.30.0-wmf.1/extensions/Cognate/src/hooks/CognateParseHookHandler.php(81): Cognate\CognateRepo->getLinksForPage() 
#4 /srv/mediawiki/php-1.30.0-wmf.1/extensions/Cognate/src/CognateHooks.php(76): Cognate\CognateParseHookHandler->doContentAlterParserOutput() 
#5 /srv/mediawiki/php-1.30.0-wmf.1/includes/Hooks.php(186): Cognate\CognateHooks::onContentAlterParserOutput()
#6 /srv/mediawiki/php-1.30.0-wmf.1/includes/content/AbstractContent.php(501): Hooks::run() 
#7 /srv/mediawiki/php-1.30.0-wmf.1/includes/poolcounter/PoolWorkArticleView.php(140): AbstractContent->getParserOutput()
#8 /srv/mediawiki/php-1.30.0-wmf.1/includes/poolcounter/PoolCounterWork.php(123): PoolWorkArticleView->doWork() 
#9 /srv/mediawiki/php-1.30.0-wmf.1/includes/page/WikiPage.php(1076): PoolCounterWork->execute() 
#10 /srv/mediawiki/php-1.30.0-wmf.1/includes/api/ApiParse.php(510): WikiPage->getParserOutput()
#11 /srv/mediawiki/php-1.30.0-wmf.1/includes/api/ApiParse.php(190): ApiParse->getParsedContent()
#12 /srv/mediawiki/php-1.30.0-wmf.1/includes/api/ApiMain.php(1578): ApiParse->execute() 
#13 /srv/mediawiki/php-1.30.0-wmf.1/includes/api/ApiMain.php(545): ApiMain->executeAction() 
#14 /srv/mediawiki/php-1.30.0-wmf.1/includes/api/ApiMain.php(516): ApiMain->executeActionWithErrorHandling() 
#15 /srv/mediawiki/php-1.30.0-wmf.1/api.php(83): ApiMain->execute() 
#16 /srv/mediawiki/w/api.php(3): include() 
#17 {main}

The error is fired by LoadBalancer->reuseConnection() in LoadBalancer.php and the culprit is rECOG23e888af1af9: Release connections as early as possible in CognateStore

Event Timeline

It seems either:

  1. 23e888af1af9 didn't change anything because the connections were already using references to auto-release at the end of the function.
  2. or; 23e888af1af9 intended to release them earlier (within the same method) but should've replaced the refs with unwrapped DB objects - call getReadConnection() instead of getReadConnectionRef() and same for Write.

Unless this is a particularly slow method that is expected to run for a long time in a maintenance script or post-shutdown update or something, it probably makes more sense to assume 1 and revert the commit as being based on the assumption that it wasn't using references, which it actually was already.

This patch was made as a result of T164407 and making Cognate hold onto connections for as little time as possible.

  1. 23e888af1af9 didn't change anything because the connections were already using references to auto-release at the end of the function.

The change to CognateStore::insertPage() does change something. This releases the read connection before doing the write.

DBConnRef implements IDatabase and both ConnectionManager::releaseConnection and ILoadBalancer::reuseConnectionn are documents as accepting IDatabase, although looking into the implementation of Loadbalancer::reuseConnection it is clear it does not like DBConnRefs.
I would have expected the LB to actually handle this just fine.

Will change CognateStore to use regular connections rather than refs so that it can choose when to mark them for reuse.
And I'll probably write a patch adding some docs to the reuse and release methods here.

Change 353248 had a related patch set uploaded (by Addshore; owner: Addshore):
[mediawiki/extensions/Cognate@master] Don't pass ConnectionRefs to ConnectionManager::releaseConnection

https://gerrit.wikimedia.org/r/353248

Change 353249 had a related patch set uploaded (by Addshore; owner: Addshore):
[mediawiki/extensions/Cognate@wmf/1.30.0-wmf.1] Don't pass ConnectionRefs to ConnectionManager::releaseConnection

https://gerrit.wikimedia.org/r/353249

Change 353248 merged by jenkins-bot:
[mediawiki/extensions/Cognate@master] Don't pass ConnectionRefs to ConnectionManager::releaseConnection

https://gerrit.wikimedia.org/r/353248

Change 353249 merged by jenkins-bot:
[mediawiki/extensions/Cognate@wmf/1.30.0-wmf.1] Don't pass ConnectionRefs to ConnectionManager::releaseConnection

https://gerrit.wikimedia.org/r/353249

Mentioned in SAL (#wikimedia-operations) [2017-05-11T13:21:19Z] <addshore@tin> Synchronized php-1.30.0-wmf.1/extensions/Cognate/src/CognateStore.php: SWAT: T165005 [[gerrit:353249|Dont pass ConnectionRefs to ConnectionManager::releaseConnection]] (duration: 00m 42s)

mmodell changed the subtype of this task from "Task" to "Production Error".Aug 28 2019, 11:10 PM