Page MenuHomePhabricator

LBFactory destructor causes unexpected exception at shutdown
Closed, ResolvedPublic

Description

Error

Request ID: XPfMttBQm20AAFgv6gQAAAAI

message
Wikimedia\Rdbms\DBUnexpectedError: DB connection was already closed.

#0 /srv/mediawiki/php-1.34.0-wmf.8/includes/libs/rdbms/database/Database.php(3940): Wikimedia\Rdbms\Database->assertHasConnectionHandle()
#1 /srv/mediawiki/php-1.34.0-wmf.8/includes/libs/rdbms/loadbalancer/LoadBalancer.php(1434): Wikimedia\Rdbms\Database->commit(string, string)
#2 /srv/mediawiki/php-1.34.0-wmf.8/includes/libs/rdbms/loadbalancer/LoadBalancer.php(1807): Closure$Wikimedia\Rdbms\LoadBalancer::commitMasterChanges(Wikimedia\Rdbms\DatabaseMysqli)
#3 /srv/mediawiki/php-1.34.0-wmf.8/includes/libs/rdbms/loadbalancer/LoadBalancer.php(1440): Wikimedia\Rdbms\LoadBalancer->forEachOpenMasterConnection(Closure$Wikimedia\Rdbms\LoadBalancer::commitMasterChanges;2625)
#4 /srv/mediawiki/php-1.34.0-wmf.8/includes/libs/rdbms/lbfactory/LBFactory.php(227): Wikimedia\Rdbms\LoadBalancer->commitMasterChanges(string)
#5 /srv/mediawiki/php-1.34.0-wmf.8/includes/libs/rdbms/lbfactory/LBFactoryMulti.php(401): Closure$Wikimedia\Rdbms\LBFactory::forEachLBCallMethod(Wikimedia\Rdbms\LoadBalancer, string, array)
#6 /srv/mediawiki/php-1.34.0-wmf.8/includes/libs/rdbms/lbfactory/LBFactory.php(230): Wikimedia\Rdbms\LBFactoryMulti->forEachLB(Closure$Wikimedia\Rdbms\LBFactory::forEachLBCallMethod;2416, array)
#7 /srv/mediawiki/php-1.34.0-wmf.8/includes/libs/rdbms/lbfactory/LBFactory.php(282): Wikimedia\Rdbms\LBFactory->forEachLBCallMethod(string, array)
#8 /srv/mediawiki/php-1.34.0-wmf.8/includes/libs/rdbms/lbfactory/LBFactory.php(187): Wikimedia\Rdbms\LBFactory->commitMasterChanges(string)
#9 /srv/mediawiki/php-1.34.0-wmf.8/includes/libs/rdbms/lbfactory/LBFactory.php(160): Wikimedia\Rdbms\LBFactory->shutdown(integer)
#10 /srv/mediawiki/php-1.34.0-wmf.8/includes/libs/rdbms/lbfactory/LBFactory.php(744): Wikimedia\Rdbms\LBFactory->destroy()
#11 [internal function]: Wikimedia\Rdbms\LBFactory->__destruct()

.. and as side-effect:

message
PHP Warning:
Destructor threw an object exception: exception 'Wikimedia\Rdbms\DBTransactionError'

Wikimedia\Rdbms\DBTransactionError:
Wikimedia\Rdbms\LBFactory::shutdown: Commit failed on server(s) 10.64.16.79: DB connection was already closed.

/srv/mediawiki/php-1.34.0-wmf.8/includes/libs/rdbms/loadbalancer/LoadBalancer.php:1445
Stack trace:
#0 /srv/mediawiki/php-1.34.0-wmf.8/includes/libs/rdbms/lbfactory/LBFactory.php(227): Wikimedia\Rdbms\LoadBalancer->commitMasterChanges()
#1 /srv/mediawiki/php-1.34.0-wmf.8/includes/libs/rdbms/lbfactory/LBFactoryMulti.php(401): Closure$Wikimedia\Rdbms\LBFactory::forEachLBCallMethod()
#2 /srv/mediawiki/php-1.34.0-wmf.8/includes/libs/rdbms/lbfactory/LBFactory.php(230): Wikimedia\Rdbms\LBFactoryMulti->forEachLB()
#3 /srv/mediawiki/php-1.34.0-wmf.8/includes/libs/rdbms/lbfactory/LBFactory.php(282): Wikimedia\Rdbms\LBFactory->forEachLBCallMethod()
#4 /srv/mediawiki/php-1.34.0-wmf.8/includes/libs/rdbms/lbfactory/LBFactory.php(187): Wikimedia\Rdbms\LBFactory->commitMasterChanges()
#5 /srv/mediawiki/php-1.34.0-wmf.8/includes/libs/rdbms/lbfactory/LBFactory.php(160): Wikimedia\Rdbms\LBFactory->shutdown()
#6 /srv/mediawiki/php-1.34.0-wmf.8/includes/libs/rdbms/lbfactory/LBFactory.php(744): Wikimedia\Rdbms\LBFactory->destroy()
#7 (): Wikimedia\Rdbms\LBFactory->__destruct()

Impact

No user-visible impact. The issue happens during request shut-down and the error is recoverable. The implicit side-effect
that happens as a result of this error happens to be the correct behaviour. We tried to close a connection that was already closed, and don't mind that it was already closed.

Notes

In production, this has been seen in Logstash for at least 30 days since 1.34-wmf.4:

For labswiki, this used to never happen, but has a very significant spike as of May 25:

Details

Event Timeline

Krinkle created this task.Jun 5 2019, 2:37 PM
Restricted Application added a subscriber: Aklapper. · View Herald TranscriptJun 5 2019, 2:37 PM
Krinkle updated the task description. (Show Details)Jun 5 2019, 2:43 PM
Krinkle moved this task from Untriaged to Found longer ago on the Wikimedia-production-error board.
aaron claimed this task.Jun 6 2019, 11:40 AM
aaron triaged this task as Medium priority.
aaron moved this task from Inbox to Next In This Quarter / Jan-Mar 2020 on the Performance-Team board.

Seems related: rMW143333b172cd: rdbms: do not close the connection in LoadBalancerSingle::__destruct / https://gerrit.wikimedia.org/r/517350.

Not exactly the same though (commit is LoadBalancer, this bug is LBFactory), but might need a similar change?

aaron added a comment.Jun 22 2019, 5:56 PM

Seems related: rMW143333b172cd: rdbms: do not close the connection in LoadBalancerSingle::__destruct / https://gerrit.wikimedia.org/r/517350.
Not exactly the same though (commit is LoadBalancer, this bug is LBFactory), but might need a similar change?

That was an exceptional case of the connection being injected. Regular LBs should still close up on destruction.

Change 524320 had a related patch set uploaded (by Aaron Schulz; owner: Aaron Schulz):
[mediawiki/core@master] Clean up DeferredUpdates transaction handling

https://gerrit.wikimedia.org/r/524320

Change 524364 had a related patch set uploaded (by Krinkle; owner: Aaron Schulz):
[mediawiki/core@wmf/1.34.0-wmf.14] Clean up DeferredUpdates transaction handling

https://gerrit.wikimedia.org/r/524364

Change 524320 merged by jenkins-bot:
[mediawiki/core@master] Clean up DeferredUpdates transaction handling

https://gerrit.wikimedia.org/r/524320

Is this still a train blocker?

aaron added a comment.Jul 19 2019, 6:11 AM

Is this still a train blocker?

I assume not, unless some further problems emerge.

If this is not a train blocker, please remove T220739 from parent tasks. If it is a blocker, priority should be set to UBN.

Change 524364 abandoned by Krinkle:
Clean up DeferredUpdates transaction handling

Reason:
Will go out in the wmf.15 train

https://gerrit.wikimedia.org/r/524364

Restricted Application added a project: Core Platform Team. · View Herald TranscriptJul 23 2019, 5:45 PM

Change 529199 had a related patch set uploaded (by Aaron Schulz; owner: Aaron Schulz):
[mediawiki/core@master] rdbms: make LBFactory close/rollback dangling handles like LoadBalancer

https://gerrit.wikimedia.org/r/529199

Change 529199 merged by jenkins-bot:
[mediawiki/core@master] rdbms: make LBFactory close/rollback dangling handles like LoadBalancer

https://gerrit.wikimedia.org/r/529199

aaron closed this task as Resolved.Aug 25 2019, 1:18 AM
mmodell changed the subtype of this task from "Task" to "Production Error".Aug 28 2019, 11:06 PM