Page MenuHomePhabricator

Unexpected errors when ROLLBACK fails due to the DB server having "gone away"
Closed, ResolvedPublic

Description

TL;DR: performance expectation kills the transaction but does not clear up post-transaction callbacks; error handler trying to roll back the transaction aborts because the rollback function throws complaining about pending callbacks (potentially causing part of the error handling to be skipped). That seems pointless and probably shouldn't happen.

Seems like a query timeout:

Expectation (readQueryTime <= 30) by JobRunner::run not met (actual: 83.462260007858):
query-m: SELECT rev_timestamp FROM `revision` WHERE rev_user_text = 'X' ORDER BY rev_timestamp ASC  [TRX#a8aa24]
#0 /srv/mediawiki/php-1.31.0-wmf.22/includes/libs/rdbms/TransactionProfiler.php(223): Wikimedia\Rdbms\TransactionProfiler->reportExpectationViolated()
#1 /srv/mediawiki/php-1.31.0-wmf.22/includes/libs/rdbms/database/Database.php(1055): Wikimedia\Rdbms\TransactionProfiler->recordQueryCompletion()
#2 /srv/mediawiki/php-1.31.0-wmf.22/includes/libs/rdbms/database/Database.php(953): Wikimedia\Rdbms\Database->doProfiledQuery()
#3 /srv/mediawiki/php-1.31.0-wmf.22/includes/libs/rdbms/database/Database.php(1380): Wikimedia\Rdbms\Database->query()
#4 /srv/mediawiki/php-1.31.0-wmf.22/extensions/Renameuser/RenameuserSQL.php(262): Wikimedia\Rdbms\Database->select()
#5 /srv/mediawiki/php-1.31.0-wmf.22/extensions/CentralAuth/includes/LocalRenameJob/LocalRenameUserJob.php(86): RenameuserSQL->rename()
#6 /srv/mediawiki/php-1.31.0-wmf.22/extensions/CentralAuth/includes/LocalRenameJob/LocalRenameJob.php(63): LocalRenameUserJob->doRun()
#7 /srv/mediawiki/php-1.31.0-wmf.22/includes/jobqueue/JobRunner.php(294): LocalRenameJob->run()
#8 /srv/mediawiki/php-1.31.0-wmf.22/includes/jobqueue/JobRunner.php(193): JobRunner->executeJob()
#9 /srv/mediawiki/rpc/RunJobs.php(47): JobRunner->run()
#10 {main}

(log entry)
This fails the job, which calls rollbackMasterChanges, which dies with

RuntimeException from line 776 of /srv/mediawiki/php-1.31.0-wmf.22/includes/libs/rdbms/database/Database.php: Transaction callbacks still pending.
#0 /srv/mediawiki/php-1.31.0-wmf.22/includes/libs/rdbms/database/DatabaseMysqlBase.php(129): Wikimedia\Rdbms\Database->close()
#1 /srv/mediawiki/php-1.31.0-wmf.22/includes/libs/rdbms/database/Database.php(3257): Wikimedia\Rdbms\DatabaseMysqlBase->open(string, string, string, string)
#2 /srv/mediawiki/php-1.31.0-wmf.22/includes/libs/rdbms/database/Database.php(963): Wikimedia\Rdbms\Database->reconnect()
#3 /srv/mediawiki/php-1.31.0-wmf.22/includes/libs/rdbms/database/Database.php(3139): Wikimedia\Rdbms\Database->query(string, string, boolean)
#4 /srv/mediawiki/php-1.31.0-wmf.22/includes/libs/rdbms/database/Database.php(3105): Wikimedia\Rdbms\Database->doRollback(string)
#5 /srv/mediawiki/php-1.31.0-wmf.22/includes/libs/rdbms/loadbalancer/LoadBalancer.php(1376): Wikimedia\Rdbms\Database->rollback(string, string)
#6 [internal function]: Closure$Wikimedia\Rdbms\LoadBalancer::rollbackMasterChanges(Wikimedia\Rdbms\DatabaseMysqli)
#7 /srv/mediawiki/php-1.31.0-wmf.22/includes/libs/rdbms/loadbalancer/LoadBalancer.php(1591): call_user_func_array(Closure$Wikimedia\Rdbms\LoadBalancer::rollbackMasterChanges;4899, array)
#8 /srv/mediawiki/php-1.31.0-wmf.22/includes/libs/rdbms/loadbalancer/LoadBalancer.php(1382): Wikimedia\Rdbms\LoadBalancer->forEachOpenMasterConnection(Closure$Wikimedia\Rdbms\LoadBalancer::rollbackMasterChanges;4899)
#9 [internal function]: Wikimedia\Rdbms\LoadBalancer->rollbackMasterChanges(string)
#10 /srv/mediawiki/php-1.31.0-wmf.22/includes/libs/rdbms/lbfactory/LBFactory.php(184): call_user_func_array(array, array)
#11 [internal function]: Closure$Wikimedia\Rdbms\LBFactory::forEachLBCallMethod(Wikimedia\Rdbms\LoadBalancer, string, array)
#12 /srv/mediawiki/php-1.31.0-wmf.22/includes/libs/rdbms/lbfactory/LBFactoryMulti.php(425): call_user_func_array(Closure$Wikimedia\Rdbms\LBFactory::forEachLBCallMethod;4824, array)
#13 /srv/mediawiki/php-1.31.0-wmf.22/includes/libs/rdbms/lbfactory/LBFactory.php(187): Wikimedia\Rdbms\LBFactoryMulti->forEachLB(Closure$Wikimedia\Rdbms\LBFactory::forEachLBCallMethod;4824, array)
#14 /srv/mediawiki/php-1.31.0-wmf.22/includes/libs/rdbms/lbfactory/LBFactory.php(247): Wikimedia\Rdbms\LBFactory->forEachLBCallMethod(string, array)
#15 /srv/mediawiki/php-1.31.0-wmf.22/extensions/CentralAuth/includes/LocalRenameJob/LocalRenameJob.php(67): Wikimedia\Rdbms\LBFactory->rollbackMasterChanges(string)
#16 /srv/mediawiki/php-1.31.0-wmf.22/includes/jobqueue/JobRunner.php(294): LocalRenameJob->run()
#17 /srv/mediawiki/php-1.31.0-wmf.22/includes/jobqueue/JobRunner.php(193): JobRunner->executeJob(LocalRenameUserJob, Wikimedia\Rdbms\LBFactoryMulti, BufferingStatsdDataFactory, integer)
#18 /srv/mediawiki/rpc/RunJobs.php(47): JobRunner->run(array)
#19 {main}

(log event)
so an exception is thrown from the exception handler and the catch block cannot set the rename status to failed (which would allow for retries).
This is pretty broken on the core DB handling level. @aaron any idea how to fix it?

Event Timeline

Tgr created this task.Mar 4 2018, 8:51 PM
Restricted Application added a subscriber: Aklapper. · View Herald TranscriptMar 4 2018, 8:51 PM
aaron added a comment.Mar 8 2018, 8:29 PM

Note that the code making the "Expectation (readQueryTime <= 30) by JobRunner::run" logs does not roll anything back.

There is code in approveMasterChanges (triggered by LBFactory::commitMasterChanges) that might rollback.

Change 417452 had a related patch set uploaded (by Aaron Schulz; owner: Aaron Schulz):
[mediawiki/core@master] rdbms: remove exception from Database::close()

https://gerrit.wikimedia.org/r/417452

Change 417452 merged by jenkins-bot:
[mediawiki/core@master] rdbms: avoid throwing exceptions in Database::close() on reconnect

https://gerrit.wikimedia.org/r/417452

aaron renamed this task from Slow transaction killer breaks error handling to Unexpected errors when ROLLBACK fails due to the DB server having "gone away".Mar 14 2018, 8:07 AM
aaron closed this task as Resolved.
aaron claimed this task.