Page MenuHomePhabricator

Wikimedia\Rdbms\DBTransactionStateError: Cannot execute query ... while transaction status is ERROR (after PHP timeout)
Closed, ResolvedPublicPRODUCTION ERROR

Description

Error
normalized_message
[{reqId}] {exception_url}   Wikimedia\Rdbms\DBTransactionStateError: Cannot execute query from MediaWiki\Extension\AbuseFilter\Watcher\UpdateHitCountWatcher::updateHitCounts while transaction status is ERROR
A connection error occurred during a query. 
Query: SELECT  actor_user,actor_name,actor_id  FROM `actor`    WHERE actor_name = 'XXXXXX'  LIMIT 1  
Function: MediaWiki\User\ActorStore::findActorIdInternal
Error: 2006 MySQL server has gone away (db1138)
exception.trace
from /srv/mediawiki/php-1.38.0-wmf.4/includes/libs/rdbms/database/Database.php(1619)
#0 /srv/mediawiki/php-1.38.0-wmf.4/includes/libs/rdbms/database/Database.php(1320): Wikimedia\Rdbms\Database->assertQueryIsCurrentlyAllowed(string, string)
#1 /srv/mediawiki/php-1.38.0-wmf.4/includes/libs/rdbms/database/Database.php(2654): Wikimedia\Rdbms\Database->query(string, string, integer)
#2 /srv/mediawiki/php-1.38.0-wmf.4/includes/libs/rdbms/database/DBConnRef.php(68): Wikimedia\Rdbms\Database->update(string, array, string, string)
#3 /srv/mediawiki/php-1.38.0-wmf.4/includes/libs/rdbms/database/DBConnRef.php(386): Wikimedia\Rdbms\DBConnRef->__call(string, array)
#4 /srv/mediawiki/php-1.38.0-wmf.4/extensions/AbuseFilter/includes/Watcher/UpdateHitCountWatcher.php(60): Wikimedia\Rdbms\DBConnRef->update(string, array, array, string)
#5 /srv/mediawiki/php-1.38.0-wmf.4/extensions/AbuseFilter/includes/Watcher/UpdateHitCountWatcher.php(41): MediaWiki\Extension\AbuseFilter\Watcher\UpdateHitCountWatcher->updateHitCounts(Wikimedia\Rdbms\DBConnRef, array)
#6 /srv/mediawiki/php-1.38.0-wmf.4/includes/deferred/MWCallableUpdate.php(38): MediaWiki\Extension\AbuseFilter\Watcher\UpdateHitCountWatcher->MediaWiki\Extension\AbuseFilter\Watcher\{closure}()
#7 /srv/mediawiki/php-1.38.0-wmf.4/includes/deferred/DeferredUpdates.php(515): MWCallableUpdate->doUpdate()
#8 /srv/mediawiki/php-1.38.0-wmf.4/includes/deferred/DeferredUpdates.php(391): DeferredUpdates::attemptUpdate(MWCallableUpdate, Wikimedia\Rdbms\LBFactoryMulti)
#9 /srv/mediawiki/php-1.38.0-wmf.4/includes/deferred/DeferredUpdates.php(221): DeferredUpdates::run(MWCallableUpdate, Wikimedia\Rdbms\LBFactoryMulti, Monolog\Logger, BufferingStatsdDataFactory, string)
#10 /srv/mediawiki/php-1.38.0-wmf.4/includes/deferred/DeferredUpdatesScope.php(267): DeferredUpdates::{closure}(MWCallableUpdate, integer)
#11 /srv/mediawiki/php-1.38.0-wmf.4/includes/deferred/DeferredUpdatesScope.php(196): DeferredUpdatesScope->processStageQueue(integer, integer, Closure)
#12 /srv/mediawiki/php-1.38.0-wmf.4/includes/deferred/DeferredUpdates.php(242): DeferredUpdatesScope->processUpdates(integer, Closure)
#13 /srv/mediawiki/php-1.38.0-wmf.4/includes/MediaWiki.php(1136): DeferredUpdates::doUpdates()
#14 /srv/mediawiki/rpc/RunSingleJob.php(93): MediaWiki->restInPeace()
#15 {main}
exception.previous.trace
from /srv/mediawiki/php-1.38.0-wmf.4/includes/libs/rdbms/database/Database.php(1807)
#0 /srv/mediawiki/php-1.38.0-wmf.4/includes/libs/rdbms/database/Database.php(1423): Wikimedia\Rdbms\Database->getQueryException(string, integer, string, string)
#1 /srv/mediawiki/php-1.38.0-wmf.4/includes/libs/rdbms/database/Database.php(1323): Wikimedia\Rdbms\Database->executeQuery(string, string, integer)
#2 /srv/mediawiki/php-1.38.0-wmf.4/includes/libs/rdbms/database/Database.php(2012): Wikimedia\Rdbms\Database->query(string, string, integer)
#3 /srv/mediawiki/php-1.38.0-wmf.4/includes/libs/rdbms/database/Database.php(2112): Wikimedia\Rdbms\Database->select(string, array, array, string, array, array)
#4 /srv/mediawiki/php-1.38.0-wmf.4/includes/libs/rdbms/database/DBConnRef.php(68): Wikimedia\Rdbms\Database->selectRow(string, array, array, string, array)
#5 /srv/mediawiki/php-1.38.0-wmf.4/includes/libs/rdbms/database/DBConnRef.php(342): Wikimedia\Rdbms\DBConnRef->__call(string, array)
#6 /srv/mediawiki/php-1.38.0-wmf.4/includes/user/ActorStore.php(381): Wikimedia\Rdbms\DBConnRef->selectRow(string, array, array, string, array)
#7 /srv/mediawiki/php-1.38.0-wmf.4/includes/user/ActorStore.php(412): MediaWiki\User\ActorStore->findActorIdInternal(string, Wikimedia\Rdbms\MaintainableDBConnRef)
#8 /srv/mediawiki/php-1.38.0-wmf.4/includes/filerepo/file/LocalFile.php(1899): MediaWiki\User\ActorStore->acquireActorId(User, Wikimedia\Rdbms\MaintainableDBConnRef)
#9 /srv/mediawiki/php-1.38.0-wmf.4/includes/filerepo/file/LocalFile.php(1829): LocalFile->recordUpload3(string, string, string, User, array, string, array, boolean, boolean)
#10 /srv/mediawiki/php-1.38.0-wmf.4/includes/upload/UploadBase.php(955): LocalFile->upload(string, string, string, integer, array, boolean, User, array)
#11 /srv/mediawiki/php-1.38.0-wmf.4/includes/jobqueue/jobs/PublishStashedFileJob.php(88): UploadBase->performUpload(string, string, boolean, User, array, NULL)
#12 /srv/mediawiki/php-1.38.0-wmf.4/extensions/EventBus/includes/JobExecutor.php(79): PublishStashedFileJob->run()
#13 /srv/mediawiki/rpc/RunSingleJob.php(76): MediaWiki\Extension\EventBus\JobExecutor->execute(array)
#14 {main}
Impact
Notes

A single occurrence in production. I don't know whether "MySQL server has gone away" error must be handled there or whether the deferred updates should be retried after X time.

Event Timeline

Daimona added a subscriber: Daimona.

This looks like a classic MySQL timeout, which could be caused by several things, and happens to have AF in its stack trace. I'm adding a couple of tags hoping that somebody more familiar with the related areas has some ideas about this, but if it's just a timeout this can probably be closed.

Krinkle moved this task from Inbox to Doing: Prio Interrupt on the Performance-Team board.
Krinkle added a subscriber: Krinkle.

Seems like the timeout should not leave the transaction in an error state. Might be a state bug in rdbms, or rdbms being called incorrectly by ActorStore or DeferredUpdates.

Krinkle renamed this task from Wikimedia\Rdbms\DBTransactionStateError: Cannot execute query from MediaWiki\Extension\AbuseFilter\Watcher\UpdateHitCountWatcher::updateHitCounts while transaction status is ERROR to Wikimedia\Rdbms\DBTransactionStateError: Cannot execute query ... while transaction status is ERROR (after PHP timeout).Nov 22 2021, 7:57 PM

Change 737824 had a related patch set uploaded (by Aaron Schulz; author: Aaron Schulz):

[mediawiki/core@master] [WIP] rdbms: make automatic connection recovery apply to more cases

https://gerrit.wikimedia.org/r/737824

aaron triaged this task as Medium priority.Jan 7 2022, 1:22 AM

Change 737824 merged by jenkins-bot:

[mediawiki/core@master] rdbms: make automatic connection recovery apply to more cases

https://gerrit.wikimedia.org/r/737824

Change 771755 had a related patch set uploaded (by Aaron Schulz; author: Aaron Schulz):

[mediawiki/core@master] rdbms: add mysql timeout error codes to wasKnownStatementRollbackError()

https://gerrit.wikimedia.org/r/771755

Change 771755 merged by jenkins-bot:

[mediawiki/core@master] rdbms: add mysql timeout error codes to wasKnownStatementRollbackError()

https://gerrit.wikimedia.org/r/771755

Change 773352 had a related patch set uploaded (by Aaron Schulz; author: Aaron Schulz):

[mediawiki/core@master] rdbms: directly use executeQuery() in mysql open() method

https://gerrit.wikimedia.org/r/773352

Change 773352 merged by jenkins-bot:

[mediawiki/core@master] rdbms: directly use executeQuery() in mysql open() method

https://gerrit.wikimedia.org/r/773352

Logstash mediawiki-warnings query:

message:"while transaction status is ERROR" AND "MySQL server has gone away"

Still seen, but not since March 27. Will check again next week when wmf.6 has been on all wikis for a bit.