Page MenuHomePhabricator

Fail-safe exception from ReplicaMasterAwareRecordIdsAcquirer via Wikibase DataUpdateAdapter::doUpdate
Closed, ResolvedPublicPRODUCTION ERROR

Description

Error

MediaWiki version: 1.35.0-wmf.23

message
PHP Warning: [data-update-failed]: A data update callback triggered an exception (Fail-safe exception. Avoiding infinite loop due to possibily undetectable existing records in master.
 It may be due to encoding incompatibility between database values and values passed in $neededRecords parameter.) [Called from Wikibase\Repo\Content\DataUpdateAdapter::doUpdate in /srv/mediawiki/php-1.35.0-wmf.23/extensions/Wikibase/repo/includes/Content/DataUpdateAdapter.php at line 62]

Impact

Notes

Details

Request ID
XmqKEQpAMNoAAhz8BDAAAAAW
Request URL
https://www.wikidata.org/w/api.php
Stack Trace
exception.trace
#0 [internal function]: MWExceptionHandler::handleError(integer, string, string, integer, array)
#1 /srv/mediawiki/php-1.35.0-wmf.23/includes/debug/MWDebug.php(333): trigger_error(string, integer)
#2 /srv/mediawiki/php-1.35.0-wmf.23/includes/debug/MWDebug.php(188): MWDebug::sendMessage(string, array, string, integer)
#3 /srv/mediawiki/php-1.35.0-wmf.23/includes/GlobalFunctions.php(1068): MWDebug::warning(string, integer, integer, string)
#4 /srv/mediawiki/php-1.35.0-wmf.23/extensions/Wikibase/lib/includes/Reporting/LogWarningExceptionHandler.php(28): wfLogWarning(string, integer)
#5 /srv/mediawiki/php-1.35.0-wmf.23/extensions/Wikibase/repo/includes/Content/DataUpdateAdapter.php(62): Wikibase\Lib\Reporting\LogWarningExceptionHandler->handleException(Exception, string, string)
#6 /srv/mediawiki/php-1.35.0-wmf.23/includes/deferred/DeferredUpdates.php(417): Wikibase\Repo\Content\DataUpdateAdapter->doUpdate()
#7 /srv/mediawiki/php-1.35.0-wmf.23/includes/Storage/DerivedPageDataUpdater.php(1648): DeferredUpdates::attemptUpdate(Wikibase\Repo\Content\DataUpdateAdapter, Wikimedia\Rdbms\LBFactoryMulti)
#8 /srv/mediawiki/php-1.35.0-wmf.23/includes/Storage/DerivedPageDataUpdater.php(1458): MediaWiki\Storage\DerivedPageDataUpdater->doSecondaryDataUpdates(array)
#9 /srv/mediawiki/php-1.35.0-wmf.23/includes/deferred/MWCallableUpdate.php(38): MediaWiki\Storage\DerivedPageDataUpdater->MediaWiki\Storage\{closure}()
#10 /srv/mediawiki/php-1.35.0-wmf.23/includes/deferred/DeferredUpdates.php(417): MWCallableUpdate->doUpdate()
#11 /srv/mediawiki/php-1.35.0-wmf.23/includes/deferred/DeferredUpdates.php(296): DeferredUpdates::attemptUpdate(MWCallableUpdate, Wikimedia\Rdbms\LBFactoryMulti)
#12 /srv/mediawiki/php-1.35.0-wmf.23/includes/deferred/DeferredUpdates.php(233): DeferredUpdates::run(MWCallableUpdate, Wikimedia\Rdbms\LBFactoryMulti, Monolog\Logger, BufferingStatsdDataFactory, string)
#13 /srv/mediawiki/php-1.35.0-wmf.23/includes/deferred/DeferredUpdates.php(150): DeferredUpdates::handleUpdateQueue(array, string, integer)
#14 /srv/mediawiki/php-1.35.0-wmf.23/includes/MediaWiki.php(1058): DeferredUpdates::doUpdates(string)
#15 /srv/mediawiki/php-1.35.0-wmf.23/includes/MediaWiki.php(838): MediaWiki->restInPeace()
#16 /srv/mediawiki/php-1.35.0-wmf.23/includes/MediaWiki.php(847): MediaWiki->{closure}()
#17 /srv/mediawiki/php-1.35.0-wmf.23/api.php(119): MediaWiki->doPostOutputShutdown()
#18 /srv/mediawiki/w/api.php(3): require(string)
#19 {main}

Event Timeline

Restricted Application added a subscriber: Liuxinyu970226. · View Herald Transcript

Currently under discussion in #wikimedia-operations, seems likely a fix is inbound.

On IRC it's suggested that this is a dupe of T246898, made worse by the config change.

This appears to have been mitigated by the above patch. Unclear whether this task should remain open, but removing as train blocker.

WMDE-leszek lowered the priority of this task from Unbreak Now! to High.Mar 16 2020, 10:51 AM
WMDE-leszek added a subscriber: WMDE-leszek.

Lowering the priority given this is not blocking a train any more.

Change 580302 had a related patch set uploaded (by Ladsgroup; owner: Ladsgroup):
[mediawiki/extensions/Wikibase@master] Store item terms at late as possible to avoid deadlocks

https://gerrit.wikimedia.org/r/580302

Change 580328 had a related patch set uploaded (by Ladsgroup; owner: Ladsgroup):
[mediawiki/extensions/Wikibase@wmf/1.35.0-wmf.23] Store item terms at late as possible to avoid deadlocks

https://gerrit.wikimedia.org/r/580328

Change 580302 merged by jenkins-bot:
[mediawiki/extensions/Wikibase@master] Store item terms at late as possible to avoid deadlocks

https://gerrit.wikimedia.org/r/580302

Change 580328 merged by jenkins-bot:
[mediawiki/extensions/Wikibase@wmf/1.35.0-wmf.23] Store item terms at late as possible to avoid deadlocks

https://gerrit.wikimedia.org/r/580328

Mentioned in SAL (#wikimedia-operations) [2020-03-17T14:14:34Z] <ladsgroup@deploy1001> Synchronized php-1.35.0-wmf.23/extensions/Wikibase/lib/includes/Store/Sql/Terms/DatabaseItemTermStoreWriter.php: [[gerrit:580328|Store item terms at late as possible to avoid deadlocks (T247553 T246898)]] (duration: 01m 07s)

Change 580342 had a related patch set uploaded (by Ladsgroup; owner: Ladsgroup):
[mediawiki/extensions/Wikibase@master] Do not lock rows when there's no term returned

https://gerrit.wikimedia.org/r/580342

Change 580352 had a related patch set uploaded (by Ladsgroup; owner: Ladsgroup):
[mediawiki/extensions/Wikibase@wmf/1.35.0-wmf.23] Do not lock rows when there's no term returned

https://gerrit.wikimedia.org/r/580352

Change 580342 merged by jenkins-bot:
[mediawiki/extensions/Wikibase@master] Do not lock rows when there's no term returned

https://gerrit.wikimedia.org/r/580342

Change 580390 had a related patch set uploaded (by Ladsgroup; owner: Ladsgroup):
[mediawiki/extensions/Wikibase@wmf/1.35.0-wmf.24] Do not lock rows when there's no term returned

https://gerrit.wikimedia.org/r/580390

Change 580352 merged by jenkins-bot:
[mediawiki/extensions/Wikibase@wmf/1.35.0-wmf.23] Do not lock rows when there's no term returned

https://gerrit.wikimedia.org/r/580352

Mentioned in SAL (#wikimedia-operations) [2020-03-17T17:46:05Z] <ladsgroup@deploy1001> Synchronized php-1.35.0-wmf.23/extensions/Wikibase/lib/includes/Store/Sql/Terms/DatabaseItemTermStoreWriter.php: [[gerrit:580352|Do not lock rows when there's no term returned (T247553 T246898)]] (duration: 01m 07s)

Change 580390 merged by jenkins-bot:
[mediawiki/extensions/Wikibase@wmf/1.35.0-wmf.24] Do not lock rows when there's no term returned

https://gerrit.wikimedia.org/r/580390

Mentioned in SAL (#wikimedia-operations) [2020-03-17T18:53:11Z] <ladsgroup@deploy1001> Synchronized php-1.35.0-wmf.24/extensions/Wikibase/lib/includes/Store/Sql/Terms/DatabaseItemTermStoreWriter.php: [[gerrit:580390|Do not lock rows when there's no term returned (T247553 T246898)]], To catch the train (duration: 01m 08s)

Krinkle renamed this task from PHP Warning: [data-update-failed]: A data update callback triggered an exception (Fail-safe exception. Avoiding infinite loop due to possibily undetectable existing records in master. It may be due to encoding incompatibility between database values and values passed in $neededRecords parameter.) [Called from Wikibase\Repo\Content\DataUpdateAdapter::doUpdate in /srv/mediawiki/php-1.35.0-wmf.23/extensions/Wikibase/repo/includes/Content/DataUpdateAdapter.php at line 62] to Fail-safe exception from ReplicaMasterAwareRecordIdsAcquirer via Wikibase DataUpdateAdapter::doUpdate.Mar 18 2020, 3:45 PM
Krinkle moved this task from Untriaged to Mar2020/1.35-wmf.22 on the Wikimedia-production-error board.

This is actually two different errors. I make another one for the new one.

I think I found out what's happening here. I tracked down one case of it happening which is this edit: https://www.wikidata.org/w/index.php?diff=1138568411 causing this: https://logstash.wikimedia.org/app/kibana#/doc/logstash-*/logstash-mediawiki-2020.03.19/mediawiki/?id=AXDzmOx6KWrIH1QRGQJO

Why? This edit happening almost at the same time: https://www.wikidata.org/w/index.php?diff=1138568386

Both edits introduce a new text ("Biblioteca Pública Municipal no município de Muquém de São Francisco") for the term store at the same time and the code tries to read it, it can't (because it doesn't exist yet) then it tries to insert it, it can't (because the other thread already put it there) then it tries to read it again (because it couldn't insert, right?) but it can't find it which seems like a very logical thing to do to throw a logic error (narrator: it's not).

The reason that the second read can't read the ghost row is MySQL's default locking level: "REPEATABLE_READ". I guess that's explanatory enough. when you read from master from one transaction, MySQL gives you the old version due to MVCC but it doesn't let you insert it either.

I don't know what would be the right solution right now but I will find it.

Change 582159 had a related patch set uploaded (by Ladsgroup; owner: Ladsgroup):
[mediawiki/extensions/Wikibase@master] wbterms: When reaching an edge case, try reading master with CONN_TRX_AUTOCOMMIT

https://gerrit.wikimedia.org/r/582159

Change 582159 merged by jenkins-bot:
[mediawiki/extensions/Wikibase@master] wbterms: When reaching an edge case, try reading master with CONN_TRX_AUTOCOMMIT

https://gerrit.wikimedia.org/r/582159