Page MenuHomePhabricator

Increase in "Main slot of revision [number] not found in database!" after deploy of 1.35.0-wmf.31 to all wikis
Closed, ResolvedPublicPRODUCTION ERROR

Description

Error

MediaWiki version: 1.35.0-wmf.30

message
Main slot of revision 955427845 not found in database!

Impact

Instances of this error increased dramatically after rolling the train to group2, to ~30-40 a minute:

Screenshot-2020-05-07-13:14:15.png (196×601 px, 12 KB)

I'm unclear on user-facing impact.

Notes

There's been a trickle of these; I was assuming they fell under an existing bug, probably T212428, but I now suspect I was mistaken.

Uptick caused by a9b073fc6214a03e41262aa605537525b3963cf8 - failures were previously ignored within Revision::getContent (just returned null if encountering a RevisionAccessException)

Details

Request ID
XrReMApAICsAAKJ4DwUAAAAK
Request URL
https://jobrunner.discovery.wmnet/rpc/RunSingleJob.php
Stack Trace
exception.trace
#0 /srv/mediawiki/php-1.35.0-wmf.30/includes/Revision/RevisionStore.php(1188): MediaWiki\Revision\RevisionStore->constructSlotRecords(string, Wikimedia\Rdbms\ResultWrapper, integer, Title)
#1 /srv/mediawiki/php-1.35.0-wmf.30/includes/Revision/RevisionStore.php(1290): MediaWiki\Revision\RevisionStore->loadSlotRecords(string, integer, Title)
#2 [internal function]: MediaWiki\Revision\RevisionStore->MediaWiki\Revision\{closure}()
#3 /srv/mediawiki/php-1.35.0-wmf.30/includes/Revision/RevisionSlots.php(171): call_user_func(Closure)
#4 /srv/mediawiki/php-1.35.0-wmf.30/includes/Revision/RevisionRenderer.php(209): MediaWiki\Revision\RevisionSlots->getSlots()
#5 /srv/mediawiki/php-1.35.0-wmf.30/includes/Revision/RevisionRenderer.php(152): MediaWiki\Revision\RevisionRenderer->combineSlotOutput(MediaWiki\Revision\RenderedRevision, array)
#6 [internal function]: MediaWiki\Revision\RevisionRenderer->MediaWiki\Revision\{closure}(MediaWiki\Revision\RenderedRevision, array)
#7 /srv/mediawiki/php-1.35.0-wmf.30/includes/Revision/RenderedRevision.php(198): call_user_func(Closure, MediaWiki\Revision\RenderedRevision, array)
#8 /srv/mediawiki/php-1.35.0-wmf.30/includes/jobqueue/jobs/CategoryMembershipChangeJob.php(274): MediaWiki\Revision\RenderedRevision->getRevisionParserOutput()
#9 /srv/mediawiki/php-1.35.0-wmf.30/includes/jobqueue/jobs/CategoryMembershipChangeJob.php(248): CategoryMembershipChangeJob->getCategoriesAtRev(WikiPage, MediaWiki\Revision\RevisionStoreRecord, string)
#10 /srv/mediawiki/php-1.35.0-wmf.30/includes/jobqueue/jobs/CategoryMembershipChangeJob.php(205): CategoryMembershipChangeJob->getExplicitCategoriesChanges(WikiPage, MediaWiki\Revision\RevisionStoreRecord, MediaWiki\Revision\RevisionStoreRecord)
#11 /srv/mediawiki/php-1.35.0-wmf.30/includes/jobqueue/jobs/CategoryMembershipChangeJob.php(169): CategoryMembershipChangeJob->notifyUpdatesForRevision(Wikimedia\Rdbms\LBFactoryMulti, WikiPage, MediaWiki\Revision\RevisionStoreRecord)
#12 /srv/mediawiki/php-1.35.0-wmf.30/extensions/EventBus/includes/JobExecutor.php(80): CategoryMembershipChangeJob->run()
#13 /srv/mediawiki/rpc/RunSingleJob.php(76): MediaWiki\Extension\EventBus\JobExecutor->execute(array)
#14 {main}

Event Timeline

brennen triaged this task as Unbreak Now! priority.May 7 2020, 7:24 PM
brennen added a project: User-brennen.
brennen added a subscriber: Pchelolo.

Marking as a train blocker. Also cc: @Pchelolo.

@brennen I don't think I caused this one:

Thanks for the rapid investigation. After skimming comments on T212428 again, maybe this does just duplicate that bug but something's driving the error rate higher. (Maybe any artifact of the deploy? Didn't seem to be slowing down, however.) Curious if @daniel or @BPirkle have any thoughts on that.

The stack trace given is from .30, as is the error version given - is anything different in .31?

I think this is just more of T212428, yes; also seeing on wmf.31.

The stack trace given is from .30, as is the error version given - is anything different in .31?

Gah. Good catch. Here's one from .31:

URL: https://ru.wikipedia.org/w/index.php?title=I_Fight_Dragons&action=submit

reqId: e1eab171-11be-4125-a04f-2cce60f8ad26

trace
[e1eab171-11be-4125-a04f-2cce60f8ad26] /w/index.php?title=I_Fight_Dragons&action=submit   MediaWiki\Revision\RevisionAccessException from line 1251 of /srv/mediawiki/php-1.35.0-wmf.31/includes/Revision/RevisionStore.php: Main slot of revision 106882090 not found in database!
#0 /srv/mediawiki/php-1.35.0-wmf.31/includes/Revision/RevisionStore.php(1188): MediaWiki\Revision\RevisionStore->constructSlotRecords(string, Wikimedia\Rdbms\ResultWrapper, integer, Title)
#1 /srv/mediawiki/php-1.35.0-wmf.31/includes/Revision/RevisionStore.php(1290): MediaWiki\Revision\RevisionStore->loadSlotRecords(string, integer, Title)
#2 [internal function]: MediaWiki\Revision\RevisionStore->MediaWiki\Revision\{closure}()
#3 /srv/mediawiki/php-1.35.0-wmf.31/includes/Revision/RevisionSlots.php(171): call_user_func(Closure)
#4 /srv/mediawiki/php-1.35.0-wmf.31/includes/Revision/RevisionSlots.php(113): MediaWiki\Revision\RevisionSlots->getSlots()
#5 /srv/mediawiki/php-1.35.0-wmf.31/includes/Revision/RevisionRecord.php(192): MediaWiki\Revision\RevisionSlots->getSlot(string)
#6 /srv/mediawiki/php-1.35.0-wmf.31/includes/Revision/RevisionRecord.php(175): MediaWiki\Revision\RevisionRecord->getSlot(string, integer, NULL)
#7 /srv/mediawiki/php-1.35.0-wmf.31/extensions/Scribunto/includes/engines/LuaCommon/TitleLibrary.php(299): MediaWiki\Revision\RevisionRecord->getContent(string)
#8 /srv/mediawiki/php-1.35.0-wmf.31/extensions/Scribunto/includes/engines/LuaCommon/TitleLibrary.php(310): Scribunto_LuaTitleLibrary->getContentInternal(string)
#9 /srv/mediawiki/php-1.35.0-wmf.31/extensions/Scribunto/includes/engines/LuaSandbox/LuaSandboxCallback.php(26): Scribunto_LuaTitleLibrary->getContent(string)
#10 [internal function]: Scribunto_LuaSandboxCallback->__call(string, array)
#11 /srv/mediawiki/php-1.35.0-wmf.31/extensions/Scribunto/includes/engines/LuaSandbox/LuaSandboxInterpreter.php(113): LuaSandboxFunction->call(LuaSandboxFunction)
#12 /srv/mediawiki/php-1.35.0-wmf.31/extensions/Scribunto/includes/engines/LuaCommon/LuaEngine.php(291): Scribunto_LuaSandboxInterpreter->callFunction(LuaSandboxFunction, LuaSandboxFunction)
#13 /srv/mediawiki/php-1.35.0-wmf.31/extensions/Scribunto/includes/engines/LuaCommon/LuaModule.php(68): Scribunto_LuaEngine->executeFunctionChunk(LuaSandboxFunction, PPTemplateFrame_Hash)
#14 /srv/mediawiki/php-1.35.0-wmf.31/extensions/Scribunto/includes/common/Hooks.php(128): Scribunto_LuaModule->invoke(string, PPTemplateFrame_Hash)
#15 /srv/mediawiki/php-1.35.0-wmf.31/includes/parser/Parser.php(3322): ScribuntoHooks::invokeHook(Parser, PPTemplateFrame_Hash, array)
#16 /srv/mediawiki/php-1.35.0-wmf.31/includes/parser/Parser.php(3026): Parser->callParserFunction(PPTemplateFrame_Hash, string, array)
#17 /srv/mediawiki/php-1.35.0-wmf.31/includes/parser/PPFrame_Hash.php(253): Parser->braceSubstitution(array, PPTemplateFrame_Hash)
#18 /srv/mediawiki/php-1.35.0-wmf.31/includes/parser/Parser.php(3204): PPFrame_Hash->expand(PPNode_Hash_Tree)
#19 /srv/mediawiki/php-1.35.0-wmf.31/includes/parser/PPFrame_Hash.php(253): Parser->braceSubstitution(array, PPTemplateFrame_Hash)
#20 /srv/mediawiki/php-1.35.0-wmf.31/includes/parser/PPTemplateFrame_Hash.php(152): PPFrame_Hash->expand(PPNode_Hash_Tree, integer)
#21 /srv/mediawiki/php-1.35.0-wmf.31/includes/parser/PPTemplateFrame_Hash.php(179): PPTemplateFrame_Hash->getNumberedArgument(string)
#22 /srv/mediawiki/php-1.35.0-wmf.31/includes/parser/Parser.php(3775): PPTemplateFrame_Hash->getArgument(string)
#23 /srv/mediawiki/php-1.35.0-wmf.31/includes/parser/PPFrame_Hash.php(270): Parser->argSubstitution(array, PPTemplateFrame_Hash)
#24 /srv/mediawiki/php-1.35.0-wmf.31/includes/parser/Parser.php(2943): PPFrame_Hash->expand(PPNode_Hash_Tree)
#25 /srv/mediawiki/php-1.35.0-wmf.31/includes/parser/PPFrame_Hash.php(253): Parser->braceSubstitution(array, PPTemplateFrame_Hash)
#26 /srv/mediawiki/php-1.35.0-wmf.31/extensions/ParserFunctions/includes/ParserFunctions.php(122): PPFrame_Hash->expand(PPNode_Hash_Tree)
#27 /srv/mediawiki/php-1.35.0-wmf.31/includes/parser/Parser.php(3322): MediaWiki\Extensions\ParserFunctions\ParserFunctions::if(Parser, PPTemplateFrame_Hash, array)
#28 /srv/mediawiki/php-1.35.0-wmf.31/includes/parser/Parser.php(3026): Parser->callParserFunction(PPTemplateFrame_Hash, string, array)
#29 /srv/mediawiki/php-1.35.0-wmf.31/includes/parser/PPFrame_Hash.php(253): Parser->braceSubstitution(array, PPTemplateFrame_Hash)
#30 /srv/mediawiki/php-1.35.0-wmf.31/includes/parser/Parser.php(3204): PPFrame_Hash->expand(PPNode_Hash_Tree)
#31 /srv/mediawiki/php-1.35.0-wmf.31/includes/parser/PPFrame_Hash.php(253): Parser->braceSubstitution(array, PPTemplateFrame_Hash)
#32 /srv/mediawiki/php-1.35.0-wmf.31/extensions/ParserFunctions/includes/ParserFunctions.php(122): PPFrame_Hash->expand(PPNode_Hash_Tree)
#33 /srv/mediawiki/php-1.35.0-wmf.31/includes/parser/Parser.php(3322): MediaWiki\Extensions\ParserFunctions\ParserFunctions::if(Parser, PPTemplateFrame_Hash, array)
#34 /srv/mediawiki/php-1.35.0-wmf.31/includes/parser/Parser.php(3026): Parser->callParserFunction(PPTemplateFrame_Hash, string, array)
#35 /srv/mediawiki/php-1.35.0-wmf.31/includes/parser/PPFrame_Hash.php(253): Parser->braceSubstitution(array, PPTemplateFrame_Hash)
#36 /srv/mediawiki/php-1.35.0-wmf.31/includes/parser/Parser.php(3204): PPFrame_Hash->expand(PPNode_Hash_Tree)
#37 /srv/mediawiki/php-1.35.0-wmf.31/includes/parser/PPFrame_Hash.php(253): Parser->braceSubstitution(array, PPFrame_Hash)
#38 /srv/mediawiki/php-1.35.0-wmf.31/includes/parser/Parser.php(2867): PPFrame_Hash->expand(PPNode_Hash_Tree, integer)
#39 /srv/mediawiki/php-1.35.0-wmf.31/includes/parser/Parser.php(1553): Parser->replaceVariables(string)
#40 /srv/mediawiki/php-1.35.0-wmf.31/includes/parser/Parser.php(640): Parser->internalParse(string)
#41 /srv/mediawiki/php-1.35.0-wmf.31/includes/content/WikitextContent.php(368): Parser->parse(string, Title, ParserOptions, boolean, boolean, NULL)
#42 /srv/mediawiki/php-1.35.0-wmf.31/includes/content/AbstractContent.php(565): WikitextContent->fillParserOutput(Title, NULL, ParserOptions, boolean, ParserOutput)
#43 /srv/mediawiki/php-1.35.0-wmf.31/extensions/PageImages/includes/Hooks/LinksUpdateHookHandler.php(77): AbstractContent->getParserOutput(Title)
#44 /srv/mediawiki/php-1.35.0-wmf.31/extensions/PageImages/includes/Hooks/LinksUpdateHookHandler.php(92): PageImages\Hooks\LinksUpdateHookHandler->getPageImageCandidates(LinksUpdate)
#45 /srv/mediawiki/php-1.35.0-wmf.31/extensions/PageImages/includes/Hooks/LinksUpdateHookHandler.php(35): PageImages\Hooks\LinksUpdateHookHandler->doLinksUpdate(LinksUpdate)
#46 /srv/mediawiki/php-1.35.0-wmf.31/includes/HookContainer/HookContainer.php(289): PageImages\Hooks\LinksUpdateHookHandler::onLinksUpdate(LinksUpdate)
#47 /srv/mediawiki/php-1.35.0-wmf.31/includes/HookContainer/HookContainer.php(129): MediaWiki\HookContainer\HookContainer->callLegacyHook(string, array, array, array)
#48 /srv/mediawiki/php-1.35.0-wmf.31/includes/Hooks.php(136): MediaWiki\HookContainer\HookContainer->run(string, array, array)
#49 /srv/mediawiki/php-1.35.0-wmf.31/includes/deferred/LinksUpdate.php(194): Hooks::run(string, array)
#50 /srv/mediawiki/php-1.35.0-wmf.31/includes/deferred/DeferredUpdates.php(417): LinksUpdate->doUpdate()
#51 /srv/mediawiki/php-1.35.0-wmf.31/includes/deferred/RefreshSecondaryDataUpdate.php(101): DeferredUpdates::attemptUpdate(LinksUpdate, Wikimedia\Rdbms\LBFactoryMulti)
#52 /srv/mediawiki/php-1.35.0-wmf.31/includes/deferred/DeferredUpdates.php(417): RefreshSecondaryDataUpdate->doUpdate()
#53 /srv/mediawiki/php-1.35.0-wmf.31/includes/deferred/DeferredUpdates.php(296): DeferredUpdates::attemptUpdate(RefreshSecondaryDataUpdate, Wikimedia\Rdbms\LBFactoryMulti)
#54 /srv/mediawiki/php-1.35.0-wmf.31/includes/deferred/DeferredUpdates.php(233): DeferredUpdates::run(RefreshSecondaryDataUpdate, Wikimedia\Rdbms\LBFactoryMulti, Monolog\Logger, BufferingStatsdDataFactory, string)
#55 /srv/mediawiki/php-1.35.0-wmf.31/includes/deferred/DeferredUpdates.php(150): DeferredUpdates::handleUpdateQueue(array, string, integer)
#56 /srv/mediawiki/php-1.35.0-wmf.31/includes/MediaWiki.php(1059): DeferredUpdates::doUpdates(string)
#57 /srv/mediawiki/php-1.35.0-wmf.31/includes/MediaWiki.php(839): MediaWiki->restInPeace()
#58 /srv/mediawiki/php-1.35.0-wmf.31/includes/MediaWiki.php(848): MediaWiki->{closure}()
#59 /srv/mediawiki/php-1.35.0-wmf.31/includes/MediaWiki.php(574): MediaWiki->doPostOutputShutdown()
#60 /srv/mediawiki/php-1.35.0-wmf.31/index.php(47): MediaWiki->run()
#61 /srv/mediawiki/w/index.php(3): require(string)
#62 {main}

This seems to be the code path for most (maybe all?) of the .31 instances.

That makes more sense. This one's to blame: https://gerrit.wikimedia.org/r/c/mediawiki/extensions/Scribunto/+/589699

We were using Revision::getContent before, using RevisionRecord::getContent now. This is a very simple substitution, with one important caveat: Revision was surrounding RevisionRecord::getContent with try/catch and was ignoring the exception we're seeing now.

The problem we are starting to see now existed before, it was just silently ignored. I have found the revision mentioned in the example trace in the database, and it exist, and we *should* be able to get it's content.

As an immediate fix I propose to return the try/catch back, but instead of silently ignoring the error, log it for further investigation, so once this is done, we can close this task and open a new one for more investigation of the root cause.

Change 595049 had a related patch set uploaded (by DannyS712; owner: DannyS712):
[mediawiki/extensions/Scribunto@master] Handle RevisionAccessException with try-catch

https://gerrit.wikimedia.org/r/595049

Change 595054 had a related patch set uploaded (by DannyS712; owner: DannyS712):
[mediawiki/extensions/Scribunto@wmf/1.35.0-wmf.31] Handle RevisionAccessException with try-catch

https://gerrit.wikimedia.org/r/595054

Change 595049 merged by jenkins-bot:
[mediawiki/extensions/Scribunto@master] Handle RevisionAccessException with try-catch

https://gerrit.wikimedia.org/r/595049

Change 595054 merged by jenkins-bot:
[mediawiki/extensions/Scribunto@wmf/1.35.0-wmf.31] Handle RevisionAccessException with try-catch

https://gerrit.wikimedia.org/r/595054

The problem we are starting to see now existed before, it was just silently ignored. I have found the revision mentioned in the example trace in the database, and it exist, and we *should* be able to get it's content.

This is typically due to a race condition. When loading revision content right after the revision has been created, the READ_LATEST must be passed.

I'm starting to wonder whether RevisionStore should just keep a list of the revisions it just created, and just force any read for them to go through master.

Mentioned in SAL (#wikimedia-operations) [2020-05-07T22:31:19Z] <brennen@deploy1001> Synchronized php-1.35.0-wmf.31/extensions/Scribunto/includes/engines/LuaCommon/TitleLibrary.php: [[gerrit:595054|Handle RevisionAccessException with try-catch (T252156)]] (duration: 01m 08s)

brennen lowered the priority of this task from Unbreak Now! to Needs Triage.May 7 2020, 11:09 PM

Removing as a train blocker, since it seems fixed from that perspective.

Change 595147 had a related patch set uploaded (by Daniel Kinzler; owner: Daniel Kinzler):
[mediawiki/core@master] RevisionStore: fall back to master db if main slot is missing.

https://gerrit.wikimedia.org/r/595147

daniel triaged this task as Medium priority.May 8 2020, 10:07 AM

Change 595147 merged by jenkins-bot:
[mediawiki/core@master] RevisionStore: fall back to master db if main slot is missing.

https://gerrit.wikimedia.org/r/595147

Change 595596 had a related patch set uploaded (by Jforrester; owner: Daniel Kinzler):
[mediawiki/core@wmf/1.35.0-wmf.31] RevisionStore: fall back to master db if main slot is missing.

https://gerrit.wikimedia.org/r/595596

Change 595596 merged by jenkins-bot:
[mediawiki/core@wmf/1.35.0-wmf.31] RevisionStore: fall back to master db if main slot is missing.

https://gerrit.wikimedia.org/r/595596

Mentioned in SAL (#wikimedia-operations) [2020-05-11T18:11:51Z] <jforrester@deploy1001> Synchronized php-1.35.0-wmf.31/includes/Revision/RevisionStore.php: T252156 T212428 RevisionStore: fall back to master db if main slot is missing (duration: 00m 58s)

Jdforrester-WMF assigned this task to daniel.

We're still getting some of these, but many fewer; closing this task.