
includes/Revision/RevisionStore.php: Main slot of revision (number) not found in database!
Open, Medium, Public

Description

Noticed in mediawiki-errors for both wmf.8 and wmf.9

[XBuw4QpAICMAALTXkHkAAABG] /w/index.php?title=Speci%C3%A1lis:Lap_%C3%A1tnevez%C3%A9se&action=submit   MediaWiki\Revision\RevisionAccessException from line 1643 of /srv/mediawiki/php-1.33.0-wmf.8/includes/Revision/RevisionStore.php: Main slot of revision 20798212 not found in database!
[XBuxngpAIDgAAJFITF8AAADO] /w/api.php?rvprop=userid%7Cuser%7Cids%7Ccontent%7Csize%7Ctimestamp%7Ccontentmodel%7Ccomment&revids=815943409&prop=revisions&format=json&rvslots=main&action=query   MediaWiki\Revision\RevisionAccessException from line 1643 of /srv/mediawiki/php-1.33.0-wmf.9/includes/Revision/RevisionStore.php: Main slot of revision 815943409 not found in database!

Impact

Through the API, querying information about some pages results in an internal_api_error response.

Event Timeline

Restricted Application added a subscriber: Aklapper. Dec 20 2018, 3:28 PM
daniel added a subscriber: daniel. Jan 15 2019, 4:03 PM

To go out on a limb here, this looks like it's caused by replication lag: some bit of code is trying to load the revision from a replica right after it has been saved, and it hasn't been replicated yet. This can be avoided by either a) falling back to loading from master if the revision isn't found on the replica, or b) making the RevisionRecord that was created when the edit was made directly available to the code that needs it. (b) is preferable, but often rather hard to do.

Option "a" could cause an outage on the master: if for any reason all the replicas are lagging (it doesn't happen often, but it could), falling back to the master could overload it and make the problem worse :-(

daniel added a comment. Edited Jan 15 2019, 4:13 PM

Option "a" could cause an outage on the master if for any reason all the replicas are lagging

True, this should not be done on a code path that may be triggered on a regular page view. In code that is only used on edits or more unusual circumstances, it should be fine, I think.

We could also do hacky hybrid stuff like "fall back to master if in a POST request, fail in a GET request"...
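The hybrid idea can be sketched outside MediaWiki. The following is a minimal Python illustration, not MediaWiki code: `Store`, `load_revision`, and `RevisionNotFound` are hypothetical names, and two in-memory dictionaries stand in for the replica and the master.

```python
from dataclasses import dataclass, field


class RevisionNotFound(Exception):
    """Raised when a revision is missing from every store we may consult."""


@dataclass
class Store:
    """Toy stand-in for a database server (hypothetical, in-memory only)."""
    rows: dict = field(default_factory=dict)
    reads: int = 0

    def get(self, rev_id):
        self.reads += 1
        return self.rows.get(rev_id)


def load_revision(rev_id, replica, master, http_method):
    """Read from the replica; consult the master only for POST requests."""
    row = replica.get(rev_id)
    if row is not None:
        return row
    if http_method.upper() != "POST":
        # GET traffic never reaches the master, so lagging replicas
        # cannot turn ordinary page views into master load.
        raise RevisionNotFound(f"revision {rev_id} not found on replica")
    row = master.get(rev_id)
    if row is None:
        raise RevisionNotFound(f"revision {rev_id} not found on master")
    return row


# A revision that was just saved on the master but is not replicated yet.
master = Store(rows={42: "new text"})
replica = Store()

print(load_revision(42, replica, master, "POST"))  # prints "new text"
try:
    load_revision(42, replica, master, "GET")
except RevisionNotFound as err:
    print("GET failed:", err)
```

The read counter on `Store` makes the trade-off visible: only the POST path ever adds load to the master, which is the property the outage concern above is about.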

Two examples from exception.log:

2019-01-15 08:46:59 [XD2eAwpAIDMAAJXcBsUAAAAK] mw1339 dewiki 1.33.0-wmf.12 exception ERROR: [XD2eAwpAIDMAAJXcBsUAAAAK] /w/api.php?format=json&rvslots=main&revids=184746892&rvprop=contentmodel%7Cids%7Cuserid%7Ccontent%7Ctimestamp%7Csize%7Ccomment%7Cuser&prop=revisions&action=query   MediaWiki\Revision\RevisionAccessException from line 1643 of /srv/mediawiki/php-1.33.0-wmf.12/includes/Revision/RevisionStore.php: Main slot of revision 184746892 not found in database! {"exception_id":"XD2eAwpAIDMAAJXcBsUAAAAK","exception_url":"/w/api.php?format=json&rvslots=main&revids=184746892&rvprop=contentmodel%7Cids%7Cuserid%7Ccontent%7Ctimestamp%7Csize%7Ccomment%7Cuser&prop=revisions&action=query","caught_by":"mwe_handler"}
[Exception MediaWiki\Revision\RevisionAccessException] (/srv/mediawiki/php-1.33.0-wmf.12/includes/Revision/RevisionStore.php:1643) Main slot of revision 184746892 not found in database!
  #0 /srv/mediawiki/php-1.33.0-wmf.12/includes/Revision/RevisionStore.php(1677): MediaWiki\Revision\RevisionStore->loadSlotRecords(string, integer)
  #1 [internal function]: Closure$MediaWiki\Revision\RevisionStore::newRevisionSlots()
  #2 /srv/mediawiki/php-1.33.0-wmf.12/includes/Revision/RevisionSlots.php(165): call_user_func(Closure$MediaWiki\Revision\RevisionStore::newRevisionSlots;559)
  #3 /srv/mediawiki/php-1.33.0-wmf.12/includes/Revision/RevisionSlots.php(136): MediaWiki\Revision\RevisionSlots->getSlots()
  #4 /srv/mediawiki/php-1.33.0-wmf.12/includes/Revision/RevisionRecord.php(219): MediaWiki\Revision\RevisionSlots->getSlotRoles()
  #5 /srv/mediawiki/php-1.33.0-wmf.12/includes/api/ApiQueryRevisionsBase.php(332): MediaWiki\Revision\RevisionRecord->getSlotRoles()
  #6 /srv/mediawiki/php-1.33.0-wmf.12/includes/api/ApiQueryRevisions.php(420): ApiQueryRevisionsBase->extractRevisionInfo(MediaWiki\Revision\RevisionStoreRecord, stdClass)
  #7 /srv/mediawiki/php-1.33.0-wmf.12/includes/api/ApiQueryRevisionsBase.php(58): ApiQueryRevisions->run()
  #8 /srv/mediawiki/php-1.33.0-wmf.12/includes/api/ApiQuery.php(249): ApiQueryRevisionsBase->execute()
  #9 /srv/mediawiki/php-1.33.0-wmf.12/includes/api/ApiMain.php(1596): ApiQuery->execute()
  #10 /srv/mediawiki/php-1.33.0-wmf.12/includes/api/ApiMain.php(531): ApiMain->executeAction()
  #11 /srv/mediawiki/php-1.33.0-wmf.12/includes/api/ApiMain.php(502): ApiMain->executeActionWithErrorHandling()
  #12 /srv/mediawiki/php-1.33.0-wmf.12/api.php(87): ApiMain->execute()
  #13 /srv/mediawiki/w/api.php(3): include(string)
  #14 {main}
2019-01-15 09:49:30 [XD2sqApAAEQAAIR@SkUAAAAQ] mw1273 huwiki 1.33.0-wmf.12 exception ERROR: [XD2sqApAAEQAAIR@SkUAAAAQ] /w/index.php?title=Speci%C3%A1lis:Lap_%C3%A1tnevez%C3%A9se&action=submit   MediaWiki\Revision\RevisionAccessException from line 1643 of /srv/mediawiki/php-1.33.0-wmf.12/includes/Revision/RevisionStore.php: Main slot of revision 20883806 not found in database! {"exception_id":"XD2sqApAAEQAAIR@SkUAAAAQ","exception_url":"/w/index.php?title=Speci%C3%A1lis:Lap_%C3%A1tnevez%C3%A9se&action=submit","caught_by":"mwe_handler"}
[Exception MediaWiki\Revision\RevisionAccessException] (/srv/mediawiki/php-1.33.0-wmf.12/includes/Revision/RevisionStore.php:1643) Main slot of revision 20883806 not found in database!
  #0 /srv/mediawiki/php-1.33.0-wmf.12/includes/Revision/RevisionStore.php(1677): MediaWiki\Revision\RevisionStore->loadSlotRecords(string, integer)
  #1 [internal function]: Closure$MediaWiki\Revision\RevisionStore::newRevisionSlots()
  #2 /srv/mediawiki/php-1.33.0-wmf.12/includes/Revision/RevisionSlots.php(165): call_user_func(Closure$MediaWiki\Revision\RevisionStore::newRevisionSlots;556)
  #3 /srv/mediawiki/php-1.33.0-wmf.12/includes/Revision/RevisionSlots.php(136): MediaWiki\Revision\RevisionSlots->getSlots()
  #4 /srv/mediawiki/php-1.33.0-wmf.12/includes/Revision/RevisionRecord.php(219): MediaWiki\Revision\RevisionSlots->getSlotRoles()
  #5 /srv/mediawiki/php-1.33.0-wmf.12/includes/Revision/MutableRevisionRecord.php(57): MediaWiki\Revision\RevisionRecord->getSlotRoles()
  #6 /srv/mediawiki/php-1.33.0-wmf.12/includes/Storage/DerivedPageDataUpdater.php(773): MediaWiki\Revision\MutableRevisionRecord::newFromParentRevision(MediaWiki\Revision\RevisionStoreRecord)
  #7 /srv/mediawiki/php-1.33.0-wmf.12/includes/page/WikiPage.php(1998): MediaWiki\Storage\DerivedPageDataUpdater->prepareContent(User, MediaWiki\Storage\RevisionSlotsUpdate, boolean)
  #8 /srv/mediawiki/php-1.33.0-wmf.12/extensions/FlaggedRevs/backend/FlaggedRevs.class.php(1038): WikiPage->prepareContentForEdit(WikitextContent, NULL, User, string)
  #9 /srv/mediawiki/php-1.33.0-wmf.12/extensions/FlaggedRevs/backend/FlaggedRevs.hooks.php(92): FlaggedRevs::autoReviewEdit(FlaggableWikiPage, User, Revision)
  #10 /srv/mediawiki/php-1.33.0-wmf.12/includes/Hooks.php(174): FlaggedRevsHooks::onTitleMoveComplete(Title, Title, User, integer, integer, string, Revision)
  #11 /srv/mediawiki/php-1.33.0-wmf.12/includes/Hooks.php(202): Hooks::callHook(string, array, array, NULL)
  #12 /srv/mediawiki/php-1.33.0-wmf.12/includes/MovePage.php(417): Hooks::run(string, array)
  #13 /srv/mediawiki/php-1.33.0-wmf.12/includes/libs/rdbms/database/Database.php(3806): Closure$MovePage::move(Wikimedia\Rdbms\DatabaseMysqli, string)
  #14 /srv/mediawiki/php-1.33.0-wmf.12/includes/deferred/AtomicSectionUpdate.php(35): Wikimedia\Rdbms\Database->doAtomicSection(string, Closure$MovePage::move;2066)
  #15 /srv/mediawiki/php-1.33.0-wmf.12/includes/deferred/DeferredUpdates.php(270): AtomicSectionUpdate->doUpdate()
  #16 /srv/mediawiki/php-1.33.0-wmf.12/includes/deferred/DeferredUpdates.php(216): DeferredUpdates::runUpdate(AtomicSectionUpdate, Wikimedia\Rdbms\LBFactoryMulti, string, integer)
  #17 /srv/mediawiki/php-1.33.0-wmf.12/includes/deferred/DeferredUpdates.php(140): DeferredUpdates::execute(array, string, integer)
  #18 /srv/mediawiki/php-1.33.0-wmf.12/includes/MediaWiki.php(904): DeferredUpdates::doUpdates(string)
  #19 /srv/mediawiki/php-1.33.0-wmf.12/includes/MediaWiki.php(728): MediaWiki->restInPeace(string, boolean)
  #20 [internal function]: Closure$MediaWiki::doPostOutputShutdown()
  #21 {main}
daniel added a subscriber: Anomie. Jan 15 2019, 9:15 PM

The first instance looks like a race condition involving a 3rd party client (bot). E.g. a bot makes an edit, and immediately tries to load the revision. Or a bot lists revisions, and then tries to load their content by ID, in a different request, hitting a different replica that is more lagged.

This can't really be avoided, but definitely should not cause a 500. ApiBase or ApiMain should handle this exception gracefully. The question is, how? Previously, ApiQueryRevisions would have called Revision::getContent(), which would have returned null. Not sure how that was handled further, but we could try to restore that behavior, and add a warning to the API result.

@Anomie may have ideas, I seem to recall he fixed a similar issue a couple of months ago. Or is it the same issue, and the fix was never merged?...

The second instance involves the TitleMoveComplete hook. Moving a page creates a dummy revision, which probably has not yet been replicated to all replicas when the hook handler is called. The hook handler, FlaggedRevsHooks::onTitleMoveComplete, may have to force the master database to be used, via the appropriate query flags. This should be safe, since the hook is only fired in requests that write to master anyway.

First instance from above (the api one) reproduced locally by being evil to my database.

It sounds like the solution in at least these two cases is changes to code that calls the Revision subsystem rather than changes in RevisionStore or related classes. Which suggests that there may be a number of other instances that will need to be identified and addressed individually.

The first instance looks like a race condition involving a 3rd party client (bot). E.g. a bot makes an edit, and immediately tries to load the revision. Or a bot lists revisions, and then tries to load their content by ID, in a different request, hitting a different replica that is more lagged.

Yes, either of those sounds likely.

Although, the inserts into revision, slots, and content are done in a transaction, so I'd think they should all become visible at the same time? And ApiQueryRevisions doesn't use any locking that would bypass the isolation. There must be something non-obvious to me about how exactly the transaction isolation works.

This can't really be avoided, but definitely should not cause a 500. ApiBase or ApiMain should handle this exception gracefully.

What makes you think it's causing a 500? The exception was likely caught at ApiMain line 537, and passed to handleException, which logs it on line 571 and goes on to return an internal_api_error code.

The question is, how? Previously, ApiQueryRevisions would have called Revision::getContent(), which would have returned null. Not sure how that was handled further, but we could try to restore that behavior, and add a warning to the API result.

A failure in calling ->getContent() did and still does result in it setting a 'missing' flag in the response. But the failure here isn't happening on a getContent() call, it's happening on a call to getSlotRoles() (which isn't documented as throwing exceptions).
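One way to picture the graceful handling discussed here is to treat a failure while listing slot roles the same way as a failed content load: flag the revision as missing and attach a warning. The Python sketch below only illustrates that idea. It is not the ApiQueryRevisionsBase code, and `LaggedRevision`, `LoadedRevision`, and `extract_revision_info` are hypothetical stand-ins.

```python
class RevisionAccessException(Exception):
    """Stand-in for the exception named in the logs above."""


class LaggedRevision:
    """Hypothetical record whose slot rows have not reached the replica."""

    def __init__(self, rev_id):
        self.rev_id = rev_id

    def get_slot_roles(self):
        raise RevisionAccessException(
            f"Main slot of revision {self.rev_id} not found in database!"
        )


class LoadedRevision:
    """Hypothetical record whose slots are available."""

    def __init__(self, rev_id, slots):
        self.rev_id = rev_id
        self.slots = slots

    def get_slot_roles(self):
        return list(self.slots)


def extract_revision_info(rev):
    """Build a result entry; degrade to a 'missing' flag plus a warning."""
    info = {"revid": rev.rev_id}
    warnings = []
    try:
        roles = rev.get_slot_roles()
    except RevisionAccessException as err:
        # Same outcome as a failed content load: flag it, keep the request alive.
        info["missing"] = True
        warnings.append(str(err))
        return info, warnings
    info["slots"] = {role: {"content": rev.slots[role]} for role in roles}
    return info, warnings


print(extract_revision_info(LaggedRevision(5)))
print(extract_revision_info(LoadedRevision(6, {"main": "text"})))
```

Whether the real fix belongs in the API module or in the revision classes is exactly what the thread leaves open; the sketch only shows what the degraded response could look like.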

My local results for the api instance match what @Anomie described: a "200 OK" with a body including an internal_api_error code, as follows:

{
    "error": {
        "code": "internal_api_error_MediaWiki\\Revision\\RevisionAccessException",
        "info": "[feac4d9d86e5f334a3b8e596] Exception caught: Main slot of revision 5 not found in database!",
        "errorclass": "MediaWiki\\Revision\\RevisionAccessException",
        "*": "MediaWiki\\Revision\\RevisionAccessException at /vagrant/mediawiki/includes/Revision/RevisionStore.php(1641)\n#0 /vagrant/mediawiki/includes/Revision/RevisionStore.php(1677): MediaWiki\\Revision\\RevisionStore->loadSlotRecords(string, integer)\n#1 [internal function]: MediaWiki\\Revision\\RevisionStore->MediaWiki\\Revision\\{closure}()\n#2 /vagrant/mediawiki/includes/Revision/RevisionSlots.php(165): call_user_func(Closure)\n#3 /vagrant/mediawiki/includes/Revision/RevisionSlots.php(136): MediaWiki\\Revision\\RevisionSlots->getSlots()\n#4 /vagrant/mediawiki/includes/Revision/RevisionRecord.php(219): MediaWiki\\Revision\\RevisionSlots->getSlotRoles()\n#5 /vagrant/mediawiki/includes/api/ApiQueryRevisionsBase.php(332): MediaWiki\\Revision\\RevisionRecord->getSlotRoles()\n#6 /vagrant/mediawiki/includes/api/ApiQueryRevisions.php(420): ApiQueryRevisionsBase->extractRevisionInfo(MediaWiki\\Revision\\RevisionStoreRecord, stdClass)\n#7 /vagrant/mediawiki/includes/api/ApiQueryRevisionsBase.php(58): ApiQueryRevisions->run()\n#8 /vagrant/mediawiki/includes/api/ApiQuery.php(249): ApiQueryRevisionsBase->execute()\n#9 /vagrant/mediawiki/includes/api/ApiMain.php(1596): ApiQuery->execute()\n#10 /vagrant/mediawiki/includes/api/ApiMain.php(531): ApiMain->executeAction()\n#11 /vagrant/mediawiki/includes/api/ApiMain.php(502): ApiMain->executeActionWithErrorHandling()\n#12 /vagrant/mediawiki/api.php(87): ApiMain->execute()\n#13 {main}"
    },
    "servedby": "stretch"
}
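Because the failure arrives as a "200 OK", a client has to inspect the body to notice it. The sketch below shows one way a bot might do that in Python. The retry policy is an assumption (the thread only suggests a short-lived race), and `error_class`, `fetch_with_retry`, and the `fetch` callable are hypothetical names rather than part of any client library.

```python
import json
import time

INTERNAL_ERROR_PREFIX = "internal_api_error_"


def error_class(body_text):
    """Return the exception class named in an internal API error, else None."""
    error = json.loads(body_text).get("error")
    if not error:
        return None
    code = error.get("code", "")
    if not code.startswith(INTERNAL_ERROR_PREFIX):
        return None
    return code[len(INTERNAL_ERROR_PREFIX):]


def fetch_with_retry(fetch, attempts=3, delay=0.0):
    """Call fetch() until the body carries no internal error (assumed policy)."""
    last = None
    for _ in range(attempts):
        body = fetch()
        last = error_class(body)
        if last is None:
            return body
        time.sleep(delay)
    raise RuntimeError(f"still failing after {attempts} attempts: {last}")


# Simulated responses: one internal error, then a normal result.
failed = json.dumps({
    "error": {
        "code": "internal_api_error_MediaWiki\\Revision\\RevisionAccessException",
        "info": "Exception caught: Main slot of revision 5 not found in database!",
    }
})
ok = json.dumps({"query": {"pages": {}}})
responses = iter([failed, ok])

print(error_class(failed))
print(fetch_with_retry(lambda: next(responses)))
```

A real bot would pass a non-zero `delay` so that the replica has time to catch up between attempts.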

The error and stack trace below are for a different failure (Failed to load blob) that generates the same exception. I've reviewed the logs and thus far have found only these three types of failures (the one below, plus the two above) resulting in a RevisionAccessException. All three failure types occur multiple times in the logs.

2019-01-15 09:53:13 [XD2tiApAAD0AAEcJ8wsAAAAR] mw1266 zhwiki 1.33.0-wmf.12 exception ERROR: [XD2tiApAAD0AAEcJ8wsAAAAR] /zh-cn/Delon_Thamrin   MediaWiki\Revision\RevisionAccessException from line 1465 of /srv/mediawiki/php-1.33.0-wmf.12/includes/Revision/RevisionStore.php: Failed to load data blob from tt:9375723: Failed to load blob from address tt:9375723 {"exception_id":"XD2tiApAAD0AAEcJ8wsAAAAR","exception_url":"/zh-cn/Delon_Thamrin","caught_by":"mwe_handler"}
[Exception MediaWiki\Revision\RevisionAccessException] (/srv/mediawiki/php-1.33.0-wmf.12/includes/Revision/RevisionStore.php:1465) Failed to load data blob from tt:9375723: Failed to load blob from address tt:9375723
  #0 /srv/mediawiki/php-1.33.0-wmf.12/includes/Revision/RevisionStore.php(1634): MediaWiki\Revision\RevisionStore->loadSlotContent(MediaWiki\Revision\SlotRecord, NULL, NULL, NULL, integer)
  #1 [internal function]: Closure$MediaWiki\Revision\RevisionStore::loadSlotRecords(MediaWiki\Revision\SlotRecord)
  #2 /srv/mediawiki/php-1.33.0-wmf.12/includes/Revision/SlotRecord.php(307): call_user_func(Closure$MediaWiki\Revision\RevisionStore::loadSlotRecords;555, MediaWiki\Revision\SlotRecord)
  #3 /srv/mediawiki/php-1.33.0-wmf.12/includes/Revision/RevisionRecord.php(175): MediaWiki\Revision\SlotRecord->getContent()
  #4 /srv/mediawiki/php-1.33.0-wmf.12/includes/Revision/RenderedRevision.php(226): MediaWiki\Revision\RevisionRecord->getContent(string, integer, NULL)
  #5 /srv/mediawiki/php-1.33.0-wmf.12/includes/Revision/RevisionRenderer.php(193): MediaWiki\Revision\RenderedRevision->getSlotParserOutput(string)
  #6 /srv/mediawiki/php-1.33.0-wmf.12/includes/Revision/RevisionRenderer.php(142): MediaWiki\Revision\RevisionRenderer->combineSlotOutput(MediaWiki\Revision\RenderedRevision, array)
  #7 [internal function]: Closure$MediaWiki\Revision\RevisionRenderer::getRenderedRevision#2(MediaWiki\Revision\RenderedRevision, array)
  #8 /srv/mediawiki/php-1.33.0-wmf.12/includes/Revision/RenderedRevision.php(197): call_user_func(Closure$MediaWiki\Revision\RevisionRenderer::getRenderedRevision#2;1052, MediaWiki\Revision\RenderedRevision, array)
  #9 /srv/mediawiki/php-1.33.0-wmf.12/includes/poolcounter/PoolWorkArticleView.php(194): MediaWiki\Revision\RenderedRevision->getRevisionParserOutput()
  #10 /srv/mediawiki/php-1.33.0-wmf.12/includes/poolcounter/PoolCounterWork.php(123): PoolWorkArticleView->doWork()
  #11 /srv/mediawiki/php-1.33.0-wmf.12/includes/page/Article.php(774): PoolCounterWork->execute()
  #12 /srv/mediawiki/php-1.33.0-wmf.12/includes/actions/ViewAction.php(68): Article->view()
  #13 /srv/mediawiki/php-1.33.0-wmf.12/includes/MediaWiki.php(501): ViewAction->show()
  #14 /srv/mediawiki/php-1.33.0-wmf.12/includes/MediaWiki.php(294): MediaWiki->performAction(Article, Title)
  #15 /srv/mediawiki/php-1.33.0-wmf.12/includes/MediaWiki.php(862): MediaWiki->performRequest()
  #16 /srv/mediawiki/php-1.33.0-wmf.12/includes/MediaWiki.php(517): MediaWiki->main()
  #17 /srv/mediawiki/php-1.33.0-wmf.12/index.php(42): MediaWiki->run()
  #18 /srv/mediawiki/w/index.php(3): include(string)
  #19 {main}
Krinkle updated the task description. Mar 19 2019, 6:38 PM
Restricted Application added a subscriber: Cosine02. Mar 19 2019, 6:38 PM

What can we do in the short term to restore the ability for users to read and contribute to these pages?

2019-01-15 09:53:13 [XD2tiApAAD0AAEcJ8wsAAAAR] mw1266 zhwiki 1.33.0-wmf.12 exception ERROR: [XD2tiApAAD0AAEcJ8wsAAAAR] /zh-cn/Delon_Thamrin   MediaWiki\Revision\RevisionAccessException from line 1465 of /srv/mediawiki/php-1.33.0-wmf.12/includes/Revision/RevisionStore.php: Failed to load data blob from tt:9375723: Failed to load blob from address tt:9375723 {"exception_id":"XD2tiApAAD0AAEcJ8wsAAAAR","exception_url":"/zh-cn/Delon_Thamrin","caught_by":"mwe_handler"}

That's T205936: Unable to view some pages due to fatal RevisionAccessException: "Failed to load data blob from tt", not this task.

  • Readers are unable to view the https://zh.wikipedia.org/zh-cn/Delon_Thamrin page on Chinese Wikipedia (user is shown an unhelpful system error; triggers HTTP 500 Internal Server Error). This also affects certain pages on German Wikipedia (dewiki) and Hungarian Wikipedia (huwiki)

When I try accessing the zhwiki page, I get an exception that is not related to this task. It's T205936: Unable to view some pages due to fatal RevisionAccessException: "Failed to load data blob from tt". Without examples of the other pages, I can't speak to them.

  • Contributors are unable to rename these pages (e.g. via Special:MovePage).

Can't test since I apparently don't have the ability to move pages on zhwiki. But if I screw up my local wiki's database in a similar manner, the error is again due to T205936 rather than this task.

As noted above, the move-related error in this task seems likely to be a race condition or an issue with FlaggedRevs using a replica (lagged, or before transaction commit).

  • Through the API, querying information about these pages results in an internal_api_error response.

Unable to reproduce. https://zh.wikipedia.org/w/api.php?action=query&titles=Delon%20Thamrin&prop=revisions&rvslots=main&rvprop=content|user|comment|timestamp|ids gives me a successful (if not very useful) response consistent with T205936.

As noted above, the API error in this task seems likely to be a race condition.

What can we do in the short term to restore the ability for users to read and contribute to these pages?

As noted in T205936, the data for the one page you mentioned seems to have never been saved in the first place due to an outage back in 2009. There are a few options, but that would be better discussed on that task rather than this one.

Krinkle closed this task as Resolved. Mar 19 2019, 7:08 PM
Krinkle updated the task description.

Thanks. Looks like the API error has been resolved since then. Continuing at T205936.

Krinkle reopened this task as Open. Mar 19 2019, 7:15 PM

Spoke too soon. While the original example no longer fails, the issue still exists. Logstash recorded 1,256 instances in the past 7 days. Here's a sample:

  • Request ID: XJE8gApAEDIAAI2HKwEAAAGU
  • Request URL: HTTP GET hu.wikipedia.org/w/api.php?format=json&prop=revisions&revids=21096755&rvslots=main&rvprop=timestamp%7Cuser%7Cuserid%7Ccontentmodel%7Csize%7Ccontent%7Cids%7Ccomment&action=query
MediaWiki\Revision\RevisionAccessException from line 1643 of /srv/mediawiki/php-1.33.0-wmf.21/includes/Revision/RevisionStore.php: Main slot of revision 21096755 not found in database!

#0 /srv/mediawiki/php-1.33.0-wmf.21/includes/Revision/RevisionStore.php(1677): MediaWiki\Revision\RevisionStore->loadSlotRecords(string, integer)
#1 [internal function]: Closure$MediaWiki\Revision\RevisionStore::newRevisionSlots()
#2 /srv/mediawiki/php-1.33.0-wmf.21/includes/Revision/RevisionSlots.php(165): call_user_func(Closure$MediaWiki\Revision\RevisionStore::newRevisionSlots;563)
#3 /srv/mediawiki/php-1.33.0-wmf.21/includes/Revision/RevisionSlots.php(136): MediaWiki\Revision\RevisionSlots->getSlots()
#4 /srv/mediawiki/php-1.33.0-wmf.21/includes/Revision/RevisionRecord.php(219): MediaWiki\Revision\RevisionSlots->getSlotRoles()
#5 /srv/mediawiki/php-1.33.0-wmf.21/includes/api/ApiQueryRevisionsBase.php(332): MediaWiki\Revision\RevisionRecord->getSlotRoles()
#6 /srv/mediawiki/php-1.33.0-wmf.21/includes/api/ApiQueryRevisions.php(420): ApiQueryRevisionsBase->extractRevisionInfo(MediaWiki\Revision\RevisionStoreRecord, stdClass)
#7 /srv/mediawiki/php-1.33.0-wmf.21/includes/api/ApiQueryRevisionsBase.php(58): ApiQueryRevisions->run()
#8 /srv/mediawiki/php-1.33.0-wmf.21/includes/api/ApiQuery.php(249): ApiQueryRevisionsBase->execute()
#9 /srv/mediawiki/php-1.33.0-wmf.21/includes/api/ApiMain.php(1595): ApiQuery->execute()
#10 /srv/mediawiki/php-1.33.0-wmf.21/includes/api/ApiMain.php(531): ApiMain->executeAction()
#11 /srv/mediawiki/php-1.33.0-wmf.21/includes/api/ApiMain.php(502): ApiMain->executeActionWithErrorHandling()
#12 /srv/mediawiki/php-1.33.0-wmf.21/api.php(87): ApiMain->execute()

I can't reproduce it as-is, so it might be a race condition. There are plenty of other instances of it in the logs.

daniel added a comment. Edited Mar 19 2019, 9:26 PM

I'm seeing an instance triggered by MovePage (/w/index.php?title=Spezial:Verschieben&action=submit).

#0 /srv/mediawiki/php-1.33.0-wmf.21/includes/Revision/RevisionStore.php(1677): MediaWiki\Revision\RevisionStore->loadSlotRecords(string, integer)
#1 [internal function]: Closure$MediaWiki\Revision\RevisionStore::newRevisionSlots()
#2 /srv/mediawiki/php-1.33.0-wmf.21/includes/Revision/RevisionSlots.php(165): call_user_func(Closure$MediaWiki\Revision\RevisionStore::newRevisionSlots;560)
#3 /srv/mediawiki/php-1.33.0-wmf.21/includes/Revision/RevisionSlots.php(136): MediaWiki\Revision\RevisionSlots->getSlots()
#4 /srv/mediawiki/php-1.33.0-wmf.21/includes/Revision/RevisionRecord.php(219): MediaWiki\Revision\RevisionSlots->getSlotRoles()
#5 /srv/mediawiki/php-1.33.0-wmf.21/includes/Revision/MutableRevisionRecord.php(57): MediaWiki\Revision\RevisionRecord->getSlotRoles()
#6 /srv/mediawiki/php-1.33.0-wmf.21/includes/Storage/DerivedPageDataUpdater.php(773): MediaWiki\Revision\MutableRevisionRecord::newFromParentRevision(MediaWiki\Revision\RevisionStoreCacheRecord)
#7 /srv/mediawiki/php-1.33.0-wmf.21/includes/page/WikiPage.php(1999): MediaWiki\Storage\DerivedPageDataUpdater->prepareContent(User, MediaWiki\Storage\RevisionSlotsUpdate, boolean)
#8 /srv/mediawiki/php-1.33.0-wmf.21/extensions/FlaggedRevs/backend/FlaggedRevs.class.php(1038): WikiPage->prepareContentForEdit(WikitextContent, NULL, User, string)
#9 /srv/mediawiki/php-1.33.0-wmf.21/extensions/FlaggedRevs/backend/FlaggedRevs.hooks.php(92): FlaggedRevs::autoReviewEdit(FlaggableWikiPage, User, Revision)
#10 /srv/mediawiki/php-1.33.0-wmf.21/includes/Hooks.php(174): FlaggedRevsHooks::onTitleMoveComplete(Title, Title, User, integer, integer, string, Revision)
#11 /srv/mediawiki/php-1.33.0-wmf.21/includes/Hooks.php(202): Hooks::callHook(string, array, array, NULL)
#12 /srv/mediawiki/php-1.33.0-wmf.21/includes/MovePage.php(417): Hooks::run(string, array)
#13 /srv/mediawiki/php-1.33.0-wmf.21/includes/libs/rdbms/database/Database.php(3815): Closure$MovePage::move(Wikimedia\Rdbms\DatabaseMysqli, string)
#14 /srv/mediawiki/php-1.33.0-wmf.21/includes/deferred/AtomicSectionUpdate.php(35): Wikimedia\Rdbms\Database->doAtomicSection(string, Closure$MovePage::move;2065)
#15 /srv/mediawiki/php-1.33.0-wmf.21/includes/deferred/DeferredUpdates.php(273): AtomicSectionUpdate->doUpdate()
#16 /srv/mediawiki/php-1.33.0-wmf.21/includes/deferred/DeferredUpdates.php(219): DeferredUpdates::runUpdate(AtomicSectionUpdate, Wikimedia\Rdbms\LBFactoryMulti, string, integer)
#17 /srv/mediawiki/php-1.33.0-wmf.21/includes/deferred/DeferredUpdates.php(143): DeferredUpdates::execute(array, string, integer)
#18 /srv/mediawiki/php-1.33.0-wmf.21/includes/MediaWiki.php(909): DeferredUpdates::doUpdates(string)
#19 /srv/mediawiki/php-1.33.0-wmf.21/includes/MediaWiki.php(733): MediaWiki->restInPeace(string, boolean)
#20 [internal function]: Closure$MediaWiki::doPostOutputShutdown()
#21 {main}

The revisions for which this happens seem to all (?) be very young when the error occurs:

  • AWmW7m8fNBo9dX1kpBHV: rev 888511323 on enwiki: error timestamp is 2019-03-19T17:09:46, revision timestamp is 2019-03-19T17:09:45.
  • AWmW23yC8aQffZ3Hys6W: rev 114690696 on eswiki: error timestamp is 2019-03-19T16:49:20, revision timestamp is 2019-03-19T16:49:18.
  • AWmWiXHR8aQffZ3HrlTk: rev 888496802 on enwiki: error timestamp is 2019-03-19T15:19:43, revision timestamp is 2019-03-19T15:19:40.

This indicates a transaction problem.
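The age of each revision at the moment of the error can be computed directly from the timestamps listed above. A small Python check, using only those three samples (`age_at_error` is a helper name introduced for this illustration):

```python
from datetime import datetime

# (wiki, revision ID, error timestamp, revision timestamp) from the list above.
SAMPLES = [
    ("enwiki", 888511323, "2019-03-19T17:09:46", "2019-03-19T17:09:45"),
    ("eswiki", 114690696, "2019-03-19T16:49:20", "2019-03-19T16:49:18"),
    ("enwiki", 888496802, "2019-03-19T15:19:43", "2019-03-19T15:19:40"),
]


def age_at_error(error_ts, revision_ts):
    """Seconds between the revision timestamp and the logged error."""
    delta = datetime.fromisoformat(error_ts) - datetime.fromisoformat(revision_ts)
    return delta.total_seconds()


for wiki, rev_id, error_ts, revision_ts in SAMPLES:
    age = age_at_error(error_ts, revision_ts)
    print(f"{wiki} rev {rev_id}: {age:.0f} s old when the error was logged")
```

All three revisions were between one and three seconds old, which fits the reading that the slot rows were not yet visible to the reader.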

Change 497650 had a related patch set uploaded (by Daniel Kinzler; owner: Daniel Kinzler):
[mediawiki/core@master] Stop gap to shut up log spam due to T212428.

https://gerrit.wikimedia.org/r/497650

Change 497650 merged by jenkins-bot:
[mediawiki/core@master] Stop gap to shut up log spam due to T212428.

https://gerrit.wikimedia.org/r/497650

CCicalese_WMF triaged this task as Medium priority. Apr 1 2019, 3:06 PM
CCicalese_WMF added a subscriber: CCicalese_WMF.

We will research the status of this task and make a decision by 4/15/19.

Sorry, I see we never updated the task after researching the status. @BPirkle will look into this again and update this task.

Also, I had incorrectly tagged this as a contractor task, so I have retagged it.

WDoranWMF added a subscriber: WDoranWMF.

The investigation should be completed within 4 weeks of 2019/07/17.

WDoranWMF moved this task from MCR to mop on the Core Platform Team board. Jul 26 2019, 6:41 PM
mmodell changed the subtype of this task from "Task" to "Production Error". Aug 28 2019, 11:08 PM

Marking as resolved for prod-error purposes as the bug has been acknowledged in the code and no longer produces errors that suggest stability issues in production (e.g. cause alerts or abort deployments). From what I understand, this is still considered an unresolved bug that should be resolved from a product/user perspective.

Krinkle removed a subscriber: Krinkle. Oct 8 2019, 5:14 PM