
MediaWiki::restInPeace: transaction round 'ActivityUpdateJob::run' still running
Closed, Resolved · Public · PRODUCTION ERROR

Description

Error

MediaWiki version: 1.36.0-wmf.18

message
MediaWiki::restInPeace: transaction round 'ActivityUpdateJob::run' still running
exception.trace
#0 /srv/mediawiki/php-1.36.0-wmf.18/includes/MediaWiki.php(1106): Wikimedia\Rdbms\LBFactory->commitMasterChanges(string)
#1 /srv/mediawiki/rpc/RunSingleJob.php(93): MediaWiki->restInPeace()
#2 {main}

Impact

Unsure.

Notes

There are two matches for this in the last week. Both from jobrunner servers, for enwiki.

I've checked the reqId of each separately and found no other errors from the same request, which means it probably isn't a cascading error where something else fatalled before it.

If we cannot find anything wrong with the ActivityUpdateJob job itself, then this report, together with T228911, may be reason to look for a logic problem in LBFactory state-tracking.

Details

Request ID
fc9f1ee2-b5c8-4872-a030-db3f56796764
Request URL
https://jobrunner.discovery.wmnet/rpc/RunSingleJob.php

Event Timeline

Restricted Application added a subscriber: Aklapper.

@nnikkhoui As part of knowledge transfer around rdbms and deferred updates, this might be useful:

I think the issue might be different from that.

  • If a hook handler tried to start a new transaction, it would fail right there and report that error from the line of code that started the second (bad) one.
  • If a hook handler started but did not end an atomic section, it would fail at the end of the outer "AutoCommitUpdate" because the fname would not match, and thus be reported from there.
  • If a hook handler tried to close a transaction without opening one, it would fail right there for the fname not matching.

What we are seeing in this report is from MediaWiki::restInPeace, after the AutoCommitUpdate and hooks have finished running. The report says that this transaction round remained open, which means it was either somehow not closed by AutoCommitUpdate, or somehow re-opened between then and the end of the process.
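
To make the distinction concrete, here is a toy model of the bookkeeping described above. It is only a sketch: `ToyRoundTracker`, `TransactionStateError`, and `describe_failure` are hypothetical names, the fname `Hook::handler` used when trying it out is invented, and the class is not the real LBFactory or rdbms API. It assumes that a round remembers the fname that began it and that atomic sections must be closed by the fname that opened them.

```python
class TransactionStateError(Exception):
    """Raised when the toy tracker sees an inconsistent call sequence."""


class ToyRoundTracker:
    """Hypothetical stand-in for transaction bookkeeping; not the real LBFactory."""

    def __init__(self):
        self.round_owner = None  # fname that began the current round, if any
        self.atomic_stack = []   # fnames of atomic sections that are still open

    def begin_round(self, fname):
        # First bullet: a second begin fails at the call site that attempts it.
        if self.round_owner is not None:
            raise TransactionStateError(
                f"{fname}: transaction round '{self.round_owner}' already started"
            )
        self.round_owner = fname

    def start_atomic(self, fname):
        self.atomic_stack.append(fname)

    def end_atomic(self, fname):
        # Second and third bullets: the innermost open section must carry the same fname.
        if not self.atomic_stack or self.atomic_stack[-1] != fname:
            raise TransactionStateError(f"{fname}: no matching open atomic section")
        self.atomic_stack.pop()

    def commit_round(self, fname):
        # Only the caller that began the round may commit it.
        if self.round_owner is not None and self.round_owner != fname:
            raise TransactionStateError(
                f"{fname}: transaction round '{self.round_owner}' still running"
            )
        self.round_owner = None


def describe_failure(action):
    """Run a zero-argument callable and return the error text, or None if it passes."""
    try:
        action()
    except TransactionStateError as err:
        return str(err)
    return None


tracker = ToyRoundTracker()
tracker.begin_round("ActivityUpdateJob::run")
# The job returns without committing its round; the shutdown step then tries to commit.
print(describe_failure(lambda: tracker.commit_round("MediaWiki::restInPeace")))
```

In this model the final call prints the same message as the report. The hook-handler mistakes from the list fail earlier, inside `begin_round` or `end_atomic`, so they would be attributed to a different caller than MediaWiki::restInPeace.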

Happy to talk more about this if/when you like :)

Change 644625 had a related patch set uploaded (by Nikki Nikkhoui; owner: Nikki Nikkhoui):
[mediawiki/extensions/EventBus@master] Rollback open, read-only transactions

https://gerrit.wikimedia.org/r/644625

Change 644625 merged by jenkins-bot:
[mediawiki/extensions/EventBus@master] Rollback open, read-only transactions

https://gerrit.wikimedia.org/r/644625

Krinkle reassigned this task from Nikki to nnikkhoui.
Krinkle moved this task from Untriaged to Nov 2020 on the Wikimedia-production-error board.
Krinkle added a project: Platform Engineering.
Krinkle added a subscriber: Nikki.

Closing, as there have been no new errors since the train rolled out this change on December 8th.