Page MenuHomePhabricator

DBUnexpectedError when moving a page with a StructuredDiscussion talkpage
Closed, ResolvedPublic

Description

When trying to move/rename a page at officewiki just now, I got

[34f85d66-83d6-4ac0-9b5c-226a936d1358] 2020-06-16 19:36:26: Fatal exception of type "Wikimedia\Rdbms\DBUnexpectedError"

With a bit of testing, this seems to be related to whether a StructuredDiscussion talkpage exists or not.

Event Timeline

Restricted Application added a subscriber: Aklapper. · View Herald Transcript
DannyS712 added subscribers: daniel, Pchelolo, DannyS712.

Copied from T255609: Special:MovePage: Invalid atomic section ended (got MovePage::moveUnsafe but expected Flow\BoardMover):

Probably caused by 5f21ebd4c52f5bc9e49c96e35dedb6605dcacee2 as part of T250023: Replace TitleMoveComplet(e|ing) hooks
The TitleMoveCompleting hook is part of the page move`s atomic section, the TitleMoveComplete and the new PageMoveComplete hooks are part of the deferred atomic update.
I originally proposed that two hooks be added to replace these, but @daniel suggested that only one may be needed.

Note: if the flow commit is reverted, 8f7f133ccfaaaa45b7c00835d3d64884bde0c5c3 needs to be reverted first

DannyS712 moved this task from Untriaged to Jun 2020 on the Wikimedia-production-error board.

Not sure if this should be a train blocker (@brennen didn't mark the merged task as one) but it still needs to be addressed quickly

brennen raised the priority of this task from High to Unbreak Now!.Jun 16 2020, 8:03 PM

I wasn't clear if this represented new breakage, but as it does, let's call this a blocker, at least for moving to all wikis.

I could use advice on whether this warrants a rollback.

I'm working odd hours today so it'll be a few more hours before I can look at this. If someone else (e.g. @aaron) is looking at this in the meantime, they should feel free to forge ahead without me. I'll come back to this in a few hours and jump in if nobody beats me to it.

The errors from @Quiddity's request:

[34f85d66-83d6-4ac0-9b5c-226a936d1358] /w/index.php?title=Special:MovePage&action=submit   Wikimedia\Rdbms\DBUnexpectedError from line 4151 of /srv/mediawiki/php-1.35.0-wmf.37/includes/libs/rdbms/database/Database.php: Invalid atomic section ended (got MovePage::moveUnsafe but expected Flow\BoardMover)
#0 /srv/mediawiki/php-1.35.0-wmf.37/includes/MovePage.php(703): Wikimedia\Rdbms\Database->endAtomic(string)
#1 /srv/mediawiki/php-1.35.0-wmf.37/includes/MovePage.php(412): MovePage->moveUnsafe(User, string, boolean, array)
#2 /srv/mediawiki/php-1.35.0-wmf.37/includes/specials/SpecialMovepage.php(773): MovePage->moveIfAllowed(User, string, boolean)
#3 /srv/mediawiki/php-1.35.0-wmf.37/includes/specials/SpecialMovepage.php(137): MovePageForm->doSubmit()
#4 /srv/mediawiki/php-1.35.0-wmf.37/includes/specialpage/SpecialPage.php(580): MovePageForm->execute(NULL)
#5 /srv/mediawiki/php-1.35.0-wmf.37/includes/specialpage/SpecialPageFactory.php(634): SpecialPage->run(NULL)
#6 /srv/mediawiki/php-1.35.0-wmf.37/includes/MediaWiki.php(307): MediaWiki\SpecialPage\SpecialPageFactory->executePath(Title, RequestContext)
#7 /srv/mediawiki/php-1.35.0-wmf.37/includes/MediaWiki.php(986): MediaWiki->performRequest()
#8 /srv/mediawiki/php-1.35.0-wmf.37/includes/MediaWiki.php(543): MediaWiki->main()
#9 /srv/mediawiki/php-1.35.0-wmf.37/index.php(47): MediaWiki->run()
#10 /srv/mediawiki/w/index.php(3): require(string)
#11 {main}

[34f85d66-83d6-4ac0-9b5c-226a936d1358] /w/index.php?title=Special:MovePage&action=submit   InvalidArgumentException from line 140 of /srv/mediawiki/php-1.35.0-wmf.37/includes/deferred/LinksUpdate.php: The Title object yields no ID. Perhaps the page [[User:Quiddity_(WMF)/Sandbox21]] doesn't exist?
#0 /srv/mediawiki/php-1.35.0-wmf.37/includes/Storage/DerivedPageDataUpdater.php(1337): LinksUpdate->__construct(Title, ParserOutput, boolean)
#1 /srv/mediawiki/php-1.35.0-wmf.37/includes/deferred/RefreshSecondaryDataUpdate.php(83): MediaWiki\Storage\DerivedPageDataUpdater->getSecondaryDataUpdates(boolean)
#2 /srv/mediawiki/php-1.35.0-wmf.37/includes/deferred/DeferredUpdates.php(466): RefreshSecondaryDataUpdate->doUpdate()
#3 /srv/mediawiki/php-1.35.0-wmf.37/includes/deferred/DeferredUpdates.php(343): DeferredUpdates::attemptUpdate(RefreshSecondaryDataUpdate, Wikimedia\Rdbms\LBFactoryMulti)
#4 /srv/mediawiki/php-1.35.0-wmf.37/includes/deferred/DeferredUpdates.php(278): DeferredUpdates::run(RefreshSecondaryDataUpdate, Wikimedia\Rdbms\LBFactoryMulti, Monolog\Logger, BufferingStatsdDataFactory, string)
#5 /srv/mediawiki/php-1.35.0-wmf.37/includes/deferred/DeferredUpdates.php(194): DeferredUpdates::handleUpdateQueue(array, string, integer)
#6 /srv/mediawiki/php-1.35.0-wmf.37/includes/MediaWiki.php(1072): DeferredUpdates::doUpdates(string)
#7 /srv/mediawiki/php-1.35.0-wmf.37/includes/MediaWiki.php(849): MediaWiki->restInPeace()
#8 /srv/mediawiki/php-1.35.0-wmf.37/includes/MediaWiki.php(861): MediaWiki->{closure}()
#9 /srv/mediawiki/php-1.35.0-wmf.37/includes/MediaWiki.php(582): MediaWiki->doPostOutputShutdown()
#10 /srv/mediawiki/php-1.35.0-wmf.37/index.php(47): MediaWiki->run()
#11 /srv/mediawiki/w/index.php(3): require(string)
#12 {main}

I think @DannyS712 has probably diagnosed the issue correctly: Flow was changed to use different hooks that have different timing, and this Flow code was making assumptions about when transactions would start and end that were valid before but are broken now.

Change 606062 had a related patch set uploaded (by Catrope; owner: Catrope):
[mediawiki/extensions/Flow@master] Revert "Hooks: Use PageMoveComplete instead of TitleMoveCompleting"

https://gerrit.wikimedia.org/r/606062

Change 606063 had a related patch set uploaded (by Catrope; owner: Catrope):
[mediawiki/core@master] Revert "Hard deprecate the TitleMoveCompleting hook"

https://gerrit.wikimedia.org/r/606063

Per @DannyS712's suggestion, I'm proposing to revert both the Flow change (so that we use the old hook again), and the core hard deprecation change (to avoid deprecation warnings), as a short-term solution because this a UBN.

The crux of the issue is that Flow used a TitleMoveStarting hook to begin a transaction, and a TitleMoveCompleting hook to commit it. This worked because both of those hooks are inside the transaction started and committed by MovePage::moveUnsafe(). Then the Flow code was changed to use PageMoveStarting (which is still inside the MovePage transaction) and PageMoveComplete (which is outside the transaction, and happens much later than PageMoveCompleting did). This caused the two transactions to be misnested: the Flow transaction was opened after (so, inside) the MovePage transaction, but then the MovePage transaction was committed before the Flow transaction had been committed. This causes a DB error (which is a good thing, we should be catching things like this).

For an actual solution, we should either add a PageMoveCompleting hook that's inside the transaction, in the same place where TitleMoveCompleting is (as @DannyS712 had originally proposed), or find a different way for the Flow to do what it needs to do. I think the former is the most promising, because the TitleMoveStarting hook was added specifically for this code (see T127785). It seems unlikely that we can (easily) rewrite the Flow code to not use a hook that was specifically created for it. Instead, I think we should just rename the TitleMoveCompleting hook to PageMoveCompleting. Trying to reduce hook redundancy is generally good, but here it looks like we do actually need separate hooks.

Per @DannyS712's suggestion, I'm proposing to revert both the Flow change (so that we use the old hook again), and the core hard deprecation change (to avoid deprecation warnings), as a short-term solution because this a UBN.

Will +2 the reverts against master once Jenkins confirms there are no other issues
Can you add your explanation to T250023: Replace TitleMoveComplet(e|ing) hooks regarding the need for two separate hooks?

Change 606063 merged by jenkins-bot:
[mediawiki/core@master] Revert "Hard deprecate the TitleMoveCompleting hook"

https://gerrit.wikimedia.org/r/606063

Change 606062 merged by jenkins-bot:
[mediawiki/extensions/Flow@master] Revert "Hooks: Use PageMoveComplete instead of TitleMoveCompleting"

https://gerrit.wikimedia.org/r/606062

Change 606068 had a related patch set uploaded (by DannyS712; owner: Catrope):
[mediawiki/core@wmf/1.35.0-wmf.37] Revert "Hard deprecate the TitleMoveCompleting hook"

https://gerrit.wikimedia.org/r/606068

Change 606069 had a related patch set uploaded (by DannyS712; owner: Catrope):
[mediawiki/extensions/Flow@wmf/1.35.0-wmf.37] Revert "Hooks: Use PageMoveComplete instead of TitleMoveCompleting"

https://gerrit.wikimedia.org/r/606069

Change 606071 had a related patch set uploaded (by Catrope; owner: Catrope):
[mediawiki/core@master] Add PageMoveCompleting hook, to replace TitleMoveCompleting

https://gerrit.wikimedia.org/r/606071

I've registered the two revert patches for the upcoming backport window in ~2.5 hours.

UBN patches don't wait for a window. Deploying them now.

Change 606068 merged by jenkins-bot:
[mediawiki/core@wmf/1.35.0-wmf.37] Revert "Hard deprecate the TitleMoveCompleting hook"

https://gerrit.wikimedia.org/r/606068

Change 606069 merged by jenkins-bot:
[mediawiki/extensions/Flow@wmf/1.35.0-wmf.37] Revert "Hooks: Use PageMoveComplete instead of TitleMoveCompleting"

https://gerrit.wikimedia.org/r/606069

Mentioned in SAL (#wikimedia-operations) [2020-06-17T09:11:05Z] <jforrester@deploy1001> Synchronized php-1.35.0-wmf.37/includes/HookContainer/DeprecatedHooks.php: T255608 Revert 'Hard deprecate the hook' (duration: 01m 05s)

Mentioned in SAL (#wikimedia-operations) [2020-06-17T09:16:19Z] <jforrester@deploy1001> Synchronized php-1.35.0-wmf.37/extensions/Flow/: T255608 Revert 'Hooks: Use PageMoveComplete instead of TitleMoveCompleting' (duration: 01m 05s)

Jdforrester-WMF assigned this task to Catrope.

Deployed, tested.

Change 606071 merged by jenkins-bot:
[mediawiki/core@master] Add PageMoveCompleting hook, to replace TitleMoveCompleting

https://gerrit.wikimedia.org/r/606071