Flow: could not locate workflow for revision {rev}
Closed, Resolved | Public | PRODUCTION ERROR

Description

Error

MediaWiki version: 1.36.0-wmf.21

message
could not locate workflow for revision vujehhx7ei8fvbdp

Impact

Notes

Happening on frwiki at a low frequency

Details

Request ID
X9pDAQpAADgAAH1uuacAAAAB
Request URL
https://fr.wikipedia.org/wiki/Sp%C3%A9cial:V%C3%A9rificateur_d%27utilisateur
Stack Trace
exception.trace
#0 /srv/mediawiki/php-1.36.0-wmf.21/extensions/Flow/includes/Formatter/CheckUserQuery.php(90): Flow\Formatter\AbstractQuery->buildResult(Flow\Model\Header, string, Flow\Formatter\FormatterRow)
#1 /srv/mediawiki/php-1.36.0-wmf.21/extensions/Flow/includes/Hooks.php(672): Flow\Formatter\CheckUserQuery->getResult(stdClass)
#2 /srv/mediawiki/php-1.36.0-wmf.21/extensions/Flow/includes/Hooks.php(635): Flow\Hooks::getReplacementRowItems(RequestContext, stdClass)
#3 /srv/mediawiki/php-1.36.0-wmf.21/includes/HookContainer/HookContainer.php(333): Flow\Hooks::onSpecialCheckUserGetLinksFromRow(SpecialCheckUser, stdClass, array)
#4 /srv/mediawiki/php-1.36.0-wmf.21/includes/HookContainer/HookContainer.php(140): MediaWiki\HookContainer\HookContainer->callLegacyHook(string, array, array, array)
#5 /srv/mediawiki/php-1.36.0-wmf.21/includes/Hooks.php(137): MediaWiki\HookContainer\HookContainer->run(string, array, array)
#6 /srv/mediawiki/php-1.36.0-wmf.21/extensions/CheckUser/includes/specials/SpecialCheckUser.php(2010): Hooks::run(string, array)
#7 /srv/mediawiki/php-1.36.0-wmf.21/extensions/CheckUser/includes/specials/SpecialCheckUser.php(1782): SpecialCheckUser->getLinksFromRow(stdClass)
#8 /srv/mediawiki/php-1.36.0-wmf.21/extensions/CheckUser/includes/specials/SpecialCheckUser.php(1029): SpecialCheckUser->CUChangesLine(stdClass)
#9 /srv/mediawiki/php-1.36.0-wmf.21/extensions/CheckUser/includes/specials/SpecialCheckUser.php(792): SpecialCheckUser->doIPEditsRequestOutput(Wikimedia\Rdbms\ResultWrapper, string, boolean)
#10 /srv/mediawiki/php-1.36.0-wmf.21/extensions/CheckUser/includes/specials/SpecialCheckUser.php(152): SpecialCheckUser->doIPEditsRequest(string, boolean, integer)
#11 /srv/mediawiki/php-1.36.0-wmf.21/includes/specialpage/SpecialPage.php(645): SpecialCheckUser->execute(NULL)
#12 /srv/mediawiki/php-1.36.0-wmf.21/includes/specialpage/SpecialPageFactory.php(1403): SpecialPage->run(NULL)
#13 /srv/mediawiki/php-1.36.0-wmf.21/includes/MediaWiki.php(310): MediaWiki\SpecialPage\SpecialPageFactory->executePath(Title, RequestContext)
#14 /srv/mediawiki/php-1.36.0-wmf.21/includes/MediaWiki.php(945): MediaWiki->performRequest()
#15 /srv/mediawiki/php-1.36.0-wmf.21/includes/MediaWiki.php(548): MediaWiki->main()
#16 /srv/mediawiki/php-1.36.0-wmf.21/index.php(53): MediaWiki->run()
#17 /srv/mediawiki/php-1.36.0-wmf.21/index.php(46): wfIndexMain()
#18 /srv/mediawiki/w/index.php(3): require(string)
#19 {main}

Event Timeline

thcipriani renamed this task from could not locate workflow for revision vujehhx7ei8fvbdp to Flow: could not locate workflow for revision {rev}. Dec 16 2020, 7:24 PM
thcipriani updated the task description.
brennen added a subscriber: brennen.

Saw a spike of these under 1.36.0-wmf.27.

MMiller_WMF edited projects, added Growth-Team (Current Sprint); removed Growth-Team.
MMiller_WMF added a subscriber: MMiller_WMF.

Since this continues to cause problems, @Tgr is going to work on it this week.

This is triggered by a CheckUser hook that generates related links (like "edits from the same IP as this edit"). CheckUser renders big lists with lots of edits, the hook runs for every list item, and the error is not fatal (Flow catches it, logs it and continues), so the error is generated in big batches. Unless the spike is huge, it is nothing to worry about: it is just someone clicking around on a CheckUser list where most of the edits are Flow-related.
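
The batching behaviour described above can be modelled in a few lines. This is a minimal Python sketch, not Flow or CheckUser code: `render_list`, `links_for_row`, `WorkflowNotFound` and the row dictionaries are hypothetical names chosen for illustration. It only shows how a non-fatal error in a per-row hook produces one log entry per affected row.

```python
import logging

logger = logging.getLogger("flow_sketch")


class WorkflowNotFound(Exception):
    """Hypothetical stand-in for the 'could not locate workflow' failure."""


def links_for_row(row):
    # Hypothetical per-row hook: fails for Flow rows whose workflow lookup breaks.
    if row["is_flow"] and row["workflow"] is None:
        raise WorkflowNotFound(
            f"could not locate workflow for revision {row['rev']}"
        )
    return [f"links for {row['rev']}"]


def render_list(rows):
    """Render every row; a failing hook is logged and the row keeps default output."""
    rendered, errors = [], 0
    for row in rows:
        try:
            rendered.append(links_for_row(row))
        except WorkflowNotFound as exc:
            logger.warning("%s", exc)  # non-fatal: log and continue
            errors += 1
            rendered.append([])  # fall back to the unmodified formatting
    return rendered, errors


# One page view over 50 affected Flow rows produces 50 log entries.
rows = [{"rev": f"r{i}", "is_flow": True, "workflow": None} for i in range(50)]
rendered, errors = render_list(rows)
```

Under these assumptions a single page view of a list dominated by affected Flow rows accounts for a whole batch of log entries, which matches the spikes seen in the logs.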

Picking a workflow ID from a recent error message:

tgr@mwmaint1002:~$ mwscript shell.php frwiki
>>> $storage = \Flow\Container::get( 'storage' )
>>> $treeRepository = \Flow\Container::get( 'repository.tree' )
>>> $postRevision = $storage->get( 'PostRevision', '<id>' )
>>> $rootUid = $treeRepository->findRoot( $postRevision->getRevisionId() )
>>> $storage->get( 'Workflow', $rootUid->getAlphadecimal() )
=> Flow\Model\Workflow

works just fine. (Also, neither the revision/comment nor the workflow/topic is oversighted, deleted or otherwise unusual.) Looking at the code, the only way this can happen is AbstractQuery::getWorkflow returning null for the post revision, which in turn can only happen if AbstractQuery::getWorkflowById returns null for the workflow ID, even though the ID lookup was successful. That method does an in-process cache lookup, then falls back to the same $storage->get operation as above, so either that operation fails for some reason, or the cache somehow contains incorrect null/falsy entries. I can't see any reason for the DB lookup to fail (the discussion is much older than the error, so it's not a replication issue).

I tried to test initialization with some data looked up from the cu_changes table where cuc_type is RC_FLOW (142), but I just get a bunch of errors, so I'm probably doing something wrong:

>>> $query = new \Flow\Formatter\CheckUserQuery( $storage, $treeRepository )
>>> $query->loadMetadataBatch( [ (object)[ 'cuc_id' => ..., 'cuc_type' => ..., 'cuc_comment' => ... ] ] )
PHP Notice:  mysqli::query(): send of 166 bytes failed with errno=32 Broken pipe in /srv/mediawiki/php-1.36.0-wmf.27/includes/libs/rdbms/database/DatabaseMysqli.php on line 46
PHP Warning:  Error while sending QUERY packet. PID=171770 in /srv/mediawiki/php-1.36.0-wmf.27/includes/libs/rdbms/database/DatabaseMysqli.php on line 46

Another exception that occurs in the same request is Flow\Exception\InvalidDataException: Revisions for <topic ID> could not be found, from AbstractCollection::getAllRevisions(). Simulating the revision lookup done there looks fine as well. Maybe a problem with some internal cache layer?
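
To make the cache hypothesis concrete, here is a minimal Python sketch of a lookup that checks an in-process cache before falling back to storage. It is an illustration only: `WorkflowLookup`, its fields and the dictionary standing in for the database are hypothetical, and whether Flow's cache behaves this way is not established here. It shows one way a cached null entry could mask a row that exists in storage.

```python
_MISSING = object()  # sentinel: distinguishes "never cached" from a cached None


class WorkflowLookup:
    """Hypothetical cache-then-storage lookup (not Flow's implementation)."""

    def __init__(self, storage):
        self.storage = storage  # plain dict standing in for the database
        self.cache = {}         # in-process cache
        self.storage_reads = 0

    def get(self, workflow_id):
        cached = self.cache.get(workflow_id, _MISSING)
        if cached is not _MISSING:
            # A cached None is returned without consulting storage again.
            return cached
        self.storage_reads += 1
        value = self.storage.get(workflow_id)
        self.cache[workflow_id] = value  # caches misses (None) as well as hits
        return value


storage = {}
lookup = WorkflowLookup(storage)
first = lookup.get("uid1")          # nothing stored yet, so None gets cached
storage["uid1"] = "Workflow(uid1)"  # the row is now present in storage
second = lookup.get("uid1")         # still None: the cached miss hides the row
```

In this model a direct storage read succeeds while the cached path keeps returning null, which is the kind of mismatch observed between the manual lookup and the failing request.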

The data structure looks something like this:

  • there is a Workflow with UID 1
  • the first PostRevision of that workflow has revision id UID 2, post id UID 2 (the same)
  • the root PostRevision (retrieved via getRootPost()) of the first post has revision id UID 3, post id UID 1
  • there is also a Header object with object id UID 2, collection id UID 2, workflow id UID 2

The CU record contains UID 1 as workflow id and UID 2 as revision id. When CheckUserQuery::loadMetadataBatch() loads this row, it ends up with a revisionCache value of [ UID 2 => Header ]. When it tries to build the result, it fetches the header from the cache, and tries to get the workflow. There is no workflow with UID 2. Not quite sure what the expected behavior here is - should the revision cache contain the PostRevision object with the same id (which has the correct workflow id) instead of the Header? Or is the problem with the Header having a non-existent workflow id? Or both?
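
The collision can be reproduced with a toy model. The following Python sketch is not Flow code: the class fields, `load_metadata`, `load_metadata_typed` and `workflow_for` are hypothetical, and it assumes the cache is keyed by id alone so that a later object with the same id replaces an earlier one (the source does not say how the Header ends up in the cache). Keying by type and id is shown as one possible remedy, not necessarily what change 660079 does.

```python
from dataclasses import dataclass


@dataclass
class PostRevision:
    object_id: str
    workflow_id: str


@dataclass
class Header:
    object_id: str
    workflow_id: str


WORKFLOWS = {"UID1": "Workflow UID1"}  # only one workflow exists

# Two objects of different types share the id "UID2".
post = PostRevision(object_id="UID2", workflow_id="UID1")
header = Header(object_id="UID2", workflow_id="UID2")  # non-existent workflow


def load_metadata(objects):
    """Cache keyed by id only: a later object with the same id wins."""
    cache = {}
    for obj in objects:
        cache[obj.object_id] = obj
    return cache


def load_metadata_typed(objects):
    """Alternative: key by (type name, id) so the two objects cannot collide."""
    return {(type(obj).__name__, obj.object_id): obj for obj in objects}


def workflow_for(obj):
    # Returns None when the object's workflow id is unknown.
    return WORKFLOWS.get(obj.workflow_id)


by_id = load_metadata([post, header])
by_type = load_metadata_typed([post, header])

broken = workflow_for(by_id["UID2"])                     # Header: no workflow UID2
fixed = workflow_for(by_type[("PostRevision", "UID2")])  # PostRevision: resolves
```

With the id-only cache the lookup hands back the Header and the workflow lookup fails. With the typed key the PostRevision is found and resolves to the existing workflow.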

Change 660079 had a related patch set uploaded (by Gergő Tisza; owner: Gergő Tisza):
[mediawiki/extensions/Flow@master] Fix metadata loading logic in CheckUserQuery

https://gerrit.wikimedia.org/r/660079

Flow catches and logs these exceptions, and backs out of trying to modify the formatting of the changelist item, so the user-visible effect of these errors was that Flow checkuser rows were somewhat unhelpful (they contained the raw MediaWiki UUID name instead of the human-readable topic name, and no diff/history links). This affected all log entries except those few which were actually header changes.

kostajh added a subscriber: kostajh.

Sorry for the delay in reviewing, @Tgr. I wasn't able to reproduce this locally, but the code seems sensible to me. @Etonkovidova could you please double-check this on Tuesday when the patch is in group0?

Change 660079 merged by jenkins-bot:

[mediawiki/extensions/Flow@master] Fix metadata loading logic in CheckUserQuery

https://gerrit.wikimedia.org/r/660079

Checked on wmf.20 - the error is not present.