[Flow] The Forum des nouveaux is broken
Closed, Resolved · Public · Production Error

Description

The Forum des nouveaux (Newcomers help desk), the largest Flow board at French Wikipedia, was moved to a subpage. This broke the entire Flow page, leaving newcomers with no answers to questions they asked.
https://fr.wikipedia.org/wiki/Wikip%C3%A9dia:Forum_des_nouveaux/Archives_Flow

Error message:

The Structured Discussions workflow is not associated with this page.
[d9ceb316-1237-49b9-8fe3-91c477eb9976] 2024-08-01 08:58:23: Fatal exception of type "Flow\Exception\InvalidDataException"

Event Timeline

Restricted Application added a subscriber: Aklapper.

From a quick look at the Flow docs, they say:

To enable it on a single page, use Special:EnableStructuredDiscussions.

So it seems like https://fr.wikipedia.org/wiki/Sp%C3%A9cial:EnableStructuredDiscussions might be worth a try?

Though I haven't tried it yet.

Trying it errors out with "There is already a Structured Discussions board at Wikipédia:Forum des nouveaux/Archives Flow." (which makes sense).

Tried the fix script:

[urbanecm@mwmaint1002 ~]$ mwscript extensions/Flow/maintenance/FlowFixInconsistentBoards.php --wiki=frwiki
INCONSISTENT: Core title for 's1q99xy15k6y1544' is 'Wikipédia:Forum des nouveaux/Archives Flow', but Flow title is 'Wikipédia:Forum des nouveaux'
FIXED: Updated 's1q99xy15k6y1544' to match core title, 'Wikipédia:Forum des nouveaux/Archives Flow'

No luck so far, it seems.

But... I also can't find the log entry anywhere in Logstash. Why?

Okay, I can't really debug this without a trace, so I obtained one by temporarily setting $wgShowExceptionDetails to true on a mwdebug server. From there, I got the following:

[de8cd9cf-28a6-9c1b-ae3b-257e5ac1a565] /wiki/Wikip%C3%A9dia:Forum_des_nouveaux/Archives_Flow Flow\Exception\InvalidDataException: Flow workflow is for different page

Backtrace:

from /srv/mediawiki/php-1.43.0-wmf.15/extensions/Flow/includes/WorkflowLoaderFactory.php(132)
#0 /srv/mediawiki/php-1.43.0-wmf.15/extensions/Flow/includes/WorkflowLoaderFactory.php(105): Flow\WorkflowLoaderFactory->loadWorkflowById(MediaWiki\Title\Title, Flow\Model\UUID)
#1 /srv/mediawiki/php-1.43.0-wmf.15/extensions/Flow/includes/Actions/FlowAction.php(96): Flow\WorkflowLoaderFactory->createWorkflowLoader(MediaWiki\Title\Title)
#2 /srv/mediawiki/php-1.43.0-wmf.15/extensions/Flow/includes/Actions/ViewAction.php(26): Flow\Actions\FlowAction->showForAction(string, MediaWiki\Output\OutputPage)
#3 /srv/mediawiki/php-1.43.0-wmf.15/extensions/Flow/includes/Actions/FlowAction.php(50): Flow\Actions\ViewAction->showForAction(string)
#4 /srv/mediawiki/php-1.43.0-wmf.15/includes/actions/ActionEntryPoint.php(731): Flow\Actions\FlowAction->show()
#5 /srv/mediawiki/php-1.43.0-wmf.15/includes/actions/ActionEntryPoint.php(508): MediaWiki\Actions\ActionEntryPoint->performAction(Article, MediaWiki\Title\Title)
#6 /srv/mediawiki/php-1.43.0-wmf.15/includes/actions/ActionEntryPoint.php(145): MediaWiki\Actions\ActionEntryPoint->performRequest()
#7 /srv/mediawiki/php-1.43.0-wmf.15/includes/MediaWikiEntryPoint.php(200): MediaWiki\Actions\ActionEntryPoint->execute()
#8 /srv/mediawiki/php-1.43.0-wmf.15/index.php(58): MediaWiki\MediaWikiEntryPoint->run()
#9 /srv/mediawiki/w/index.php(3): require(string)
#10 {main}

In the trace, WorkflowLoaderFactory complains about a mismatch between the title recorded in the workflow and the accessed title. Indeed:

> $c = require "$IP/extensions/Flow/container.php";
> $f = $c['factory.loader.workflow']
> $f->createWorkflowLoader(Title::newFromText('Wikipédia:Forum_des_nouveaux/Archives_Flow'))
   Flow\Exception\InvalidDataException  Flow workflow is for different page.
> $wikipage = \MediaWiki\MediaWikiServices::getInstance()->getWikiPageFactory()->newFromTitle(Title::newFromText('Wikipédia:Forum_des_nouveaux/Archives_Flow'))
= WikiPage {#8996}

> $content = $wikipage->getContent()
= Flow\Content\BoardContent {#8851}

> $content->getWorkflowId()
= Flow\Model\UUID {#6233}

> $workflowId = $content->getWorkflowId()
= Flow\Model\UUID {#6233}

> $storage = $c['storage']
= Flow\Data\ManagerGroup {#6195}

> $workflow = $storage->getStorage('Workflow')->get($workflowId)
= Flow\Model\Workflow {#6323}

> $workflow->getArticleTitle()->getPrefixedText()
= "Wikipédia:Forum des nouveaux"

>

From there, I checked the database:

wikiadmin2023@10.64.48.161(flowdb)> select * from flow_workflow where workflow_page_id=8284195 limit 1\G
*************************** 1. row ***************************
                   workflow_id: !
                                 /0$
                 workflow_wiki: frwiki
            workflow_namespace: 4
              workflow_page_id: 8284195
           workflow_title_text: Forum_des_nouveaux/Archives_Flow
                 workflow_name:
workflow_last_update_timestamp: 20231007075831
              workflow_user_id: NULL
           workflow_lock_state: 0
        workflow_definition_id: NULL
              workflow_user_ip: NULL
            workflow_user_wiki: NULL
                 workflow_type: discussion
1 row in set (0.027 sec)

wikiadmin2023@10.64.48.161(flowdb)>

The title is set correctly (presumably, the script fixed that part). This makes me blame the cache. Let's try purging it:

> $workflowStorage = $storage->getStorage('Workflow')
= Flow\Data\ObjectManager {#6291}

> $workflowStorage->cachePurge($workflow)
= null

>

And... https://fr.wikipedia.org/wiki/Wikip%C3%A9dia:Forum_des_nouveaux/Archives_Flow is working now, good as new.
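
The behaviour above fits a stale cache entry: the database row already carried the new title, but the cached Workflow object still pointed at the old one until it was purged. Below is a minimal Python model of that sequence. It is a sketch, not Flow's code: `CachedWorkflowStore`, `load_workflow` and the plain title comparison are hypothetical stand-ins, and it assumes the fix script updated the row without invalidating the cache, which this task does not confirm.

```python
from dataclasses import dataclass


class InvalidDataException(Exception):
    """Stand-in for Flow\\Exception\\InvalidDataException."""


@dataclass
class Workflow:
    workflow_id: str
    title: str


class CachedWorkflowStore:
    """Hypothetical read-through cache in front of a database table."""

    def __init__(self):
        self.db = {}     # models the flow_workflow table
        self.cache = {}  # models the object cache

    def get(self, workflow_id):
        # Only hit the database on a cache miss.
        if workflow_id not in self.cache:
            row = self.db[workflow_id]
            self.cache[workflow_id] = Workflow(row.workflow_id, row.title)
        return self.cache[workflow_id]

    def cache_purge(self, workflow_id):
        # Drop the cached copy so the next get() re-reads the database.
        self.cache.pop(workflow_id, None)


def load_workflow(store, workflow_id, requested_title):
    # Assumed guard: refuse to load a workflow recorded for another page.
    workflow = store.get(workflow_id)
    if workflow.title != requested_title:
        raise InvalidDataException("Flow workflow is for different page")
    return workflow


WORKFLOW_ID = "s1q99xy15k6y1544"
OLD_TITLE = "Wikipédia:Forum des nouveaux"
NEW_TITLE = "Wikipédia:Forum des nouveaux/Archives Flow"

store = CachedWorkflowStore()
store.db[WORKFLOW_ID] = Workflow(WORKFLOW_ID, OLD_TITLE)
store.get(WORKFLOW_ID)  # a page view before the move warms the cache

# Assumption: the row is corrected directly, bypassing the cache.
store.db[WORKFLOW_ID] = Workflow(WORKFLOW_ID, NEW_TITLE)

try:
    load_workflow(store, WORKFLOW_ID, NEW_TITLE)
    still_broken = False
except InvalidDataException:
    still_broken = True  # cached copy still carries the old title

store.cache_purge(WORKFLOW_ID)
fixed = load_workflow(store, WORKFLOW_ID, NEW_TITLE)
```

In this model the page stays broken after the row is corrected and only recovers once the single affected entry is purged, which matches the order of events observed with `cachePurge($workflow)`.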

There are a couple of questions left:

  1. Why did the move break the page? @Michael mentioned that doing so locally works with no issues.
  2. Why was there no sign of the error in Logstash? What caused that part to be hidden?
  3. Can we expect something similar to happen once communities start to archive pages as part of the Flow deprecation? Can/should we do something to avoid that? If so, what?

Those questions should probably be discussed and resolved in separate tasks. @Michael already filed T371586: Flow internal error on frwiki not in logstash for the second problem. The first issue is probably tracked across multiple tasks, such as T112954: Exception when moving Flow board. Not sure how pressing no. 3 is; @Trizek-WMF would likely know more. Maybe we asked everyone to archive already, and the volume of issues didn't go up much.

Aklapper changed the subtype of this task from "Task" to "Production Error".

Fatal exception => Wikimedia-production-error

Urbanecm_WMF triaged this task as Medium priority.
Urbanecm_WMF moved this task from Incoming to QA on the Growth-Team (FY2024-25 Q1 Sprint 2) board.

Shouldn't we rename this task to highlight the investigation phase, as the initial problem is now resolved?

The investigation part should probably have its own task. Fixing a single occurrence of this problem is relatively easy (run the script, and purge the cache if needed).


Some follow-up tasks were filed here:

The last task can be used for investigation; this task is probably best kept specific to the frwiki page. Feel free to file similar tasks if a different page breaks; it should now be fairly easy to fix. The root cause is going to be more difficult to find, especially since, according to @Michael's tests, it doesn't necessarily happen every time :/.