
Upgrade from 1.31.1 to 1.32.0 breaks existing StructuredDiscussions pages: error message: **flow-error-unknown-workflow-id**
Closed, Resolved (Public)

Description

After upgrading from MediaWiki 1.31.1 to 1.32.0, existing discussion pages that use StructuredDiscussions can no longer be retrieved (error message: flow-error-unknown-workflow-id). On pages with no existing discussion, new topics can be created without errors.

In case it helps: requesting the pages with safemode=1 makes no difference.

Event Timeline

Restricted Application added a subscriber: Aklapper.
Aklapper renamed this task from Upgrade from 1.31.1 to 1.32.0 breaks existing StructuredDiscussions pages to Upgrade from 1.31.1 to 1.32.0 breaks existing StructuredDiscussions pages: error message: **flow-error-unknown-workflow-id**. Feb 28 2019, 3:00 PM

When upgrading, did you do anything unusual to the database? Does the database contain a flow_workflow table, and does that table contain entries for old SD pages or just for new ones?

(This is going to be hard to investigate without access to your wiki and to your database, but I'll do my best.)

I will try to give more details.

The two wikis (version 1.31.1 and 1.32.0) use two distinct databases. I copied the 1.31 database into the 1.32 database. I checked the flow_* tables, echo_* tables and page table after the upgrade, and they are identical.

When requesting a specific discussion page, in the Queries section of the debugging toolbar, in version 1.31.1, the first reference to a flow_* table is:

13	SELECT old_text,old_flags FROM `wiki_text` WHERE old_id = '639' LIMIT 1 	0.1950ms	MediaWiki\Storage\SqlBlobStore::fetchBlob
14	SELECT * FROM `wiki_flow_workflow` WHERE ((workflow_id = '�■■2w%■■l■Q')) 	17.9760ms	Flow\Data\Storage\BasicDbStorage::doFindQuery (flow_workflow)
15	SELECT * FROM `wiki_flow_revision` `rev` WHERE rev_type_id = '�■■2w%■■l■Q' AND rev_type = 'header' ORDER BY rev_id DESC LIMIT 501 	17.3268ms	Flow\Data\Storage\RevisionStorage::findInternal
16	SELECT cl_to FROM `wiki_categorylinks` WHERE cl_from = '173' 	0.2310ms	Title::getParentCategories

In version 1.32:

13	SELECT old_text,old_flags FROM `wiki_text` WHERE old_id = '639' LIMIT 1 	0.3111ms	MediaWiki\Storage\SqlBlobStore::fetchBlob
14	SELECT * FROM `wiki_flow_workflow` WHERE ((workflow_id = '�■■2w%■■l■Q')) 	8.7900ms	Flow\Data\Storage\BasicDbStorage::doFindQuery (flow_workflow)
15	ROLLBACK	0.1559ms	MWExceptionHandler::rollbackMasterChangesAndLog
16	SELECT keyname,value,exptime FROM `wiki_objectcache` WHERE keyname = 'Wiki_32_0-wiki_:messages:it' 	0.3350ms	SqlBagOStuff::getMulti

I assume the query on the text table is used to retrieve the workflow_id. The old_text values for old_id = '639' are identical in the two databases. In the workflow table, for this page:

SELECT *
FROM `wiki_flow_workflow`
where workflow_title_text = '$page_title'

there are 6 rows, one with workflow_type = 'discussion' and five with workflow_type = 'topic'. I assume the workflow_id above refers to the 'discussion' row. All six rows in the two databases, however, match.

So, as far as I can understand, the problem does not seem to be in the database.

Could you get a few more pieces of information for me?

  1. In both databases, run SELECT workflow_type, HEX(workflow_id) FROM wiki_flow_workflow WHERE workflow_title_text = '$page_title' and check if the results match (you should get 6 rows each).
  2. Use api.php to find the workflow ID referred to by the page text. The following example does this for Project:Support_desk on mediawiki.org, adjust for your wiki and your page name: https://www.mediawiki.org/w/api.php?action=query&prop=revisions&rvprop=content&format=jsonfm&titles=Project:Support_desk (in this example, the workflow ID is sm33nl9cicz8qyh1)
  3. Convert the workflow ID from step 2 to hex format by running php maintenance/eval.php on the command line and entering echo Flow\Model\UUID::create('workflow ID here')->getHex(); you should get a result that looks something like 053b84dc09f67c31f9e505
  4. Check if the result from step 3 appears anywhere in the results from step 1 (apart from uppercase/lowercase letter differences). Also run the query SELECT * FROM flow_workflow WHERE workflow_id = UNHEX('result of step 3 here') on both databases and see if you get a result (and whether it's the same result)

  1. In both databases, run SELECT workflow_type, HEX(workflow_id) FROM wiki_flow_workflow WHERE workflow_title_text = '$page_title' and check if the results match (you should get 6 rows each).
workflow_type   HEX(workflow_id)
discussion      05A1E9327725C7F36CDD51
topic           05A1E932773DC7F36CDD51
topic           05A1E94729B5C7F36CDD51
topic           05A1E9A1B669C7F36CDD51
topic           05A1E9D2B5D9C7F36CDD51
topic           05A1E9DF219DC7F36CDD51

in both databases

  2. Use api.php to find the workflow ID referred to by the page text. The following example does this for Project:Support_desk on mediawiki.org, adjust for your wiki and your page name: https://www.mediawiki.org/w/api.php?action=query&prop=revisions&rvprop=content&format=jsonfm&titles=Project:Support_desk (in this example, the workflow ID is sm33nl9cicz8qyh1)

Same result in both wikis (except for the color of the JSON output), but no workflow ID:

{
    "batchcomplete": "",
    "query": {
        "normalized": [
            {
                "from": "CR-5852_-_Rivistazione_cruscotto_operatore_\\\"gestione_SIM\\\"/Test/INT/OLCEnquiry",
                "to": "CR-5852 - Rivistazione cruscotto operatore \\\"gestione SIM\\\"/Test/INT/OLCEnquiry"
            }
        ],
        "pages": {
            "-1": {
                "ns": 0,
                "title": "CR-5852 - Rivistazione cruscotto operatore \\\"gestione SIM\\\"/Test/INT/OLCEnquiry",
                "missing": ""
            }
        }
    }
}

If I understand correctly, the workflow ID is recorded into text.old_text:

SELECT * FROM wiki_text WHERE old_id = 639;

	{"flow-workflow":"ustpj6z6wrlwc2tt"}

(see query 13 in my previous comment). The value is the same in both databases.

  3. Convert the workflow ID from step 2 to hex format by running php maintenance/eval.php on the command line and entering echo Flow\Model\UUID::create('workflow ID here')->getHex(); you should get a result that looks something like 053b84dc09f67c31f9e505
> echo Flow\Model\UUID::create('ustpj6z6wrlwc2tt')->getHex();
05a1e9327725c7f36cdd51

in both wikis
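
For anyone who wants to cross-check the conversion without a MediaWiki shell, here is a minimal Python sketch. It assumes that the ID stored in old_text is a base-36 rendering of the 11-byte workflow_id and that the hex form is the same number in base 16, left-padded to 22 digits. That assumption reproduces the value shown above, but it is not taken from the Flow source; the function names are made up for illustration.

```python
import json

# Assumption: workflow_id is an 11-byte binary column, so its hex form has 22 digits.
HEX_LEN = 22


def alnum_to_hex(alnum: str) -> str:
    """Convert a base-36 workflow ID to zero-padded lowercase hex."""
    return format(int(alnum, 36), f"0{HEX_LEN}x")


def workflow_hex_from_old_text(old_text: str) -> str:
    """Read the "flow-workflow" key from the page text and convert it to hex."""
    return alnum_to_hex(json.loads(old_text)["flow-workflow"])


# old_text value reported in this thread for old_id = 639
old_text = '{"flow-workflow":"ustpj6z6wrlwc2tt"}'
print(workflow_hex_from_old_text(old_text))  # prints 05a1e9327725c7f36cdd51
```

Upper-casing the result gives the form returned by HEX(workflow_id) in step 1, which makes the comparison in step 4 a plain string match.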

  4. Check if the result from step 3 appears anywhere in the results from step 1 (apart from uppercase/lowercase letter differences). Also run the query SELECT * FROM flow_workflow WHERE workflow_id = UNHEX('result of step 3 here') on both databases and see if you get a result (and whether it's the same result)

It is the workflow_id of the "discussion" row from step 1. The same row is selected by the "unhex" query in both databases

Problem found. First, I apologize for wasting your time. I should have analyzed the problem more thoroughly before opening this task.

The problem was caused by using a new database for the new wiki and leaving the old database as a backup. The workflow ID was retrieved correctly, but an exception was raised in Flow/includes/WorkflowLoaderFactory.php, line 126:

		if ( $workflow->getWiki() !== wfWikiID() ) {
			throw new UnknownWorkflowIdException( 'The requested workflow does not exist on this wiki.' );
		}

because the left-hand side is the value stored with the workflow (derived from the old database name) and the right-hand side is the value configured for the current wiki (derived from the new database name). I suppose the rationale for this check is to allow several wikis to share the same database. However, I had assumed it was safe to use a different database as long as the configuration was modified accordingly.
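
To make the mismatch concrete, here is a rough Python sketch of the comparison. It assumes the wiki ID is the database name, followed by a hyphen and the table prefix when a prefix is configured, which is consistent with the cache key Wiki_32_0-wiki_ in query 16 of the 1.32 log. The old database name Wiki_31_1 and all function and class names are hypothetical.

```python
class UnknownWorkflowIdError(Exception):
    """Stand-in for Flow's UnknownWorkflowIdException."""


def wiki_id(db_name: str, db_prefix: str = "") -> str:
    """Assumed derivation of the wiki ID: DB name, plus '-prefix' if a prefix is set."""
    return f"{db_name}-{db_prefix}" if db_prefix else db_name


def assert_workflow_on_this_wiki(stored_wiki: str, current_wiki: str) -> None:
    """Mirror the check quoted above: reject workflows stored under another wiki ID."""
    if stored_wiki != current_wiki:
        raise UnknownWorkflowIdError(
            "The requested workflow does not exist on this wiki."
        )


stored = wiki_id("Wiki_31_1", "wiki_")   # hypothetical old database name
current = wiki_id("Wiki_32_0", "wiki_")  # new name, as seen in the cache key

try:
    assert_workflow_on_this_wiki(stored, current)
except UnknownWorkflowIdError as err:
    print(f"rejected: {err}")
```

The row data is identical in both databases, which is why the table comparisons found nothing: only the configured side of the comparison changed.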

Just one final question: I didn't find any reference to the exception argument ('The requested workflow does not exist on this wiki.' ) in the $wgDebugLogFile file. Does it appear somewhere else?

Thanks,

Carlo

Oh, aha, I see. Yeah, the table is intended to be shared between multiple wikis. In your case you can probably just do something like UPDATE flow_workflow SET workflow_wiki='newwikiname', workflow_user_wiki='newwikiname'; to fix the data in the table (adjust the table name for your table prefix). Glad you found the issue, this should be easy to solve.
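
A sketch of that kind of fix, using an in-memory SQLite database as a stand-in for the real MySQL/MariaDB tables so it can be tried safely. The schema is heavily simplified and only covers workflow_wiki; any other column holding the old wiki ID would need the same treatment. The wiki IDs are the illustrative ones from this thread (the old one is a guess), and the function name is made up.

```python
import sqlite3


def retarget_workflows(conn: sqlite3.Connection, old_wiki: str, new_wiki: str) -> int:
    """Rewrite the stored wiki ID on matching rows; return how many rows changed."""
    cur = conn.execute(
        "UPDATE flow_workflow SET workflow_wiki = ? WHERE workflow_wiki = ?",
        (new_wiki, old_wiki),
    )
    conn.commit()
    return cur.rowcount


# Simplified stand-in schema (assumption): the real table has more columns.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE flow_workflow ("
    "workflow_id TEXT, workflow_type TEXT, workflow_wiki TEXT)"
)
conn.executemany(
    "INSERT INTO flow_workflow VALUES (?, ?, ?)",
    [
        ("05A1E9327725C7F36CDD51", "discussion", "Wiki_31_1-wiki_"),
        ("05A1E932773DC7F36CDD51", "topic", "Wiki_31_1-wiki_"),
    ],
)

changed = retarget_workflows(conn, "Wiki_31_1-wiki_", "Wiki_32_0-wiki_")
print(changed)  # number of rows updated
```

Restricting the update with a WHERE clause on the old wiki ID leaves rows belonging to any other wiki in a shared table untouched. Take a backup before running anything like this against a real database.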

I'll look into where the exception argument goes. This would have been a lot easier to debug if the argument had been reported somewhere. The fact that it wasn't included in your report probably indicates that we're throwing it away, which we shouldn't be doing.

Change 495019 had a related patch set uploaded (by Catrope; owner: Catrope):
[mediawiki/extensions/Flow@master] Don't override message and stack trace for UnknownWorkflowIdException

https://gerrit.wikimedia.org/r/495019

Change 495019 merged by jenkins-bot:
[mediawiki/extensions/Flow@master] Don't override message and stack trace for UnknownWorkflowIdException

https://gerrit.wikimedia.org/r/495019

Catrope claimed this task.