
Fix user talk pages already in inconsistent state due to T138310
Closed, Resolved, Public

Description

New cases should no longer be occurring, but we need to fix existing cases. Please add them here with an empty check box.

You can reproduce and test the issue that the script is intended to fix with:

  1. Create a Flow board
  2. Do an UPDATE adding '/Flow_Archive_1' to workflow_title_text in the flow_workflow table. E.g.: UPDATE flow_workflow SET workflow_title_text = 'Test_2016-12-01/Flow_Archive_1' WHERE workflow_title_text = 'Test_2016-12-01';
  3. Bump $wgFlowCacheVersion.
  4. Run the script (for all namespaces, or including that namespace). E.g.: mwscript extensions/Flow/maintenance/FlowFixInconsistentBoards.php --wiki=wiki

You should see one or more INCONSISTENT lines, and then an indication that they are fixed. When you query, it should be back to the original title text: SELECT * FROM flow_workflow WHERE workflow_title_text = 'Test_2016-12-01';
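The reproduction steps above can be tried out without a wiki using a rough simulation. The sketch below is not the real FlowFixInconsistentBoards.php and does not use the real Flow schema: it assumes, for illustration only, that "inconsistent" means the workflow row's title disagrees with the title of the page it points to. The in-memory SQLite tables and the function names (make_demo_db, find_inconsistent, fix_inconsistent) are hypothetical.

```python
import sqlite3


def make_demo_db():
    """Build a tiny stand-in for the core page table and flow_workflow."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE page (page_id INTEGER PRIMARY KEY, page_title TEXT NOT NULL);
        CREATE TABLE flow_workflow (
            workflow_id TEXT PRIMARY KEY,
            workflow_page_id INTEGER NOT NULL,
            workflow_title_text TEXT NOT NULL
        );
        INSERT INTO page VALUES (1, 'Test_2016-12-01');
        INSERT INTO flow_workflow VALUES ('demo-workflow', 1, 'Test_2016-12-01');
        """
    )
    return conn


def find_inconsistent(conn):
    """Return (workflow_id, stale_title, page_title) for mismatched rows."""
    return conn.execute(
        """
        SELECT w.workflow_id, w.workflow_title_text, p.page_title
        FROM flow_workflow AS w
        JOIN page AS p ON p.page_id = w.workflow_page_id
        WHERE w.workflow_title_text <> p.page_title
        """
    ).fetchall()


def fix_inconsistent(conn):
    """Reset each mismatched workflow title to the page title; return the count."""
    rows = find_inconsistent(conn)
    for workflow_id, _stale_title, page_title in rows:
        conn.execute(
            "UPDATE flow_workflow SET workflow_title_text = ? WHERE workflow_id = ?",
            (page_title, workflow_id),
        )
    conn.commit()
    return len(rows)


if __name__ == "__main__":
    conn = make_demo_db()
    # Step 2 of the reproduction: append the archive suffix by hand.
    conn.execute(
        "UPDATE flow_workflow SET workflow_title_text = 'Test_2016-12-01/Flow_Archive_1' "
        "WHERE workflow_title_text = 'Test_2016-12-01'"
    )
    for row in find_inconsistent(conn):
        print("INCONSISTENT", row)
    print("fixed:", fix_inconsistent(conn))
```

Detection and repair are kept as separate functions so that a dry run (detection only) and a real run share the same query.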


Template for manual fixes until we have the script.

To cross-check affected workflows:

SELECT workflow_wiki, workflow_namespace, workflow_title_text, workflow_page_id, HEX(workflow_id) FROM flow_workflow WHERE workflow_wiki = '<>' AND workflow_namespace = 3 AND workflow_title_text = '<>/Flow_Archive_1';

To update:

UPDATE flow_workflow SET workflow_title_text = '' WHERE HEX(workflow_id) IN ('') AND workflow_wiki = '' AND workflow_namespace = 3 AND workflow_title_text = '';
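If the template is filled in repeatedly, it may be safer to generate both statements with bound parameters rather than pasting values into the SQL by hand. The sketch below is one possible helper, not something from this task: build_manual_fix and its arguments are hypothetical names, it assumes the blank in SET is the correct title and the blank in WHERE is that title plus '/Flow_Archive_1', and it uses the %s placeholder style that common Python MySQL drivers accept.

```python
ARCHIVE_SUFFIX = "/Flow_Archive_1"
USER_TALK_NS = 3  # User talk namespace, as in the template above


def build_manual_fix(wiki, correct_title, workflow_ids_hex):
    """Return ((check_sql, check_params), (update_sql, update_params)).

    Assumption: the stale title is correct_title + ARCHIVE_SUFFIX.
    workflow_ids_hex holds the HEX(workflow_id) values from the cross-check.
    """
    if not workflow_ids_hex:
        raise ValueError("at least one workflow ID is required")

    stale_title = correct_title + ARCHIVE_SUFFIX

    check_sql = (
        "SELECT workflow_wiki, workflow_namespace, workflow_title_text, "
        "workflow_page_id, HEX(workflow_id) FROM flow_workflow "
        "WHERE workflow_wiki = %s AND workflow_namespace = %s "
        "AND workflow_title_text = %s"
    )
    check_params = (wiki, USER_TALK_NS, stale_title)

    # One placeholder per workflow ID for the IN (...) list.
    id_placeholders = ", ".join(["%s"] * len(workflow_ids_hex))
    update_sql = (
        "UPDATE flow_workflow SET workflow_title_text = %s "
        "WHERE HEX(workflow_id) IN (" + id_placeholders + ") "
        "AND workflow_wiki = %s AND workflow_namespace = %s "
        "AND workflow_title_text = %s"
    )
    update_params = (correct_title, *workflow_ids_hex, wiki, USER_TALK_NS, stale_title)

    return (check_sql, check_params), (update_sql, update_params)
```

Keeping the stale title in the UPDATE's WHERE clause, as the template does, means the statement does nothing if the row has already been fixed.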

Event Timeline

bd808 updated the task description. Oct 13 2016, 7:15 PM

I added @aude's wikidata user page which is much more important to fix than mine.

@SBisson asked at T138310:

I am creating a sub-task for fixing the existing pages.

Good. Is there a systematic way to find those?

I think I can write a maintenance script to first dump them and then fix them. I was wondering whether it was worth it, but it probably is.

Mentioned in SAL (#wikimedia-operations) [2016-10-13T20:26:25Z] <matt_flaschen> Ran manual DB updates for T148057.

@aude @bd808 Your pages are fixed. Sorry for the inconvenience.

Framawiki updated the task description. Oct 14 2016, 9:03 AM
aude awarded a token. Oct 14 2016, 12:37 PM

@Doror @Robor15 Your pages are fixed. Sorry for the inconvenience. The script is coming soon.

What's new about that task?

Change 322223 had a related patch set uploaded (by Mattflaschen):
Add script to fix inconsistent state for board name

https://gerrit.wikimedia.org/r/322223

Change 322223 merged by jenkins-bot:
Add script to fix inconsistent state for board name

https://gerrit.wikimedia.org/r/322223

Mattflaschen-WMF updated the task description.

Change 325730 had a related patch set uploaded (by Mattflaschen):
FlowFixInconsistentBoards: Don't output non-critical error info

https://gerrit.wikimedia.org/r/325730

Change 325730 merged by jenkins-bot:
FlowFixInconsistentBoards: Don't output non-critical error info

https://gerrit.wikimedia.org/r/325730

Change 327402 had a related patch set uploaded (by Mattflaschen):
FlowFixInconsistentBoards: Don't output non-critical error info

https://gerrit.wikimedia.org/r/327402

Beta is done. I've scheduled a SWAT for tomorrow morning to get https://gerrit.wikimedia.org/r/#/c/327402/ on all wikis.

Then, I'll do the dry run immediately after that is deployed (or at the end of the window if there are actually other deployments in the SWAT).

Change 327417 had a related patch set uploaded (by Mattflaschen):
FlowFixInconsistentBoards: Run in update.php, fix updatelog

https://gerrit.wikimedia.org/r/327417

I made a mistake earlier when making the script runnable from update.php: it wasn't logging that it had run.

This:

  1. Caused it to re-run on Beta every time.
  2. Meant running it manually before won't prevent it from running in update.php after that's re-enabled.
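The two consequences above follow from how a run-once update depends on its log entry. The sketch below is a simplified model of that idea, not MediaWiki's actual code: UpdateLog, run_logged, and the key string are hypothetical names used only to show why a missing log entry leads to repeated runs.

```python
class UpdateLog:
    """In-memory stand-in for a table that records completed updates."""

    def __init__(self):
        self._keys = set()

    def has(self, key):
        return key in self._keys

    def record(self, key):
        self._keys.add(key)


def run_logged(update_log, key, do_update, force=False, record=True):
    """Run do_update unless key is already logged; return 'ran' or 'skipped'.

    record=False models the bug described above: the update runs,
    but nothing is written to the log afterwards.
    """
    if update_log.has(key) and not force:
        return "skipped"
    succeeded = do_update()
    if succeeded and record:
        update_log.record(key)
    return "ran"


if __name__ == "__main__":
    key = "example-fix-inconsistent-boards"  # hypothetical update key

    buggy_log = UpdateLog()
    # Without the log entry, every invocation runs again.
    print([run_logged(buggy_log, key, lambda: True, record=False) for _ in range(3)])

    fixed_log = UpdateLog()
    # With the log entry, only the first invocation runs.
    print([run_logged(fixed_log, key, lambda: True) for _ in range(3)])
```

In this model a manual run only prevents a later automatic run if it writes the same key to the log.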

But I timed everything. It's now fast enough (less than 2 minutes total for all wikis not including enwiki and en_rtlwiki, plus about the same for just enwiki, and even less for en_rtlwiki) that we can just leave it to update.php.

If someone objects to that, though, I can split https://gerrit.wikimedia.org/r/327417 and re-run it manually once again after the first half of the split is merged.

https://gerrit.wikimedia.org/r/#/c/327417/1 is also a blocker for running it for real in production (although we don't use update.php, I feel like the prod updatelog should be accurate, for --force etc.), but not a blocker for the dry run.

Change 327402 merged by jenkins-bot:
FlowFixInconsistentBoards: Don't output non-critical error info

https://gerrit.wikimedia.org/r/327402

Reedy removed a subscriber: Reedy. Dec 15 2016, 1:59 PM

Mentioned in SAL (#wikimedia-operations) [2016-12-15T14:27:14Z] <zfilipin@tin> Synchronized php-1.29.0-wmf.5/extensions/Flow: SWAT: [[gerrit:327402|FlowFixInconsistentBoards: Dont output non-critical error info (T148057)]] (duration: 00m 56s)

Change 327417 merged by jenkins-bot:
FlowFixInconsistentBoards: Run in update.php, fix updatelog

https://gerrit.wikimedia.org/r/327417

Change 330611 had a related patch set uploaded (by Mattflaschen):
FlowFixInconsistentBoards: Run in update.php, fix updatelog

https://gerrit.wikimedia.org/r/330611

Change 330611 merged by jenkins-bot:
FlowFixInconsistentBoards: Run in update.php, fix updatelog

https://gerrit.wikimedia.org/r/330611

Mattflaschen-WMF added a comment. Edited Jan 6 2017, 4:09 AM

Last production dry run completed after T154724 ("Workflow ID null for some user talk pages on mediawikiwiki") was fixed: P4715

There were 606 inconsistent (and I believe all blank) Flow boards, all on gomwiki ("JSON content does not have a valid workflow ID."). These are messy, but have no user impact AFAICT. See T154623: Workflow ID null for some user talk pages on gomwiki.

Other than that, there were only two errors, both of a concerning kind (T154741):

ERROR: 'User talk:Eimanahmed80' refers to workflow ID 'suhadmgqu7o4iwgg', which could not be found.
ERROR: '<REDACTED>' refers to workflow ID 'sgkoegybywkraxl3', which could not be found.

This problem is not caused by T138310 to my knowledge, and I won't be taking any action on it now. Maybe the Flow DB rolled back and core did not.
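For triaging dry-run output, ERROR lines of the form quoted above can be pulled out with a small parser. This is only a sketch based on the two lines shown here; the regular expression and the function name parse_missing_workflow are assumptions, and other message formats the script may print are not handled.

```python
import re

# Matches lines shaped like the two ERROR lines quoted above.
MISSING_WORKFLOW = re.compile(
    r"^ERROR: '(?P<title>.*)' refers to workflow ID "
    r"'(?P<workflow_id>[^']+)', which could not be found\.$"
)


def parse_missing_workflow(lines):
    """Return (page_title, workflow_id) pairs for matching ERROR lines."""
    found = []
    for line in lines:
        match = MISSING_WORKFLOW.match(line.strip())
        if match:
            found.append((match.group("title"), match.group("workflow_id")))
    return found


if __name__ == "__main__":
    sample = [
        "ERROR: 'User talk:Eimanahmed80' refers to workflow ID "
        "'suhadmgqu7o4iwgg', which could not be found.",
        "some unrelated output line",
    ]
    for title, workflow_id in parse_missing_workflow(sample):
        print(title, workflow_id)
```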

There are 36 INCONSISTENT ones (that the script is intended to fix), all of which happen to be fixable (counting testwiki, not counting the ones fixed earlier) :).

Production runs complete:

One-offs using the script with a limit (to make sure the script didn't do anything crazy before running it on all wikis):

All remaining pages on all wikis: P4718

The logs look good, and I spot-checked the results. All should be fixed now. If your page still doesn't work, let us know.

Sorry for the inconvenience.

QA recommendation: Resolve.