Page MenuHomePhabricator

Remove reference to text fields replaced by the comment table from WMCS views
Closed, ResolvedPublic

Description

The next step for T166733: Deploy refactored comment storage is to finally remove most of the old comment fields. Before we can drop them in WMF production, we'll have to make the following changes to the WMCS views.

Non-compat view field removals:

  • archive: Remove ar_comment
  • archive_userindex: Remove ar_comment
  • filearchive:
    • Remove fa_deleted_reason
    • Remove fa_description
  • filearchive_userindex:
    • Remove fa_deleted_reason
    • Remove fa_description
  • image: Remove img_description
  • ipblocks: Remove ipb_reason.
  • ipblocks_ipindex: Remove ipb_reason.
  • logging: Remove log_comment.
  • logging_logindex: Remove log_comment.
  • logging_userindex: Remove log_comment.
  • oldimage: Remove oi_description
  • oldimage_userindex: Remove oi_description
  • recentchanges: Remove rc_comment.
  • recentchanges_userindex: Remove rc_comment.
  • revision: Remove rev_comment.
  • revision_userindex: Remove rev_comment.

Compat field changes:

  • archive_compat: (no changes)
  • filearchive_compat:
    • Change the alias for fa_deleted_reason to just comment_a.comment_text as fa_deleted_reason.
    • Change the alias for fa_description to just if(fa_deleted&2,NULL,comment_b.comment_text) as fa_description.
  • image_compat: Change the alias for img_description to just comment_text as img_description.
  • ipblocks_compat: Change the alias for ipb_reason to just comment_text as ipb_reason.
  • logging_compat: Change the alias for log_comment to just if(log_deleted&2,null,comment_text) as log_comment.
  • oldimage_compat: Change the alias for oi_description to just if(oi_deleted&2,null,comment_text) as oi_description.
  • protected_titles_compat: Change the alias for pt_reason to just comment_text as pt_reason.
  • recentchanges_compat: Change the alias for rc_comment to just if(rc_deleted&2,null,comment_text) as rc_comment.
  • revision_compat: Change the alias for rev_comment to just if(rev_deleted&2,null,comment_text) as rev_comment.

You might also change the various LEFT JOINs with the comment table in the compat views to be (INNER) JOINs instead, to more closely match what MediaWiki is going to be doing. On the other hand, leaving them as LEFT shouldn't hurt anything.

The end result should be no change in user-visible behavior for the compat views, as everything should already be hitting the comment_text cases, while users using the non-compat views will start getting errors about nonexistent fields if they haven't already updated their code to join with comment.

Event Timeline

Anomie created this task.Jan 4 2019, 8:31 PM
Anomie triaged this task as Low priority.
Restricted Application added a subscriber: Aklapper. · View Herald TranscriptJan 4 2019, 8:31 PM
bd808 moved this task from Backlog to Wiki replicas on the Data-Services board.Jan 4 2019, 8:43 PM
Lofhi added a subscriber: Lofhi.EditedJan 9 2019, 4:46 PM

It is reported on T212617 that ar_comment_id is missing in archive_userindex table.
Can we add this missing column to the list?

Anomie added a comment.Jan 9 2019, 6:15 PM

That's out of scope for this task. T212617 is the task for that.

Anomie raised the priority of this task from Low to Needs Triage.Jan 9 2019, 6:16 PM

Change 489242 had a related patch set uploaded (by Anomie; owner: Anomie):
[operations/puppet@production] wiki replicas: Remove reference to old comment fields

https://gerrit.wikimedia.org/r/489242

Change 489242 merged by Bstorm:
[operations/puppet@production] wiki replicas: Remove reference to old comment fields

https://gerrit.wikimedia.org/r/489242

Anomie removed Anomie as the assignee of this task.Feb 25 2019, 4:08 PM

Change 493718 had a related patch set uploaded (by Bstorm; owner: Bstorm):
[operations/puppet@production] wikireplicas: correct join for logging_compat

https://gerrit.wikimedia.org/r/493718

Change 493718 merged by Bstorm:
[operations/puppet@production] wikireplicas: correct join for logging_compat

https://gerrit.wikimedia.org/r/493718

Hi @Bstorm , given the merged patches above, I'm assuming comment-data is fully moved toward the comment table, and that we can (must) remove references to the comment-related fields in the data we gather from labsdb. Can you confirm? Many thanks :)

Bstorm added a comment.Mar 7 2019, 3:30 PM

I haven't had time to deploy it just yet across all replicas (it is deployed for huwiki and eswiki for testing because there was a broken column in them for some reason). I would say that will be true soon and will be true for new wikis. It needs announcing. The _compat tables will continue to have them.

Bstorm added a comment.Mar 7 2019, 3:31 PM

To be clear, there's an additional step after merge. This doesn't apply automatically to the databases on the replicas and must be manually run.

Bstorm added a comment.Mar 7 2019, 4:14 PM

Announcement sent with a date to fully deploy of 3/12

Thanks for the comments and the announcement @Bstorm . For the record, we have experienced problems with hewiki and eswiki, so I think the changes have applied for those two.

Bstorm added a comment.Mar 7 2019, 5:29 PM

Yes, I made a typo and said huwiki. Those two were applied early for a breakfix (there was something wrong with the view).

Bstorm added a comment.Mar 7 2019, 5:30 PM

If I roll this back to delay and whatnot, I can recreate those again with the columns :)

Bstorm added a comment.Mar 7 2019, 5:57 PM

Ok, the date is now set for 3/19 instead. I'll rebuild eswiki and hewiki again with the older version.

Change 494999 had a related patch set uploaded (by Bstorm; owner: Bstorm):
[operations/puppet@production] wiki replicas: Remove reference to old comment fields

https://gerrit.wikimedia.org/r/494999

Change 494999 had a related patch set uploaded (by Bstorm; owner: Bstorm):
[operations/puppet@production] wiki replicas: Remove reference to old comment fields

https://gerrit.wikimedia.org/r/494999

Change 494999 merged by Bstorm:
[operations/puppet@production] wiki replicas: Remove reference to old comment fields

https://gerrit.wikimedia.org/r/494999

Change 497652 had a related patch set uploaded (by Bstorm; owner: Bstorm):
[operations/puppet@production] wiki replicas: depool labsdb1010 for comment refactor changes

https://gerrit.wikimedia.org/r/497652

Change 497652 merged by Bstorm:
[operations/puppet@production] wiki replicas: depool labsdb1010 for comment refactor changes

https://gerrit.wikimedia.org/r/497652

Change 497674 had a related patch set uploaded (by Bstorm; owner: Bstorm):
[operations/puppet@production] wiki replicas: depool labsdb1011 for comment refactor changes

https://gerrit.wikimedia.org/r/497674

Change 497674 merged by Bstorm:
[operations/puppet@production] wiki replicas: depool labsdb1011 for comment refactor changes

https://gerrit.wikimedia.org/r/497674

Change 497679 had a related patch set uploaded (by Bstorm; owner: Bstorm):
[operations/puppet@production] wiki replicas: depool labsdb1009 for views changes

https://gerrit.wikimedia.org/r/497679

Change 497679 merged by Bstorm:
[operations/puppet@production] wiki replicas: depool labsdb1009 for views changes

https://gerrit.wikimedia.org/r/497679

The patch is deployed across all replicas.

The patch is deployed across all replicas.

Spot-checking on the enwiki_p replica, it looks like the changes to image_compat, ipblocks_compat, logging_compat, oldimage_compat, protected_titles_compat, recentchanges_compat, and revision_compat did not get applied. Those views are still referencing the to-be-deleted columns.

The changes to filearchive_compat and all the non-compat tables seem to have been done.

Bstorm added a comment.May 9 2019, 6:28 PM

I see it somewhere else as well. Let me try to figure out what's up there.

Bstorm added a comment.EditedMay 9 2019, 6:35 PM

There is a bug in the script (or somewhere) that is causing it to skip rebuilding some (_compat) tables. I think I know what it is. Since we changed to inner joins, there's a bug in there I suspected early on that it might miss tables that have them.

Yes, all the skipped tables have an inner join. I'll fix the script asap and fix this. @Anomie

Bstorm added a comment.May 9 2019, 6:45 PM

Failed to find table image, comment in database frwiki as a source for view image_compat definitely the bug.

Bstorm added a comment.May 9 2019, 8:53 PM

Ok, the general problem is that a view cannot have both an inner and a left join in the same view and have the script do it (without writing a parser and packaging this separately from puppet at that point).

Change 509148 had a related patch set uploaded (by Bstorm; owner: Bstorm):
[operations/puppet@production] wiki replicas: Fix misconfiguration in the views

https://gerrit.wikimedia.org/r/509148

Bstorm added a comment.May 9 2019, 9:21 PM

That should fix things as long as it looks ok. I tested on my local mediawiki vagrant.

Change 509148 merged by Bstorm:
[operations/puppet@production] wiki replicas: Fix misconfiguration in the views

https://gerrit.wikimedia.org/r/509148

I'm going to hold off the actual deploy until Monday in case this breaks a bunch of tools.

Aaand, I was out Monday and am catching up still. This may not get deployed until tomorrow.

For validation purposes, this is the first run of the fixed layout on a live wiki on protected_titles_compat

            VIEW `scwiki_p`.`protected_titles_compat`
            AS select pt_namespace, pt_title, pt_user, pt_reason_id, comment.comment_text as pt_reason, pt_timestamp, pt_expiry, pt_create_perm

            FROM `scwiki`.`protected_titles`,`scwiki`.`comment`
         WHERE comment_id = pt_reason_id
;

It looks right.

Starting the run on the rest of them

Change 510565 had a related patch set uploaded (by Bstorm; owner: Bstorm):
[operations/puppet@production] wiki replicas: depool labsdb1010 for view updates

https://gerrit.wikimedia.org/r/510565

Mentioned in SAL (#wikimedia-operations) [2019-05-15T16:26:45Z] <bstorm_> T212972 updated all views on labsdb1009

There is maintenance going on re: the replicas that is keeping some depooled and preventing me from proceeding for a while. @Anomie I'll need to know what level of urgency this has for you to know if we should adapt the maintenance activities around this or just wait it out. Compression is involved, so it'll be a while, I'm sure.

Mentioned in SAL (#wikimedia-operations) [2019-05-15T16:58:40Z] <bstorm_> T212972 updated all views on labsdb1012

How long is "a while"?

Probably doesn't matter too much. This is currently blocking me filing the schema change task for dropping the columns in production. But at the moment I'm hoping that the similar change for actor will be ready to go by early June (should we mail the mailing list now?), so I may wait on filing that until the actor production schema change is ready to go too.

Bstorm added a comment.EditedMay 17 2019, 3:29 PM

I've got a commitment to finish the last two replicas on Monday.

Change 510565 abandoned by Bstorm:
wiki replicas: depool labsdb1010 for view updates

Reason:
Abandoning for now. Doing stuff on Monday.

https://gerrit.wikimedia.org/r/510565

This should be all set, and I think you should be unblocked now @Anomie

Anomie closed this task as Resolved.Fri, May 24, 2:09 PM
Anomie claimed this task.

Looks good, thanks!