Page MenuHomePhabricator

Drop unused columns from revision table in DatabaseUpdater (CommentStore, Actor, MCR)
Closed, ResolvedPublic

Description

The following schema changes should be applied to the revision table when updating to 1.35:

  • Drop rev_text_id, rev_content_model, and rev_content_format that MCR obsoleted.
  • Drop ar_text_id, ar_content_model, and ar_content_format that MCR obsoleted.
  • Replace rev_comment with rev_comment_id.
  • Replace rev_user and rev_user_text with rev_actor, plus associated index changes.

Event Timeline

Change 552339 had a related patch set uploaded (by Daniel Kinzler; owner: Anomie):
[mediawiki/core@master] Alter revision for actor, comment, and MCR

https://gerrit.wikimedia.org/r/552339

Nice to see this happening - is this a duplicate of T238966 though?
Reminder: once https://gerrit.wikimedia.org/r/#/c/552339/ is merged and you are confident we can go ahead with the schema changes in production, create a task following: https://wikitech.wikimedia.org/wiki/Schema_changes#Workflow_of_a_schema_change

Thank you!

Nice to see this happening - is this a duplicate of T238966 though?

No. This ticket is about dropping the fields when update.php is run. T238966 is about production DBs.

Reminder: once https://gerrit.wikimedia.org/r/#/c/552339/ is merged and you are confident we can go ahead with the schema changes in production, create a task following: https://wikitech.wikimedia.org/wiki/Schema_changes#Workflow_of_a_schema_change

Will do.

Nice to see this happening - is this a duplicate of T238966 though?

No. This ticket is about dropping the fields when update.php is run. T238966 is about production DBs.

Gotcha! Thank you!

daniel triaged this task as High priority.May 8 2020, 10:12 AM

Change 552339 merged by jenkins-bot:
[mediawiki/core@master] Alter revision for actor, comment, and MCR

https://gerrit.wikimedia.org/r/552339

$ diff -y before after
> DESCRIBE revision;						> DESCRIBE revision;
stdClass Object							stdClass Object
(								(
    [Field] => rev_id						    [Field] => rev_id
    [Type] => int(10) unsigned					    [Type] => int(10) unsigned
    [Null] => NO						    [Null] => NO
    [Key] => PRI						    [Key] => PRI
    [Default] =>						    [Default] =>
    [Extra] => auto_increment					    [Extra] => auto_increment
)								)
stdClass Object							stdClass Object
(								(
    [Field] => rev_page						    [Field] => rev_page
    [Type] => int(10) unsigned					    [Type] => int(10) unsigned
    [Null] => NO						    [Null] => NO
    [Key] => MUL						    [Key] => MUL
    [Default] =>						    [Default] =>
    [Extra] =>							    [Extra] =>
)								)
stdClass Object							stdClass Object
(								(
    [Field] => rev_text_id				      |	    [Field] => rev_comment_id
    [Type] => int(10) unsigned				      |	    [Type] => bigint(20) unsigned
    [Null] => NO						    [Null] => NO
    [Key] =>							    [Key] =>
    [Default] => 0						    [Default] => 0
    [Extra] =>							    [Extra] =>
)								)
stdClass Object							stdClass Object
(								(
    [Field] => rev_comment				      |	    [Field] => rev_actor
    [Type] => varbinary(767)				      |	    [Type] => bigint(20) unsigned
    [Null] => NO					      <
    [Key] =>						      <
    [Default] =>					      <
    [Extra] =>						      <
)							      <
stdClass Object						      <
(							      <
    [Field] => rev_user					      <
    [Type] => int(10) unsigned				      <
    [Null] => NO						    [Null] => NO
    [Key] => MUL						    [Key] => MUL
    [Default] => 0						    [Default] => 0
    [Extra] =>							    [Extra] =>
)								)
stdClass Object							stdClass Object
(								(
    [Field] => rev_user_text				      <
    [Type] => varbinary(255)				      <
    [Null] => NO					      <
    [Key] => MUL					      <
    [Default] =>					      <
    [Extra] =>						      <
)							      <
stdClass Object						      <
(							      <
    [Field] => rev_timestamp					    [Field] => rev_timestamp
    [Type] => binary(14)					    [Type] => binary(14)
    [Null] => NO						    [Null] => NO
    [Key] => MUL						    [Key] => MUL
    [Default] =>						    [Default] =>
    [Extra] =>							    [Extra] =>
)								)
stdClass Object							stdClass Object
(								(
    [Field] => rev_minor_edit					    [Field] => rev_minor_edit
    [Type] => tinyint(3) unsigned				    [Type] => tinyint(3) unsigned
    [Null] => NO						    [Null] => NO
    [Key] =>							    [Key] =>
    [Default] => 0						    [Default] => 0
    [Extra] =>							    [Extra] =>
)								)
stdClass Object							stdClass Object
(								(
    [Field] => rev_deleted					    [Field] => rev_deleted
    [Type] => tinyint(3) unsigned				    [Type] => tinyint(3) unsigned
    [Null] => NO						    [Null] => NO
    [Key] =>							    [Key] =>
    [Default] => 0						    [Default] => 0
    [Extra] =>							    [Extra] =>
)								)
stdClass Object							stdClass Object
(								(
    [Field] => rev_len						    [Field] => rev_len
    [Type] => int(10) unsigned					    [Type] => int(10) unsigned
    [Null] => YES						    [Null] => YES
    [Key] =>							    [Key] =>
    [Default] =>						    [Default] =>
    [Extra] =>							    [Extra] =>
)								)
stdClass Object							stdClass Object
(								(
    [Field] => rev_parent_id					    [Field] => rev_parent_id
    [Type] => int(10) unsigned					    [Type] => int(10) unsigned
    [Null] => YES						    [Null] => YES
    [Key] =>							    [Key] =>
    [Default] =>						    [Default] =>
    [Extra] =>							    [Extra] =>
)								)
stdClass Object							stdClass Object
(								(
    [Field] => rev_sha1						    [Field] => rev_sha1
    [Type] => varbinary(32)					    [Type] => varbinary(32)
    [Null] => NO						    [Null] => NO
    [Key] =>							    [Key] =>
    [Default] =>						    [Default] =>
    [Extra] =>							    [Extra] =>
)								)
stdClass Object						      <
(							      <
    [Field] => rev_content_model			      <
    [Type] => varbinary(32)				      <
    [Null] => YES					      <
    [Key] =>						      <
    [Default] =>					      <
    [Extra] =>						      <
)							      <
stdClass Object						      <
(							      <
    [Field] => rev_content_format			      <
    [Type] => varbinary(64)				      <
    [Null] => YES					      <
    [Key] =>						      <
    [Default] =>					      <
    [Extra] =>						      <
)							      <

I think we can declare this Resolved. Though we should add some special release notes.

I think we can declare this Resolved. Though we should add some special release notes.

The release notes for 1.35 already have this blurb:

1.35 has several database changes since 1.34, and will not work without schema 
updates. Note that due to changes to some very large tables like the revision
table, the schema update may take quite long (minutes on a medium sized site,
many hours on a large site).

Don't forget to always back up your database before upgrading!

I'll add a more specific note about the schema changes.

Change 595254 had a related patch set uploaded (by Daniel Kinzler; owner: Daniel Kinzler):
[mediawiki/core@master] Release notes for revision table changes

https://gerrit.wikimedia.org/r/595254

Change 595254 merged by jenkins-bot:
[mediawiki/core@master] Release notes for revision table changes

https://gerrit.wikimedia.org/r/595254

@daniel question for you:
Just came across postgres/archives/patch-textsearch_bug66650.sql:

UPDATE /*_*/pagecontent SET textvector=to_tsvector(old_text)
WHERE textvector IS NULL AND old_id IN 
(SELECT  max(rev_text_id) FROM revision GROUP BY rev_page);

INSERT INTO /*_*/updatelog(ul_key) VALUES ('patch-textsearch_bug66650.sql');

Will this be broken now that rev_text_id is being removed?

Will this be broken now that rev_text_id is being removed?

Yes.

I don't know how the default fulltext search works in PostGres, but feeding the fulltext index needs to be done programmatically. I'd assume we have code for this already, for use with MySQL.