Page MenuHomePhabricator

Deploy schema change for adding numeric primary key to wbqc_constraints table
Closed, ResolvedPublicSep 9 2018

Description

We would like to add a numeric primary key to the wbqc_constraints table (replacing the current string as the primary key). As far as we’re concerned, this can be done at any time and isn’t urgent.

  • Full change: https://gerrit.wikimedia.org/r/392458, adds one column and one index (for the column which was previously the primary key).
  • ALTER TABLE: P7505
  • Where: wikidatawiki (s8), testwikidatawiki (s3).
  • When: at any time.
  • Backwards compatible: yes.
  • Tested already: locally and on deployment-prep.
  • Should be made available on labs replicas and/or dumps: yes, no sensitive data at all. (The full table is already available.)

Progress:

s3 (testwikidatawiki)

  • eqiad
  • codfw

s8 (wikidatawiki)

  • eqiad
  • codfw

Event Timeline

This is not an easy change as it needs to be done directly on the master, or otherwise ids will not match.
The table is pretty small (23M on disk - 16224 rows) so it can probably be done directly on the master. I would still prefer to go for the approach we took for other similar changes like: change_tag, tag_summary and watchlist tables which is:

  • Do the alter on codfw
  • Wait for the DC failover
  • Reimport the tables on eqiad.

I prefer that approach because:

-It is not an urgent request and failover might be happening on Q4, so not far away from now.

  • I prefer avoid doing such changes (adding auto_inc columns) directly on the master and getting it replicated thru all the replication chain if there is an alternative way.

failover might be happening on Q4, so not far away from now.

Just to clarify, is that the same as “FQ4 (april-june 2018)” mentioned in T186320#3940867? I was confused for a second – October-December 2018 felt fairly far away to me :)

WMF fiscal and annual year starts from July, so it's April-June thing and not October-December. The DC failovers happens around June-July all the time.

Okay, thanks :) waiting for that seems totally okay to me.

Just an update: we might go for a DC failover during Q1. Still to be confirmed, but if it is the case, we could try to go for the approach described at: T189101#4031549

@Lucas_Werkmeister_WMDE What is the plan with constraint_guid? Will you drop it? Shouldn't it keep the UNIQUE constraint while it is in use (and by extension, the index)? We need a quick answer if this is to be done in September.

constraint_guid should be kept. The SQL patch above re-adds the index on it, but not the UNIQUE constraint, so we’ll have to fix that. Thanks!

Please if you could give this a lot of priority, because if we miss this "train" (switchover) we could be waiting for a full extra year. Thanks. PK changes are not easy to do normally.

WMDE-leszek raised the priority of this task from Medium to High.Aug 27 2018, 2:16 PM
WMDE-leszek added a project: Wikidata-Campsite.
WMDE-leszek set Due Date to Sep 9 2018, 10:00 PM.
Restricted Application changed the subtype of this task from "Task" to "Deadline". · View Herald TranscriptAug 27 2018, 2:17 PM

Change 456607 had a related patch set uploaded (by Lucas Werkmeister (WMDE); owner: Lucas Werkmeister (WMDE)):
[mediawiki/extensions/WikibaseQualityConstraints@master] Make index on constraint_guid unique

https://gerrit.wikimedia.org/r/456607

Change 456607 merged by jenkins-bot:
[mediawiki/extensions/WikibaseQualityConstraints@master] Make index on constraint_guid unique

https://gerrit.wikimedia.org/r/456607

Lucas_Werkmeister_WMDE moved this task from Ready to estimate to Done on the Wikidata-Campsite board.

I’ve updated the task description with a new SQL snippet that summarizes both schema changes into one.

Thanks! We will try to get this done today or tomorrow!

Change 457363 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/puppet@production] filtered_tables: Add constraint_id column

https://gerrit.wikimedia.org/r/457363

Mentioned in SAL (#wikimedia-operations) [2018-09-03T07:34:29Z] <marostegui> Deploy schema change on s3.testwikidatawiki on codfw master - T189101

Mentioned in SAL (#wikimedia-operations) [2018-09-03T07:39:43Z] <marostegui> Deploy schema change on s8.wikidatawiki on codfw master - T189101

I have done this on codfw (testwikidatawiki and wikidatawiki).
So once eqiad is passive we will reimport the tables there and we'll be able to close this task.

This is how the table looks like now:

root@db2045.codfw.wmnet[wikidatawiki]> show create table wbqc_constraints\G
*************************** 1. row ***************************
       Table: wbqc_constraints
Create Table: CREATE TABLE `wbqc_constraints` (
  `constraint_id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `constraint_guid` varbinary(63) NOT NULL,
  `pid` int(11) NOT NULL,
  `constraint_type_qid` varbinary(25) NOT NULL,
  `constraint_parameters` blob,
  PRIMARY KEY (`constraint_id`),
  UNIQUE KEY `wbqc_constraints_guid_uniq` (`constraint_guid`),
  KEY `wbqc_constraints_pid_index` (`pid`)
) ENGINE=InnoDB AUTO_INCREMENT=20911 DEFAULT CHARSET=binary

Change 457363 merged by Marostegui:
[operations/puppet@production] filtered_tables: Add constraint_id column

https://gerrit.wikimedia.org/r/457363

Change 460342 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/mediawiki-config@master] db-codfw.php: Depool db2085:3318

https://gerrit.wikimedia.org/r/460342

Change 460342 merged by jenkins-bot:
[operations/mediawiki-config@master] db-codfw.php: Depool db2085:3318

https://gerrit.wikimedia.org/r/460342

Mentioned in SAL (#wikimedia-operations) [2018-09-13T12:54:50Z] <marostegui@deploy1001> Synchronized wmf-config/db-codfw.php: Depool db2085:3318 - T189101 (duration: 00m 49s)

Mentioned in SAL (#wikimedia-operations) [2018-09-13T12:55:15Z] <marostegui> Stop db1071 (s8 eqiad master) and db2085:3318 in sync - T189101

The table has been imported in eqiad hosts for wikidatawiki

root@neodymium:/home/marostegui# mysql.py -hdb1071 wikidatawiki -e "show create table wbqc_constraints\G"
*************************** 1. row ***************************
       Table: wbqc_constraints
Create Table: CREATE TABLE `wbqc_constraints` (
  `constraint_id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `constraint_guid` varbinary(63) NOT NULL,
  `pid` int(11) NOT NULL,
  `constraint_type_qid` varbinary(25) NOT NULL,
  `constraint_parameters` blob,
  PRIMARY KEY (`constraint_id`),
  UNIQUE KEY `wbqc_constraints_guid_uniq` (`constraint_guid`),
  KEY `wbqc_constraints_pid_index` (`pid`)
) ENGINE=InnoDB AUTO_INCREMENT=22320 DEFAULT CHARSET=binary ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=8

Mentioned in SAL (#wikimedia-operations) [2018-09-13T13:03:32Z] <marostegui@deploy1001> Synchronized wmf-config/db-codfw.php: Repool db2085:3318 - T189101 (duration: 00m 49s)

Change 460470 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/mediawiki-config@master] db-codfw.php: Depool db2050

https://gerrit.wikimedia.org/r/460470

Change 460470 merged by jenkins-bot:
[operations/mediawiki-config@master] db-codfw.php: Depool db2050

https://gerrit.wikimedia.org/r/460470

Mentioned in SAL (#wikimedia-operations) [2018-09-14T05:16:51Z] <marostegui@deploy1001> Synchronized wmf-config/db-codfw.php: Depool db2050 - T189101 (duration: 00m 49s)

Mentioned in SAL (#wikimedia-operations) [2018-09-14T05:16:59Z] <marostegui> Stop replication in sync on db1075 and db2050 - T189101

Table imported on testwikidatawiki on eqiad hosts:

root@db1075.eqiad.wmnet[testwikidatawiki]> show create table wbqc_constraints;
+------------------+-------------------------------------------------------------------------------------------------------------------
| Table            | Create Table
+------------------+-------------------------------------------------------------------------------------------------------------------
| wbqc_constraints | CREATE TABLE `wbqc_constraints` (
  `constraint_id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `constraint_guid` varbinary(255) NOT NULL,
  `pid` int(11) NOT NULL,
  `constraint_type_qid` varbinary(255) NOT NULL,
  `constraint_parameters` blob,
  PRIMARY KEY (`constraint_id`),
  UNIQUE KEY `wbqc_constraints_guid_uniq` (`constraint_guid`),
  KEY `wbqc_constraints_pid_index` (`pid`)
) ENGINE=InnoDB AUTO_INCREMENT=3 DEFAULT CHARSET=binary |
Marostegui updated the task description. (Show Details)

Mentioned in SAL (#wikimedia-operations) [2018-09-14T05:29:02Z] <marostegui@deploy1001> Synchronized wmf-config/db-codfw.php: Repool db2050 - T189101 (duration: 00m 49s)