Page MenuHomePhabricator

Add wl_id to watchlist tables on production dbs
Closed, ResolvedPublic

Description

The ALTER TABLES to run: https://gerrit.wikimedia.org/r/#/c/271435/5/maintenance/archives/patch-watchlist-wl_id.sql

Where to run those changes: All mediawiki dbs

When to run those changes: ASAP

If the schema change is backwards compatible: compatible with the current code deployed.

If the schema change has been tested already on some of the test/beta wikis. Usually, as a last test, change should be applied
to testwiki first: We will be testing on Labs, Beta and testwiki (along with deployment). Nothing actually uses the id field yet so this basically should be a noop

If it involves new columns or tables, if the data should be made available on the labs replicas. Similar question if it involves deletion of data previously available on labs. YES

Status as of 2017-04-27: Done on eqiad, pending on codfw. Do not deploy any code based on it yet.

  • Status in codfw as of 2017-05-09
  • s1
  • s2
  • s3
  • s4
  • s5
  • s6
  • s7
  • silver
  • labtestweb2001
  • Check masters and hosts on both DCs as a final check

Related Objects

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes

Watchlist change should work despite not being deployed on codfw:

mysql1> set sql_log_bin=0;
Query OK, 0 rows affected (0.00 sec)

mysql1> ALTER TABLE /*$wgDBprefix*/watchlist
    ->   ADD COLUMN wl_id int unsigned NOT NULL AUTO_INCREMENT FIRST,
    ->   ADD PRIMARY KEY (wl_id);
Query OK, 0 rows affected (0.08 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql1> SELECT * FROM watchlist;
+-------+---------+--------------+-----------+--------------------------+
| wl_id | wl_user | wl_namespace | wl_title  | wl_notificationtimestamp |
+-------+---------+--------------+-----------+--------------------------+
|     1 |       1 |            0 | Main_page | 20170424204319           |
|     2 |       1 |            0 | Wikipedia | 20170424204344           |
+-------+---------+--------------+-----------+--------------------------+
2 rows in set (0.00 sec)

mysql2> SELECT * FROM watchlist;
+---------+--------------+-----------+--------------------------+
| wl_user | wl_namespace | wl_title  | wl_notificationtimestamp |
+---------+--------------+-----------+--------------------------+
|       1 |            0 | Main_page | 20170424204319           |
|       1 |            0 | Wikipedia | 20170424204344           |
+---------+--------------+-----------+--------------------------+
2 rows in set (0.00 sec)

[restart client from mysql1]

mysql1> insert into watchlist (wl_user, wl_namespace, wl_title, wl_notificationtimestamp) VALUES (1, 0, 'Error?', '20170424204855');
Query OK, 1 row affected (0.00 sec)

mysql1> SELECT * FROM watchlist;
+-------+---------+--------------+-----------+--------------------------+
| wl_id | wl_user | wl_namespace | wl_title  | wl_notificationtimestamp |
+-------+---------+--------------+-----------+--------------------------+
|     1 |       1 |            0 | Main_page | 20170424204319           |
|     2 |       1 |            0 | Wikipedia | 20170424204344           |
|     4 |       1 |            0 | Error?    | 20170424204855           |
+-------+---------+--------------+-----------+--------------------------+
3 rows in set (0.00 sec)

mysql2> SELECT * FROM watchlist;
+---------+--------------+-----------+--------------------------+
| wl_user | wl_namespace | wl_title  | wl_notificationtimestamp |
+---------+--------------+-----------+--------------------------+
|       1 |            0 | Main_page | 20170424204319           |
|       1 |            0 | Wikipedia | 20170424204344           |
|       1 |            0 | Error?    | 20170424204855           |
+---------+--------------+-----------+--------------------------+
3 rows in set (0.00 sec)

Mentioned in SAL (#wikimedia-operations) [2017-04-25T06:34:12Z] <marostegui> Deploy alter table on s3, all the wikis to the watchlist table on db1075, eqiad master - T130067

Thanks @jcrespo for testing it!
I have altered all s3 wikis (that have watchlist table) now on the master.
There were some wikis that already had the right definition and complained ERROR 1060 (42S21) at line 1: Duplicate column name 'wl_id':

arbcom_cswiki
dtywiki
ecwikimedia
fiwikivoyage
jamwiki
katesdb
olowiki
pawikisource
projectcomwiki
ptwikimedia
quwiktionary
tcywiki
wbwikimedia

I have double check those and they are indeed correct.
I am going to move now to s6

s6 is done. db1093 already had the correct schema in all the databases so it failed. I skipped that transaction:

Last_SQL_Error: Error 'Multiple primary key defined' on query. Default database: 'frwiki'. Query: 'alter table watchlist
| watchlist | CREATE TABLE `watchlist` (
  `wl_user` int(5) unsigned NOT NULL DEFAULT '0',
  `wl_namespace` int(11) NOT NULL DEFAULT '0',
  `wl_title` varbinary(255) NOT NULL DEFAULT '',
  `wl_notificationtimestamp` varbinary(14) DEFAULT NULL,
  PRIMARY KEY (`wl_user`,`wl_namespace`,`wl_title`),
  KEY `namespace_title` (`wl_namespace`,`wl_title`),
  KEY `wl_user_notificationtimestamp` (`wl_user`,`wl_notificationtimestamp`)
) ENGINE=InnoDB DEFAULT CHARSET=binary

Mentioned in SAL (#wikimedia-operations) [2017-04-25T08:52:39Z] <marostegui> Deploy alter table s4 commonswiki.watchlist directly on db1068 (eqiad master) - T130067

Mentioned in SAL (#wikimedia-operations) [2017-04-25T08:56:29Z] <marostegui> Stop replication on db1088 and db1093 in sync - T130067

Thanks @jcrespo for testing it!
I have altered all s3 wikis (that have watchlist table) now on the master.
There were some wikis that already had the right definition and complained ERROR 1060 (42S21) at line 1: Duplicate column name 'wl_id':

I guess these are wikis that have been created since wl_id was merged in core (which was quite some time ago)

Thanks @jcrespo for testing it!
I have altered all s3 wikis (that have watchlist table) now on the master.
There were some wikis that already had the right definition and complained ERROR 1060 (42S21) at line 1: Duplicate column name 'wl_id':

I guess these are wikis that have been created since wl_id was merged in core (which was quite some time ago)

Yep, most likely. As those were like that on the master already, there is no need to reimport the tables again.

db1093 watchlist table from frwiki, jawiki and ruwiki has been re-imported.

s4 is done
db1091 had the schema already there, so I will reimport its table once the other slaves have caught up

Mentioned in SAL (#wikimedia-operations) [2017-04-25T09:46:54Z] <marostegui> Deploy alter table s2 on watchlist table directly on the master (db1054) - T130067

Mentioned in SAL (#wikimedia-operations) [2017-04-25T09:49:14Z] <marostegui> Stop replication in sync on db1091 and db1084 for maintenance - T130067

s4 is done
db1091 had the schema already there, so I will reimport its table once the other slaves have caught up

This has been done. db1091 has the table reimported.

s2 is done. db1090 needs to get the table reimported.

Mentioned in SAL (#wikimedia-operations) [2017-04-25T10:39:03Z] <marostegui> Stop replication in sync on db1090 and db1076 for maintenance - https://phabricator.wikimedia.org/T130067

db1090 has been reimported from db1076

Mentioned in SAL (#wikimedia-operations) [2017-04-25T11:27:56Z] <marostegui> Deploy alter table s1 on watchlist table directly on the master (db1052) - https://phabricator.wikimedia.org/T130067

Mentioned in SAL (#wikimedia-operations) [2017-04-25T11:46:58Z] <marostegui> Deploy alter table s5 on watchlist table directly on the master (db1049) - https://phabricator.wikimedia.org/T130067

s5 is done. db1092 needs its table reimported.

Mentioned in SAL (#wikimedia-operations) [2017-04-25T12:35:52Z] <marostegui> Stop replication in sync on db1092 and db1087 for maintenance - https://phabricator.wikimedia.org/T130067

s5 is done. db1092 needs its table reimported.

table reimported on db1092 from db1087

Mentioned in SAL (#wikimedia-operations) [2017-04-25T13:15:23Z] <marostegui> Deploy alter table on silver.watchlist and labtestweb2001.labtestwiki for the watchlist table - T130067

I have altered labstestweb2001 and silver too to avoid the issues encountered at: T155605 yesterday (column missing on those two hosts and after deployment the start throwing errors)

s1 is done. db1089 needs its table reimported, I will do so once the slaves have caught up

Mentioned in SAL (#wikimedia-operations) [2017-04-25T14:17:05Z] <marostegui> Stop replication in sync on db1089 and db1083 for maintenance - https://phabricator.wikimedia.org/T130067

dbstore1001 just got the change for etwiki fine (which was done yesterday). So tomorrow I will expect all the changes to go fine (apart from the ones that failed on dbstore1002 for databases that no longer existed there)

Mentioned in SAL (#wikimedia-operations) [2017-04-25T15:14:35Z] <marostegui> Deploy alter table s7 on watchlist table directly on the master (db1062) - https://phabricator.wikimedia.org/T130067

s1 is done. db1089 needs its table reimported, I will do so once the slaves have caught up

s1: db1089's watchlist table has been reimported from db1083

s7 is done. db1094 needs the table reimported

Mentioned in SAL (#wikimedia-operations) [2017-04-25T15:46:25Z] <marostegui> Stop replication on db1086 and db1094 in sync - https://phabricator.wikimedia.org/T130067

s7 is done. db1094 needs the table reimported

db1094 tables have been reimported from db1086.

I believe all the eqiad shards are now done, I will double check all the wikis tomorrow morning again, to make sure nothing was left aside

As a note for @Addshore - the field should not be used or depend on it until tables are reimported on codfw, which cannot happen before we failback to eqiad.

As a note for @Addshore - the field should not be used or depend on it until tables are reimported on codfw, which cannot happen before we failback to eqiad.

Acknowledged

I have double checked all the shards now and the wikis that contain that table, they all have the PRIMARY key now in eqiad.

Mentioned in SAL (#wikimedia-operations) [2017-04-27T06:08:59Z] <marostegui> Deploy alter table on s5 (wikidatawiki) on db1049 - T130067 T162539

Mentioned in SAL (#wikimedia-operations) [2017-04-27T06:11:00Z] <marostegui> Deploy alter table on s5 (wikidatawiki) on db1070 (running locally instead of neodymium as this host will be affected by the network maintenance) - T130067 T162539

Ignore the last two posts from Stashbot, it was for another ticket number

Mentioned in SAL (#wikimedia-operations) [2017-05-09T07:27:30Z] <marostegui> Disable replication codfw > eqiad on s6 - T147166 T130067

Mentioned in SAL (#wikimedia-operations) [2017-05-09T07:31:22Z] <marostegui> Stop replication at the same position on db1050 and db2028 - T147166 T130067

s6 codfw is done:

root@neodymium:/home/marostegui/git/software/dbtools# for i in frwiki jawiki ruwiki; do echo $i; mysql --skip-ssl -hdb2028.codfw.wmnet $i -e "show create table watchlist\G" | grep PRIMARY;done
frwiki
  PRIMARY KEY (`wl_id`),
jawiki
  PRIMARY KEY (`wl_id`),
ruwiki
  PRIMARY KEY (`wl_id`),

Change 352767 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/mediawiki-config@master] db-eqiad.php: Depool db1097

https://gerrit.wikimedia.org/r/352767

Change 352767 merged by jenkins-bot:
[operations/mediawiki-config@master] db-eqiad.php: Depool db1097

https://gerrit.wikimedia.org/r/352767

Mentioned in SAL (#wikimedia-operations) [2017-05-09T09:37:12Z] <marostegui@tin> Synchronized wmf-config/db-eqiad.php: Depool db1097 - T147166 T130067 (duration: 00m 41s)

s4 in codfw is done:

root@neodymium:/home/marostegui/git/software/dbtools#  mysql --skip-ssl -hdb2019.codfw.wmnet commonswiki -e "show create table watchlist\G " | grep PRIMARY
  PRIMARY KEY (`wl_id`),

Mentioned in SAL (#wikimedia-operations) [2017-05-09T11:27:15Z] <marostegui@tin> Synchronized wmf-config/db-eqiad.php: Restore original weight for db1097 - T147166 T130067 (duration: 00m 39s)

s5 in codfw is done:

root@neodymium:/home/marostegui/git/software/dbtools# for i in dewiki wikidatawiki; do echo $i; mysql --skip-ssl -hdb2023.codfw.wmnet $i -e "show create table watchlist\G " | grep PRIMARY;done
dewiki
  PRIMARY KEY (`wl_id`),
wikidatawiki
  PRIMARY KEY (`wl_id`),

Mentioned in SAL (#wikimedia-operations) [2017-05-10T07:16:04Z] <marostegui> Disable replication codfw > eqiad on s2 -T147166 T130067

Mentioned in SAL (#wikimedia-operations) [2017-05-10T07:30:41Z] <marostegui> Stop replication at the same position on db10418 and db2017 - T147166 https://phabricator.wikimedia.org/T130067

s2 in codfw is done:

root@neodymium:/home/marostegui/git/software/dbtools# for i in `cat /home/marostegui/s7_dbs`; do echo $i; mysql --skip-ssl -hdb2017.codfw.wmnet $i -e "show create table watchlist\G"| grep PRIMARY;done
bgwiki
  PRIMARY KEY (`wl_id`),
bgwiktionary
  PRIMARY KEY (`wl_id`),
cswiki
  PRIMARY KEY (`wl_id`),
enwikiquote
  PRIMARY KEY (`wl_id`),
enwiktionary
  PRIMARY KEY (`wl_id`),
eowiki
  PRIMARY KEY (`wl_id`),
fiwiki
  PRIMARY KEY (`wl_id`),
idwiki
  PRIMARY KEY (`wl_id`),
itwiki
  PRIMARY KEY (`wl_id`),
nlwiki
  PRIMARY KEY (`wl_id`),
nowiki
  PRIMARY KEY (`wl_id`),
plwiki
  PRIMARY KEY (`wl_id`),
ptwiki
  PRIMARY KEY (`wl_id`),
svwiki
  PRIMARY KEY (`wl_id`),
thwiki
  PRIMARY KEY (`wl_id`),
trwiki
  PRIMARY KEY (`wl_id`),
zhwiki
  PRIMARY KEY (`wl_id`),

Mentioned in SAL (#wikimedia-operations) [2017-05-10T10:43:08Z] <marostegui> Disable replication codfw > eqiad on s7 - T147166 T130067

Mentioned in SAL (#wikimedia-operations) [2017-05-10T10:50:51Z] <marostegui> Stop replication at the same position on db1033 and db2029 - T147166 T130067

s7 in codfw is done:

root@neodymium:~# for i in `cat /home/marostegui/s2_dbs`; do echo $i; mysql --skip-ssl -hdb2029.codfw.wmnet $i -e "show create table watchlist\G"| grep PRIMARY;done
arwiki
  PRIMARY KEY (`wl_id`),
cawiki
  PRIMARY KEY (`wl_id`),
eswiki
  PRIMARY KEY (`wl_id`),
fawiki
  PRIMARY KEY (`wl_id`),
frwiktionary
  PRIMARY KEY (`wl_id`),
hewiki
  PRIMARY KEY (`wl_id`),
huwiki
  PRIMARY KEY (`wl_id`),
kowiki
  PRIMARY KEY (`wl_id`),
metawiki
  PRIMARY KEY (`wl_id`),
rowiki
  PRIMARY KEY (`wl_id`),
ukwiki
  PRIMARY KEY (`wl_id`),
viwiki
  PRIMARY KEY (`wl_id`),

Change 353061 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/mediawiki-config@master] db-eqiad.php: Depool db1067

https://gerrit.wikimedia.org/r/353061

Change 353061 merged by Marostegui:
[operations/mediawiki-config@master] db-eqiad.php: Depool db1067

https://gerrit.wikimedia.org/r/353061

Mentioned in SAL (#wikimedia-operations) [2017-05-10T14:50:01Z] <marostegui@tin> Synchronized wmf-config/db-eqiad.php: Depool db1067 - T147166 T130067 (duration: 00m 43s)

Mentioned in SAL (#wikimedia-operations) [2017-05-10T15:32:54Z] <marostegui@tin> Synchronized wmf-config/db-eqiad.php: Depool db1067 - T147166 T130067 (duration: 01m 43s)

Mentioned in SAL (#wikimedia-operations) [2017-05-11T05:56:54Z] <marostegui@tin> Synchronized wmf-config/db-eqiad.php: Repool db1067 - T147166 T130067 (duration: 00m 57s)

s1 in codfw is done:

root@neodymium:~#  mysql --skip-ssl -hdb2016.codfw.wmnet enwiki -e "show create table watchlist\G"| grep PRIMARY
  PRIMARY KEY (`wl_id`),

s3 in codfw is done - small example

root@db2018:/tmp# for i in `mysql --skip-ssl -e "select table_schema from information_schema.tables where table_name = 'watchlist';" -BN`; do echo $i; mysql --skip-ssl $i -e "show create table watchlist\G" | grep PRIMARY;done
aawiki
  PRIMARY KEY (`wl_id`),
aawikibooks
  PRIMARY KEY (`wl_id`),
aawiktionary
  PRIMARY KEY (`wl_id`),
abwiki
  PRIMARY KEY (`wl_id`),
abwiktionary
  PRIMARY KEY (`wl_id`),
acewiki
  PRIMARY KEY (`wl_id`),
advisorywiki
  PRIMARY KEY (`wl_id`),
adywiki
  PRIMARY KEY (`wl_id`),
afwiki
  PRIMARY KEY (`wl_id`),
afwikibooks
  PRIMARY KEY (`wl_id`),
afwikiquote
  PRIMARY KEY (`wl_id`),
afwiktionary
  PRIMARY KEY (`wl_id`),
akwiki
  PRIMARY KEY (`wl_id`),
akwikibooks
  PRIMARY KEY (`wl_id`),
akwiktionary
  PRIMARY KEY (`wl_id`),
alswiki
  PRIMARY KEY (`wl_id`),
alswikibooks
  PRIMARY KEY (`wl_id`),

I am now going to review all the masters on both dcs to make sure they all have it before closing this ticket.

Everything is looking good.
The only host which still doesn't have the new column is dbstore2001 as it is our delayed slave. It should start getting it tomorrow

Everything is looking good.
The only host which still doesn't have the new column is dbstore2001 as it is our delayed slave. It should start getting it tomorrow

Awesome! so we are ready to start using this in mediawiki? :D

Everything is looking good.
The only host which still doesn't have the new column is dbstore2001 as it is our delayed slave. It should start getting it tomorrow

Awesome! so we are ready to start using this in mediawiki? :D

Yes, you can. But if you don't mind waiting 24 hours so it gets done on dbstore2001 too, that would be nice!
Thanks