Page MenuHomePhabricator

Add local_user_id and global_user_id fields to localuser table in centralauth database
Closed, ResolvedPublic5 Estimated Story Points

Description

For cross-wiki watchlists we will need to be able to get all of the local user IDs for a user without joining to all the local user tables. While we're at it, we should add the global user IDs as well.

  • The ALTER TABLES to run: https://gerrit.wikimedia.org/r/#/c/304811/10/db_patches/patch-lu_local_id.sql
  • Where to run those changes: s7 core shard
  • When to run those changes: ASAP
  • If the schema change is backwards compatible: compatible with the current code deployed.
  • If the schema change has been tested already on some of the test/beta wikis: It has been applied and tested on Beta Cluster. It isn't possible to test on test wiki beforehand since all production wikis share a single localuser centralauth table (AFAIK).
  • If it involves new columns or tables, if the data should be made available on the labs replicas: It adds two new columns to the localuser table. These columns should have the same replication status as the rest of the table, which is to allow replication to labs.

Related Objects

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes

During E235: ArchCom RFC Meeting W29: Devise plan for a cross-wiki watchlist back-end (2016-07-20, #wikimedia-office) @Anomie and @tstarling suggested not including the local user ID was a design flaw in CentralAuth that should be rectified. They didn't speak as strongly about including the global user ID but one of them suggested we might as well do it while we're in there.

kaldari updated the task description. (Show Details)

Current CA schema: https://phabricator.wikimedia.org/diffusion/ECAU/browse/master/central-auth.sql

New columns would be added to the localuser table (and possibly localnames?).

DannyH triaged this task as Medium priority.Aug 9 2016, 5:41 PM
DannyH set the point value for this task to 5.

Change 304811 had a related patch set uploaded (by Niharika29):
[WIP] Add local ID and global ID to localuser table

https://gerrit.wikimedia.org/r/304811

Questions:

  • What new indexes, if any, should I create for the table?
  • Is the migration patch named appropriately? I couldn't find a pattern in the previous ones.
  • The int(10) holds for both local ID and global ID, right?

Questions:

  • What new indexes, if any, should I create for the table?

I think we will want a unique index on (lu_wiki, lu_local_id) to mirror the current primary index on (lu_wiki, lu_name). That would allow efficient lookup by local id.

  • Is the migration patch named appropriately? I couldn't find a pattern in the previous ones.

Looks ok to me. I agree that there isn't a clear naming convention here. I also don't see any mechanism for describing when migration has been introduced or automatically applying it.

  • The int(10) holds for both local ID and global ID, right?

The (10) after int actually doesn't change anything at all in the MySQL storage layer. It's a width hint for display of ZEROFILL values. Since we are not using ZEROFILL in the declaration it does nothing, but also doesn't hurt anything.

The important part for our user ids is UNSIGNED which doubles the max stored value to 4,294,967,295. The declaration from mediawiki/core is int unsigned. Your INT(10) UNSIGNED is equivalent.

Questions:

  • What new indexes, if any, should I create for the table?

I think we will want a unique index on (lu_wiki, lu_local_id) to mirror the current primary index on (lu_wiki, lu_name). That would allow efficient lookup by local id.

We might also want to make that UNIQUE to assert that there's only one wiki+id pair.

An index on lu_global_id might also make sense for joining with the globaluser table. If we do (lu_global_id,lu_wiki) it could also be UNIQUE since there should be only one local user per global ID per wiki.

In general, look at what queries get changed to use the new columns and see that an appropriate index exists. It's probably not a big deal to have to add new indexes with a later patch that actually changes the queries.

We might also want to make that UNIQUE to assert that there's only one wiki+id pair.

An index on lu_global_id might also make sense for joining with the globaluser table. If we do (lu_global_id,lu_wiki) it could also be UNIQUE since there should be only one local user per global ID per wiki.

Creating unique indexes caused existing tests to fail. Will modify.

From my chat with Reedy on IRC, not setting a default for the id fields will cause problems on production when altering the schema. He suggested a 2 step process: alter schema and populate then columns, and then add unique indexes. I modified the patch to have default null value for the columns.

From my chat with Reedy on IRC, not setting a default for the id fields will cause problems on production when altering the schema. He suggested a 2 step process: alter schema and populate then columns, and then add unique indexes. I modified the patch to have default null value for the columns.

I suspect, it's not worth adding any indexes, if we're just going to have to drop them, and recreate them as unique... Unless mysql has some way of nicely changing to UNIQUE

You cannot "update" indexes, however, dropping one and creating another is as fast (or as slow) as adding one or deleting one separatelly (it is done at the same time). This is mostly true for any table operation (that is why sometimes enqueuing actions is a big win).

Change 304811 merged by jenkins-bot:
Schema change: Add local ID and global ID to localuser table

https://gerrit.wikimedia.org/r/304811

I'm going to move this to Done for Community Tech (since the SQL patch is merged), but leave the ticket open since the schema change still has to be manually run by the DBAs. @jcrespo: Let me know if you have any ETA for running this patch in production.

kaldari renamed this task from Add local_user_id and global_user_id to localuser table in centralauth database to Add local_user_id and global_user_id fields to localuser table in centralauth database.Sep 9 2016, 6:53 PM
kaldari reassigned this task from Niharika to jcrespo.
kaldari moved this task from Needs Review/Feedback to Q1 2018-19 on the Community-Tech-Sprint board.
kaldari updated the task description. (Show Details)
kaldari added a subscriber: Niharika.
jcrespo removed a project: DBA.

@kaldari Please read the workflow for schema changes on production: https://wikitech.wikimedia.org/wiki/Schema_changes This is not a yesterday's new thing invented to make your life more complicated; it was literaly adopted as a recommendation on mediawiki's development policy: https://www.mediawiki.org/wiki/Development_policy#Database_patches with the ok of the Architecture committee.

Tip: DBAs sometimes complain that they are sent work and they do not get support or context from developers. Let's prove they are wrong by following their recommendations (no matter how stupid they look like); let's get interested about an ETA, making sure they have everything they need, let's follow up if they take more time than initially estimated, etc.

@jcrespo: Sorry, I just forgot about the workflow requirements. Thanks for reminding me.

kaldari updated the task description. (Show Details)

What is the purpose of lu_global_id?

I will own this task - going to start doing some tests to see how we can safely update this table as it is critical - will come back to you once we know.

Also, from now on, if possible @kaldari please try to create tasks in the DBA project instead of assigning it to @jcrespo directly as now with two DBAs it could be picked up by any of us.

Thank you!

Mentioned in SAL [2016-09-12T10:04:21Z] <marostegui> Testing schema change on db1039 - T141951

I am testing the following in db1039 (which is a non pooled slave):

osc_host.sh --host=db1039.eqiad.wmnet --port=3306 --db=centralauth --table=localuser --method=ddl --no-replicate "ADD COLUMN lu_local_id INT(10) UNSIGNED DEFAULT NULL, ADD COLUMN lu_global_id INT(10) UNSIGNED DEFAULT NULL";

The idea is to know how long it takes, leave it replicating for 24h and if we don't see any issues, we will proceed and do it slave by slave and finally on the master.

@kaldari I will keep you posted on the progress so you can have an ETA

What is the purpose of lu_global_id?

From my understanding: To facilitate easier mapping from local_id <-> global_id instead of joining on the name.

I found it mentioned on E235: ArchCom RFC Meeting W29: Devise plan for a cross-wiki watchlist back-end (2016-07-20, #wikimedia-office) in at least one place:

21:46:53 <anomie> I'd say add the column to the localuser table. And at the same time possibly add gu_id, instead of having to use the name to join.

The alter is finished in db1039 - it took around 1.5h to complete.
If the slave runs fine for 24h or so, we can start running it on other different slaves and keep this running.

Hello,

db1039 looks good, and so does error log, so I am going to proceed and execute the ALTER in the following servers (one by one):

dbstore2001.codfw.wmnet - done
dbstore2002.codfw.wmnet - done
db2040.codfw.wmnet - done
db2047.codfw.wmnet - done
db2054.codfw.wmnet - done
db2061.codfw.wmnet - done
db2068.codfw.wmnet - done
db2029.codfw.wmnet - done
root@neodymium:/home/marostegui# for i in dbstore2001 dbstore2002 db2040 db2047 db2054 db2061 db2068 db2029; do echo $i; mysql -h$i.codfw.wmnet centralauth -e "show create table localuser;" | egrep "lu_local_id|lu_global_id";done
dbstore2001
localuser	CREATE TABLE `localuser` (\n  `lu_wiki` varbinary(255) NOT NULL DEFAULT '',\n  `lu_name` varbinary(255) NOT NULL DEFAULT '',\n  `lu_attached_timestamp` varbinary(14) DEFAULT NULL,\n  `lu_attached_method` enum('primary','empty','mail','password','admin','new','login') DEFAULT NULL,\n  `lu_local_id` int(10) unsigned DEFAULT NULL,\n  `lu_global_id` int(10) unsigned DEFAULT NULL,\n  PRIMARY KEY (`lu_wiki`,`lu_name`),\n  KEY `lu_name` (`lu_name`,`lu_wiki`)\n) ENGINE=InnoDB DEFAULT CHARSET=binary
dbstore2002
localuser	CREATE TABLE `localuser` (\n  `lu_wiki` varbinary(255) NOT NULL DEFAULT '',\n  `lu_name` varbinary(255) NOT NULL DEFAULT '',\n  `lu_attached_timestamp` varbinary(14) DEFAULT NULL,\n  `lu_attached_method` enum('primary','empty','mail','password','admin','new','login') DEFAULT NULL,\n  `lu_local_id` int(10) unsigned DEFAULT NULL,\n  `lu_global_id` int(10) unsigned DEFAULT NULL,\n  PRIMARY KEY (`lu_wiki`,`lu_name`),\n  KEY `lu_name` (`lu_name`,`lu_wiki`)\n) ENGINE=InnoDB DEFAULT CHARSET=binary
db2040
localuser	CREATE TABLE `localuser` (\n  `lu_wiki` varbinary(255) NOT NULL DEFAULT '',\n  `lu_name` varbinary(255) NOT NULL DEFAULT '',\n  `lu_attached_timestamp` varbinary(14) DEFAULT NULL,\n  `lu_attached_method` enum('primary','empty','mail','password','admin','new','login') DEFAULT NULL,\n  `lu_local_id` int(10) unsigned DEFAULT NULL,\n  `lu_global_id` int(10) unsigned DEFAULT NULL,\n  PRIMARY KEY (`lu_wiki`,`lu_name`),\n  KEY `lu_name` (`lu_name`,`lu_wiki`)\n) ENGINE=InnoDB DEFAULT CHARSET=binary
db2047
localuser	CREATE TABLE `localuser` (\n  `lu_wiki` varbinary(255) NOT NULL DEFAULT '',\n  `lu_name` varbinary(255) NOT NULL DEFAULT '',\n  `lu_attached_timestamp` varbinary(14) DEFAULT NULL,\n  `lu_attached_method` enum('primary','empty','mail','password','admin','new','login') DEFAULT NULL,\n  `lu_local_id` int(10) unsigned DEFAULT NULL,\n  `lu_global_id` int(10) unsigned DEFAULT NULL,\n  PRIMARY KEY (`lu_wiki`,`lu_name`),\n  KEY `lu_name` (`lu_name`,`lu_wiki`)\n) ENGINE=InnoDB DEFAULT CHARSET=binary
db2054
localuser	CREATE TABLE `localuser` (\n  `lu_wiki` varbinary(255) NOT NULL DEFAULT '',\n  `lu_name` varbinary(255) NOT NULL DEFAULT '',\n  `lu_attached_timestamp` varbinary(14) DEFAULT NULL,\n  `lu_attached_method` enum('primary','empty','mail','password','admin','new','login') DEFAULT NULL,\n  `lu_local_id` int(10) unsigned DEFAULT NULL,\n  `lu_global_id` int(10) unsigned DEFAULT NULL,\n  PRIMARY KEY (`lu_wiki`,`lu_name`),\n  KEY `lu_name` (`lu_name`,`lu_wiki`)\n) ENGINE=InnoDB DEFAULT CHARSET=binary
db2061
localuser	CREATE TABLE `localuser` (\n  `lu_wiki` varbinary(255) NOT NULL DEFAULT '',\n  `lu_name` varbinary(255) NOT NULL DEFAULT '',\n  `lu_attached_timestamp` varbinary(14) DEFAULT NULL,\n  `lu_attached_method` enum('primary','empty','mail','password','admin','new','login') DEFAULT NULL,\n  `lu_local_id` int(10) unsigned DEFAULT NULL,\n  `lu_global_id` int(10) unsigned DEFAULT NULL,\n  PRIMARY KEY (`lu_wiki`,`lu_name`),\n  KEY `lu_name` (`lu_name`,`lu_wiki`)\n) ENGINE=InnoDB DEFAULT CHARSET=binary
db2068
localuser	CREATE TABLE `localuser` (\n  `lu_wiki` varbinary(255) NOT NULL DEFAULT '',\n  `lu_name` varbinary(255) NOT NULL DEFAULT '',\n  `lu_attached_timestamp` varbinary(14) DEFAULT NULL,\n  `lu_attached_method` enum('primary','empty','mail','password','admin','new','login') DEFAULT NULL,\n  `lu_local_id` int(10) unsigned DEFAULT NULL,\n  `lu_global_id` int(10) unsigned DEFAULT NULL,\n  PRIMARY KEY (`lu_wiki`,`lu_name`),\n  KEY `lu_name` (`lu_name`,`lu_wiki`)\n) ENGINE=InnoDB DEFAULT CHARSET=binary
db2029
localuser	CREATE TABLE `localuser` (\n  `lu_wiki` varbinary(255) NOT NULL DEFAULT '',\n  `lu_name` varbinary(255) NOT NULL DEFAULT '',\n  `lu_attached_timestamp` varbinary(14) DEFAULT NULL,\n  `lu_attached_method` enum('primary','empty','mail','password','admin','new','login') DEFAULT NULL,\n  `lu_local_id` int(10) unsigned DEFAULT NULL,\n  `lu_global_id` int(10) unsigned DEFAULT NULL,\n  PRIMARY KEY (`lu_wiki`,`lu_name`),\n  KEY `lu_name` (`lu_name`,`lu_wiki`)\n) ENGINE=InnoDB DEFAULT CHARSET=binary

Mentioned in SAL (#wikimedia-operations) [2016-09-14T08:43:27Z] <marostegui> alter localuser table in db2047 - T141951

Mentioned in SAL (#wikimedia-operations) [2016-09-14T10:03:36Z] <marostegui> alter localuser table in db2054 - T141951

After getting it done in codfw I will start with eqiad hosts, this is the list:

dbstore1001.eqiad.wmnet - done
dbstore1002.eqiad.wmnet - done
labsdb1001.eqiad.wmnet - done
labsdb1003.eqiad.wmnet - done
db1069.eqiad.wmnet (port 3317 - s7) - done
db1028.eqiad.wmnet - done
db1033.eqiad.wmnet - done 
db1034.eqiad.wmnet - done
db1039.eqiad.wmnet - done
db1062.eqiad.wmnet - done
db1079.eqiad.wmnet - done
db1086.eqiad.wmnet - done
db1094.eqiad.wmnet - done
db1041.eqiad.wmnet - done

Mentioned in SAL (#wikimedia-operations) [2016-09-15T08:11:16Z] <marostegui> altering tables in S7 - eqiad hosts - T141951

Mentioned in SAL (#wikimedia-operations) [2016-09-19T11:13:06Z] <marostegui@tin> Synchronized wmf-config/db-eqiad.php: Temporarily depool db1062 and repool db1034, in order to be able to ALTER a large table. T141951 (duration: 00m 48s)

Change 311685 had a related patch set uploaded (by Marostegui):
db-eqiad.php: Temporarily depool db1086.

https://gerrit.wikimedia.org/r/311685

Change 311685 merged by jenkins-bot:
db-eqiad.php: Temporarily depool db1086.

https://gerrit.wikimedia.org/r/311685

Mentioned in SAL (#wikimedia-operations) [2016-09-21T08:29:05Z] <marostegui@tin> Synchronized wmf-config/db-eqiad.php: Depooling db1086 for an alter table - T141951 (duration: 00m 49s)

Change 311942 had a related patch set uploaded (by Marostegui):
db-eqiad.php: Repool db1086 after the ALTER table

https://gerrit.wikimedia.org/r/311942

Change 311942 merged by jenkins-bot:
db-eqiad.php: Repool db1086 after the ALTER table

https://gerrit.wikimedia.org/r/311942

Mentioned in SAL (#wikimedia-operations) [2016-09-21T11:16:27Z] <marostegui@tin> Synchronized wmf-config/db-eqiad.php: Repool db1086 after the ALTER table - T141951 (duration: 00m 47s)

Change 311968 had a related patch set uploaded (by Marostegui):
db-eqiad.php: Temporarily depool db1094 for ALTER

https://gerrit.wikimedia.org/r/311968

Change 311968 merged by jenkins-bot:
db-eqiad.php: Temporarily depool db1094 for ALTER

https://gerrit.wikimedia.org/r/311968

Mentioned in SAL (#wikimedia-operations) [2016-09-21T12:09:20Z] <marostegui@tin> Synchronized wmf-config/db-eqiad.php: Depool db1094 for an ALTER table - T141951 (duration: 00m 47s)

Change 311971 had a related patch set uploaded (by Marostegui):
db-eqiad.php: Repool db1094 after the ALTER

https://gerrit.wikimedia.org/r/311971

Change 311971 merged by jenkins-bot:
db-eqiad.php: Repool db1094 after the ALTER

https://gerrit.wikimedia.org/r/311971

Mentioned in SAL (#wikimedia-operations) [2016-09-21T12:30:35Z] <marostegui@tin> Synchronized wmf-config/db-eqiad.php: Repool db1094 after the ALTER table - T141951 (duration: 00m 47s)

Only the master is pending now:

dbstore1001.eqiad.wmnet - done
dbstore1002.eqiad.wmnet - done
labsdb1001.eqiad.wmnet - done
labsdb1003.eqiad.wmnet - done
db1069.eqiad.wmnet (port 3317 - s7) - done
db1028.eqiad.wmnet - done
db1033.eqiad.wmnet - done 
db1034.eqiad.wmnet - done
db1039.eqiad.wmnet - done
db1062.eqiad.wmnet - done
db1079.eqiad.wmnet - done
db1086.eqiad.wmnet - done
db1094.eqiad.wmnet - done
db1041.eqiad.wmnet

Mentioned in SAL (#wikimedia-operations) [2016-09-22T08:49:16Z] <marostegui> Deploying schema change on S7 master - T141951

This ALTER has been completed in the whole shard (except dbstore2001 which is broken and will be reimaged soon - T146261:

for i in `cat s7.hosts | awk -F " " '{print $1}' | grep -v db1069`; do echo $i; mysql -h$i  centralauth -e "show create table localuser;" | egrep "lu_local_id|lu_global_id" | wc -l;done
dbstore2001.codfw.wmnet
ERROR 1932 (42S02) at line 1: Table 'centralauth.localuser' doesn't exist in engine
0
dbstore2002.codfw.wmnet
1
db2040.codfw.wmnet
1
db2047.codfw.wmnet
1
db2054.codfw.wmnet
1
db2061.codfw.wmnet
1
db2068.codfw.wmnet
1
db2029.codfw.wmnet
1
dbstore1001.eqiad.wmnet
1
dbstore1002.eqiad.wmnet
1
labsdb1001.eqiad.wmnet
1
labsdb1003.eqiad.wmnet
1
db1028.eqiad.wmnet
1
db1033.eqiad.wmnet
1
db1034.eqiad.wmnet
1
db1039.eqiad.wmnet
1
db1062.eqiad.wmnet
1
db1079.eqiad.wmnet
1
db1086.eqiad.wmnet
1
db1094.eqiad.wmnet
1
db1041.eqiad.wmnet
1


root@neodymium:/home/marostegui/git/software/dbtools# mysql -hdb1069 -P3317 centralauth -e "show create table localuser;" | egrep "lu_local_id|lu_global_id" | wc -l
1

@kaldari, @Niharika can you guys let me know if you need something else or this ticket can be closed?
Thanks