Drop now redundant indexes from pagelinks and templatelinks
Open, Needs TriagePublic

Description

So... After 1.30.0-wmf.17 is everywhere, our DBA can drop the other two primary indexes that were blocked on a code change

ALTER TABLE /*_*/pagelinks DROP INDEX /*i*/pl_from;
ALTER TABLE /*_*/templatelinks DROP INDEX /*i*/tl_from;
jcrespo created this task.Tue, Aug 29, 9:39 PM
Restricted Application added a subscriber: Aklapper. · View Herald TranscriptTue, Aug 29, 9:39 PM
jcrespo moved this task from Triage to Next on the DBA board.Tue, Aug 29, 10:13 PM

Change 375953 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/mediawiki-config@master] db-eqiad.php: Depool db1089

https://gerrit.wikimedia.org/r/375953

Change 375953 merged by jenkins-bot:
[operations/mediawiki-config@master] db-eqiad.php: Depool db1089

https://gerrit.wikimedia.org/r/375953

Mentioned in SAL (#wikimedia-operations) [2017-09-05T06:09:39Z] <marostegui@tin> Synchronized wmf-config/db-eqiad.php: Depool db1089 - T174509 (duration: 00m 55s)

I just realised that db1089 (enwiki) does not have those indexes.
db1083 does.

The table size differences are as follows:

no indexes:

root@db1089:/srv/sqldata/enwiki# ls -lh templatelinks.ibd pagelinks.ibd
-rw-rw---- 1 mysql mysql 209G Sep  5 06:21 pagelinks.ibd
-rw-rw---- 1 mysql mysql 125G Sep  5 06:21 templatelinks.ibd

Indexes:

root@db1083:/srv/sqldata/enwiki#  ls -lh templatelinks.ibd pagelinks.ibd
-rw-rw---- 1 mysql mysql 255G Sep  5 06:21 pagelinks.ibd
-rw-rw---- 1 mysql mysql 147G Sep  5 06:21 templatelinks.ibd

I am going to remove the indexes from db1083 and check the sizes without defragmenting and then, defragment - also this operation is supposed to be online. I will depool db1083 anyways, but will check if it lags (it should not),

Mentioned in SAL (#wikimedia-operations) [2017-09-05T06:25:25Z] <marostegui@tin> Synchronized wmf-config/db-eqiad.php: Repool db1089 - T174509 (duration: 00m 46s)

Change 375958 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/mediawiki-config@master] db-eqiad.php: Depool db1083

https://gerrit.wikimedia.org/r/375958

Change 375958 merged by jenkins-bot:
[operations/mediawiki-config@master] db-eqiad.php: Depool db1083

https://gerrit.wikimedia.org/r/375958

Mentioned in SAL (#wikimedia-operations) [2017-09-05T06:29:54Z] <marostegui@tin> Synchronized wmf-config/db-eqiad.php: Depool db1083 - T174509 (duration: 00m 46s)

Mentioned in SAL (#wikimedia-operations) [2017-09-05T06:32:07Z] <marostegui> Deploy alter table on db1083 - T174509

The ALTER itself takes seconds to run (less than 10 seconds on a SSD host).

root@PRODUCTION s1 slave[enwiki]> show create table pagelinks\G show create table templatelinks\G
*************************** 1. row ***************************
       Table: pagelinks
Create Table: CREATE TABLE `pagelinks` (
  `pl_from` int(8) unsigned NOT NULL DEFAULT '0',
  `pl_namespace` int(11) NOT NULL DEFAULT '0',
  `pl_title` varbinary(255) NOT NULL DEFAULT '',
  `pl_from_namespace` int(11) NOT NULL DEFAULT '0',
  PRIMARY KEY (`pl_from`,`pl_namespace`,`pl_title`),
  KEY `pl_namespace` (`pl_namespace`,`pl_title`,`pl_from`),
  KEY `pl_backlinks_namespace` (`pl_from_namespace`,`pl_namespace`,`pl_title`,`pl_from`)
) ENGINE=InnoDB DEFAULT CHARSET=binary
1 row in set (0.00 sec)

*************************** 1. row ***************************
       Table: templatelinks
Create Table: CREATE TABLE `templatelinks` (
  `tl_from` int(8) unsigned NOT NULL DEFAULT '0',
  `tl_namespace` int(11) NOT NULL DEFAULT '0',
  `tl_title` varbinary(255) NOT NULL DEFAULT '',
  `tl_from_namespace` int(11) NOT NULL DEFAULT '0',
  PRIMARY KEY (`tl_from`,`tl_namespace`,`tl_title`),
  KEY `tl_namespace` (`tl_namespace`,`tl_title`,`tl_from`),
  KEY `tl_backlinks_namespace` (`tl_from_namespace`,`tl_namespace`,`tl_title`,`tl_from`)
) ENGINE=InnoDB DEFAULT CHARSET=binary
1 row in set (0.00 sec)

However the space isn't freed up, so we do need to defragment:

root@db1083:/srv/sqldata/enwiki# ls -lh templatelinks.ibd pagelinks.ibd
-rw-rw---- 1 mysql mysql 255G Sep  5 06:33 pagelinks.ibd
-rw-rw---- 1 mysql mysql 147G Sep  5 06:33 templatelinks.ibd

I am defragmenting now. Will report back

This is template links after the defragmentation:

root@db1083:/srv/sqldata/enwiki# ls -lh templatelinks.ibd
-rw-rw---- 1 mysql mysql 85G Sep  5 11:33 templatelinks.ibd

Almost half the size.

Almost half the size.

Probably 30 GB comes from the index like in the above comparions, the rest from degragmenting/compression, if any (it is normal to gain a lot now and after a while it cancels itself a bit). However, in any case it is probably worth it doing it on all hosts, with low priority.

Maybe we should do a quick "index deletion" online and then track here the defragmention?

Maybe we should do a quick "index deletion" online and then track here the defragmention?

You mean drop the indexes quickly and leave the defragmentation for later?

You mean drop the indexes quickly and leave the defragmentation for later?

Yeah, thinking of making the schema compatible ASAP in a logical way, without depooling (so that if there is some kind of reformatting, or logical dumps, or anything, it is already done), but doing the slow and dangerous part later. Of course, all of that after we have pooled and tested nothing is broken.

Marostegui added a comment.EditedTue, Sep 5, 11:44 AM

Sounds good to me. I will finish db1083 as the defgramentation is already started on pagelinks.

Let's leave db1083 and db1089 for a few days and then continue.
Although, db1089 does not have the indexes there when I checked, so I assume it has been like that for weeks - so if nothing broke for weeks it is probably safe to go ahead and drop it from everywhere.
But let's wait for db1083 for a few days.

Actually, we could also optimize the table without depooling as it is an INPLACE operation (and I actually remember optimizing a big table not long ago without any lag): https://dev.mysql.com/doc/refman/5.6/en/innodb-create-index-overview.html
We can double check anyways.

And pagelinks:

root@db1083:/srv/sqldata/enwiki# ls -lh pagelinks.ibd
-rw-rw---- 1 mysql mysql 146G Sep  5 15:29 pagelinks.ibd

Definitely worth defragmenting

Mentioned in SAL (#wikimedia-operations) [2017-09-05T15:35:19Z] <marostegui@tin> Synchronized wmf-config/db-eqiad.php: Repool db1083 - T174509 (duration: 00m 47s)