Page MenuHomePhabricator

logging.log_title_time and logging.log_title_type_time indexes are not on tables.sql but they exist on most of the wikis
Closed, ResolvedPublic

Description

As part of T51199 I found:

These indexes exist on enwiki, commons etc but they are not on tables.sql:

KEY `log_title_time` (`log_title`(16),`log_timestamp`),
KEY `log_title_type_time` (`log_title`(16),`log_type`,`log_timestamp`),

They look to be used, at least they are not being reported as unused on sys.schema_unused_indexes.
The logging table has many drifts from production and tables.sql already T51199: Add index log_type_action, T71127: Discrepancies with logging table on different wikis.

I have created this ticket to have a discussion about those two indexes, my first approach would be to be on the safe side and:

  1. Add those two indexes to tables.sql (as they are existing on most of the wikis already) as part of this ticket
  2. Fix the existing drifts (which have bitten us already) (the existing two tickets mentioned above)

Event Timeline

Marostegui triaged this task as Normal priority.Mar 1 2019, 9:36 AM
Marostegui created this task.

I was going to mention this when you started working on the GLTU (Grand log table unification), there maybe indexes added that may or may not be neccesary, that may have been added because they were needed but are totally undocumented. The best way to test it is to pool a server with the official indexes with low load and check for performance.

Yes, after I created this task I saw they are being reported as used on enwiki, commons but one of them is NOT being used on wikidata makes me think that we might not need them in the end or maybe not both, it is all a bit messed up.
I will report back! :)

Anomie added a comment.Mar 1 2019, 3:20 PM

As I mentioned at T51199#4969053, the log_title(16) at the start of the index seems pretty strange to me. I'd think queries that would use log_title_time would probably work as well or better using page_time (which is (`log_namespace`,`log_title`,`log_timestamp`)), and queries using log_title_type_time would probably work as well or better using an index on (log_namespace, log_title, log_type, log_timestamp). But without knowing what those queries actually are, I can't say for sure.

But if we really want to add them to tables.sql as they exist just to be safe, we could do that:

diff --git a/maintenance/tables.sql b/maintenance/tables.sql
index dac222ffcd..028719aadc 100644
--- a/maintenance/tables.sql
+++ b/maintenance/tables.sql
@@ -1651,6 +1651,9 @@ CREATE INDEX /*i*/log_user_text_type_time ON /*_*/logging (log_user_text, log_ty
 -- Special:Log filter by anonymous performer
 CREATE INDEX /*i*/log_user_text_time ON /*_*/logging (log_user_text, log_timestamp);
+
+-- These are on Wikimedia wikis, and are reported as used, but we have no idea by what (T217397).
+CREATE INDEX /*i*/log_title_time ON /*_*/logging (log_title(16), log_timestamp);
+CREATE INDEX /*i*/log_title_type_time ON /*_*/logging (log_title(16), log_type, log_timestamp);
 
 
 CREATE TABLE /*_*/log_search (
   -- The type of ID (rev ID, log ID, rev timestamp, username)

(plus something appropriate in maintenance/archives/, and similar additions for PostgreSQL, MSSQL, and Oracle).

Thanks @Anomie!
Let's see what the test reveals on a slave in enwiki and commonswiki.
I will alter a main traffic host on enwiki and a rc host on commonswiki (as the table for the rc hosts has a different PK + partitioning), just to be safe.

We also have some queries that will still be slow even once we have left everything matching tables.sql (T71222#4456488)

Change 494171 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/mediawiki-config@master] db-eqiad.php: Depool db1089

https://gerrit.wikimedia.org/r/494171

Change 494171 merged by jenkins-bot:
[operations/mediawiki-config@master] db-eqiad.php: Depool db1089

https://gerrit.wikimedia.org/r/494171

Mentioned in SAL (#wikimedia-operations) [2019-03-04T08:14:00Z] <marostegui@deploy1001> Synchronized wmf-config/db-eqiad.php: Depool db1089 - T217397 (duration: 00m 49s)

Mentioned in SAL (#wikimedia-operations) [2019-03-04T08:29:28Z] <marostegui> Change logging indexes on db1089 to leave the indexes exactly like the ones on tables.sql - T217397

I have run the following on db1089:

root@db1089.eqiad.wmnet[enwiki]> alter table logging drop key log_title_type_time, drop key log_title_time, add key log_type_action (log_type, log_action, log_timestamp), add key log_user_text_type_time (log_user_text, log_type, log_timestamp);
Query OK, 0 rows affected (11 min 38.31 sec)

Its enwiki.logging table looks like this now:

| logging | CREATE TABLE `logging` (
  `log_id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `log_type` varbinary(32) NOT NULL DEFAULT '',
  `log_action` varbinary(32) NOT NULL DEFAULT '',
  `log_timestamp` varbinary(14) NOT NULL DEFAULT '19700101000000',
  `log_user` int(10) unsigned NOT NULL DEFAULT '0',
  `log_namespace` int(11) NOT NULL DEFAULT '0',
  `log_title` varbinary(255) NOT NULL DEFAULT '',
  `log_comment` varbinary(255) NOT NULL DEFAULT '',
  `log_comment_id` bigint(20) unsigned NOT NULL DEFAULT '0',
  `log_params` blob NOT NULL,
  `log_deleted` tinyint(3) unsigned NOT NULL DEFAULT '0',
  `log_user_text` varbinary(255) NOT NULL DEFAULT '',
  `log_actor` bigint(20) unsigned NOT NULL DEFAULT '0',
  `log_page` int(10) unsigned DEFAULT NULL,
  PRIMARY KEY (`log_id`),
  KEY `type_time` (`log_type`,`log_timestamp`),
  KEY `user_time` (`log_user`,`log_timestamp`),
  KEY `page_time` (`log_namespace`,`log_title`,`log_timestamp`),
  KEY `times` (`log_timestamp`),
  KEY `log_page_id_time` (`log_page`,`log_timestamp`),
  KEY `log_user_text_time` (`log_user_text`(16),`log_timestamp`),
  KEY `log_user_type_time` (`log_user`,`log_type`,`log_timestamp`),
  KEY `actor_time` (`log_actor`,`log_timestamp`),
  KEY `log_actor_type_time` (`log_actor`,`log_type`,`log_timestamp`),
  KEY `log_type_action` (`log_type`,`log_action`,`log_timestamp`),
  KEY `log_user_text_type_time` (`log_user_text`,`log_type`,`log_timestamp`)
) ENGINE=InnoDB AUTO_INCREMENT=97452231 DEFAULT CHARSET=binary

Which is exactly how the table looks like on tables.sql.
I am going to repool it with low weight and monitor its performance and check if there are slow queries showing up for this host on tendril.

Change 494184 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/mediawiki-config@master] db-eqiad.php: Repool db1089 with low weight

https://gerrit.wikimedia.org/r/494184

Change 494184 merged by jenkins-bot:
[operations/mediawiki-config@master] db-eqiad.php: Repool db1089 with low weight

https://gerrit.wikimedia.org/r/494184

Change 494221 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/mediawiki-config@master] db-eqiad.php: Depool db1097:3314,db1100

https://gerrit.wikimedia.org/r/494221

Change 494221 merged by jenkins-bot:
[operations/mediawiki-config@master] db-eqiad.php: Depool db1097:3314,db1100

https://gerrit.wikimedia.org/r/494221

Mentioned in SAL (#wikimedia-operations) [2019-03-04T14:20:01Z] <marostegui> Change indexes on logging table on db1100 (s5) and db1097:3314 (commonswiki) - T217397

I have also altered db1100 (s5, containing dewiki) so its logging table looks like the one on tables.sql:

mysql.py -hdb1100 dewiki -e "show create table logging\G"
*************************** 1. row ***************************
       Table: logging
Create Table: CREATE TABLE `logging` (
  `log_id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `log_type` varbinary(32) NOT NULL DEFAULT '',
  `log_action` varbinary(32) NOT NULL DEFAULT '',
  `log_timestamp` varbinary(14) NOT NULL DEFAULT '19700101000000',
  `log_user` int(10) unsigned NOT NULL DEFAULT '0',
  `log_namespace` int(11) NOT NULL DEFAULT '0',
  `log_title` varbinary(255) NOT NULL DEFAULT '',
  `log_comment` varbinary(255) NOT NULL DEFAULT '',
  `log_comment_id` bigint(20) unsigned NOT NULL DEFAULT '0',
  `log_params` blob NOT NULL,
  `log_deleted` tinyint(3) unsigned NOT NULL DEFAULT '0',
  `log_user_text` varbinary(255) NOT NULL DEFAULT '',
  `log_actor` bigint(20) unsigned NOT NULL DEFAULT '0',
  `log_page` int(10) unsigned DEFAULT NULL,
  PRIMARY KEY (`log_id`),
  KEY `type_time` (`log_type`,`log_timestamp`),
  KEY `user_time` (`log_user`,`log_timestamp`),
  KEY `page_time` (`log_namespace`,`log_title`,`log_timestamp`),
  KEY `times` (`log_timestamp`),
  KEY `log_page_id_time` (`log_page`,`log_timestamp`),
  KEY `log_user_text_time` (`log_user_text`(16),`log_timestamp`),
  KEY `log_user_type_time` (`log_user`,`log_type`,`log_timestamp`),
  KEY `actor_time` (`log_actor`,`log_timestamp`),
  KEY `log_actor_type_time` (`log_actor`,`log_type`,`log_timestamp`),
  KEY `log_type_action` (`log_type`,`log_action`,`log_timestamp`),
  KEY `log_user_text_type_time` (`log_user_text`,`log_type`,`log_timestamp`)
) ENGINE=InnoDB AUTO_INCREMENT=115073236 DEFAULT CHARSET=binary

Same for db1097:3314 which is a special slave (partitioned) on commonswiki:

mysql.py -hdb1097:3314 commonswiki -e "show create table logging\G"
*************************** 1. row ***************************
       Table: logging
Create Table: CREATE TABLE `logging` (
  `log_id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `log_type` varbinary(32) NOT NULL DEFAULT '',
  `log_action` varbinary(32) NOT NULL DEFAULT '',
  `log_timestamp` varbinary(14) NOT NULL DEFAULT '19700101000000',
  `log_user` int(10) unsigned NOT NULL DEFAULT '0',
  `log_namespace` int(11) NOT NULL DEFAULT '0',
  `log_title` varbinary(255) NOT NULL DEFAULT '',
  `log_comment` varbinary(255) NOT NULL DEFAULT '',
  `log_comment_id` bigint(20) unsigned NOT NULL DEFAULT '0',
  `log_params` blob NOT NULL,
  `log_deleted` tinyint(3) unsigned NOT NULL DEFAULT '0',
  `log_user_text` varbinary(255) NOT NULL DEFAULT '',
  `log_actor` bigint(20) unsigned NOT NULL DEFAULT '0',
  `log_page` int(10) unsigned DEFAULT NULL,
  PRIMARY KEY (`log_id`,`log_user`),
  KEY `type_time` (`log_type`,`log_timestamp`),
  KEY `user_time` (`log_user`,`log_timestamp`),
  KEY `page_time` (`log_namespace`,`log_title`,`log_timestamp`),
  KEY `times` (`log_timestamp`),
  KEY `log_page_id_time` (`log_page`,`log_timestamp`),
  KEY `log_user_text_time` (`log_user_text`(16),`log_timestamp`),
  KEY `log_user_type_time` (`log_user`,`log_type`,`log_timestamp`),
  KEY `actor_time` (`log_actor`,`log_timestamp`),
  KEY `log_actor_type_time` (`log_actor`,`log_type`,`log_timestamp`),
  KEY `log_type_action` (`log_type`,`log_action`,`log_timestamp`),
  KEY `log_user_text_type_time` (`log_user_text`,`log_type`,`log_timestamp`)
) ENGINE=InnoDB AUTO_INCREMENT=279593245 DEFAULT CHARSET=binary ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=8
/*!50100 PARTITION BY RANGE (log_user)
(PARTITION p1 VALUES LESS THAN (1) ENGINE = InnoDB,
 PARTITION p5000 VALUES LESS THAN (5000) ENGINE = InnoDB,
 PARTITION p10000 VALUES LESS THAN (10000) ENGINE = InnoDB,
 PARTITION p20000 VALUES LESS THAN (20000) ENGINE = InnoDB,
 PARTITION p30000 VALUES LESS THAN (30000) ENGINE = InnoDB,
 PARTITION p40000 VALUES LESS THAN (40000) ENGINE = InnoDB,
 PARTITION p50000 VALUES LESS THAN (50000) ENGINE = InnoDB,
 PARTITION p75000 VALUES LESS THAN (75000) ENGINE = InnoDB,
 PARTITION p100000 VALUES LESS THAN (100000) ENGINE = InnoDB,
 PARTITION p150000 VALUES LESS THAN (150000) ENGINE = InnoDB,
 PARTITION p200000 VALUES LESS THAN (200000) ENGINE = InnoDB,
 PARTITION p250000 VALUES LESS THAN (250000) ENGINE = InnoDB,
 PARTITION p300000 VALUES LESS THAN (300000) ENGINE = InnoDB,
 PARTITION p350000 VALUES LESS THAN (350000) ENGINE = InnoDB,
 PARTITION p400000 VALUES LESS THAN (400000) ENGINE = InnoDB,
 PARTITION p500000 VALUES LESS THAN (500000) ENGINE = InnoDB,
 PARTITION p750000 VALUES LESS THAN (750000) ENGINE = InnoDB,
 PARTITION p1000000 VALUES LESS THAN (1000000) ENGINE = InnoDB,
 PARTITION p1500000 VALUES LESS THAN (1500000) ENGINE = InnoDB,
 PARTITION p2000000 VALUES LESS THAN (2000000) ENGINE = InnoDB,
 PARTITION p3000000 VALUES LESS THAN (3000000) ENGINE = InnoDB,
 PARTITION p4000000 VALUES LESS THAN (4000000) ENGINE = InnoDB,
 PARTITION p5000000 VALUES LESS THAN (5000000) ENGINE = InnoDB,
 PARTITION pMAXVALUE VALUES LESS THAN MAXVALUE ENGINE = InnoDB) */

So now we have 3 hosts we can monitor:
db1089 - enwiki
db1100 (s5, which has dewiki)
db1097:3314 (special partitioned slave on commonswiki)

Anomie added a comment.Mar 4 2019, 3:37 PM
KEY `log_user_text_time` (`log_user_text`(16),`log_timestamp`),

Note tables.sql doesn't have a prefix for the log_user_text field, it has just (log_user_text, log_timestamp). But that may not really matter for this experiment.

A query using this index would probably look something like SELECT * FROM logging WHERE log_user_text = '...' ORDER BY log_timestamp LIMIT 51;.

Chances are this type of query is only used when there's no user ID to compare with log_user, which generally means log_user_text will be an IP address. The prefixed index was probably added before anyone cared about IPv6, as 16 characters is enough to cover any IPv4 address stored in log_user_text. When used with an IPv6 address (or a long interwiki username from imports or the like) it'll have to fetch all matching rows, filter to just the ones with the right log_user_text and filesort. On the other hand, on enwiki the IPv6 address with the most log entries has only 183, so even if it does have to fetch them all and filesort it wouldn't be any kind of a problem now.

KEY `log_user_text_time` (`log_user_text`(16),`log_timestamp`),

Note tables.sql doesn't have a prefix for the log_user_text field, it has just (log_user_text, log_timestamp). But that may not really matter for this experiment.

Yep, I noticed after the whole alter was done, and decided to leave it. However, fixed it on my notes so if we finally see this is fine, I will drop+recreate that index without the prefix so it will look like tables.sql one - thanks!

So far none of those 3 hosts have shown any slow query (compared to the other non altered hosts), so for now, looks like the indexes on tables.sql are fine.
I will keep monitoring till the end of the week.

If that goes fine, I am going to alter more hosts on s5 and s1 on eqiad to keep having a bit more chances to spot something bad.

Marostegui added a comment.EditedMar 8 2019, 8:15 AM

Nothing else reported, so I am going to alter some more hosts to see if we get anything:

s1:
db1080
db1099:3311 (special - partitioned)

s5:
db1110
db1096:3315 (special - partitioned)

Update: the above hosts were altered

Those hosts will be added to the existing altered ones:
s1:
db1089 - enwiki
s5:
db1100 (s5, which has dewiki)
s4:
db1097:3314 (special partitioned slave on commonswiki)

Marostegui closed this task as Resolved.Mar 11 2019, 8:22 AM

So, for a week I have seen no new slow queries show up on those hosts which have the logging table indexes exactly as reflected on tables.sql, so I am going to close this and the result is that there is no need to include the following indexes on tables.sql:

KEY `log_title_time` (`log_title`(16),`log_timestamp`),
KEY `log_title_type_time` (`log_title`(16),`log_type`,`log_timestamp`),

Instead, I am going to unify the table across the servers, as its discrepancies have already bitten us.
Thanks everyone, specially @Anomie!