Page MenuHomePhabricator

Change the abusefilter database index definition
Open, HighPublic

Description

Currently, it's like this:

CREATE TABLE /*$wgDBprefix*/abuse_filter_action (
	afa_filter BIGINT unsigned NOT NULL,
	afa_consequence varchar(255) NOT NULL,
	afa_parameters TINYBLOB NOT NULL,

	PRIMARY KEY (afa_filter,afa_consequence),
	KEY (afa_consequence)
) /*$wgDBTableOptions*/;

The PK and indexes are defined inside CREATE TABLE statement. This is explicitly against coding convention of database ("Create indices as separate statements, do not include them in the table creation query; the separate syntax is clearer and makes it easier to see the difference between unique and non-unique indices. Don't create indices with ALTER TABLE ... ADD INDEX ..., always use CREATE INDEX ... ON ... instead. ") and has been causing lots of issues:

  • It doesn't work in Sqlite, so the extension has a separate file just for sqlite meaning more DB schema to maintain
  • Not having explicit names for the indexes is making checking the drifts pretty hard (T104459: Detect object, schema and data drifts between mediawiki HEAD, production masters and replicas)
  • The name MySQL is automatically giving to the indexes is different from the name sqlite uses causing the drift report (that's based on the Sqlite schema) have more than 1500 drifts. To compare, core with many many more tables has half of these drifts

Event Timeline

Reedy renamed this task from Change the absuerfilter database index defition to Change the abusefilter database index definition.May 1 2020, 4:16 PM

IIRC, Jaime told me something about this a few months ago, but I forgot to follow up on that. I'll do it now.

Change 593906 had a related patch set uploaded (by Daimona Eaytoy; owner: Daimona Eaytoy):
[mediawiki/extensions/AbuseFilter@master] Give MySQL indexes explicit names, create them outside of CREATE TABLE

https://gerrit.wikimedia.org/r/593906

I think the patch above should be enough for indexes. Merging the MySQL schema with SQLite's is not currently possible, I think, because of incompatible types. In fact, the SQLite schema file used to be a copy of the MySQL file, until someone noticed that it didn't work at all: https://gerrit.wikimedia.org/r/#/c/mediawiki/extensions/AbuseFilter/+/432963/ (this is also the "example 2" link in the task description).

I think the patch above should be enough for indexes. Merging the MySQL schema with SQLite's is not currently possible, I think, because of incompatible types. In fact, the SQLite schema file used to be a copy of the MySQL file, until someone noticed that it didn't work at all: https://gerrit.wikimedia.org/r/#/c/mediawiki/extensions/AbuseFilter/+/432963/ (this is also the "example 2" link in the task description).

I wonder why we can't just delete the sqlite specific file and rely on the MW SQL munging... It works fine for MW core etc

I think the patch above should be enough for indexes. Merging the MySQL schema with SQLite's is not currently possible, I think, because of incompatible types. In fact, the SQLite schema file used to be a copy of the MySQL file, until someone noticed that it didn't work at all: https://gerrit.wikimedia.org/r/#/c/mediawiki/extensions/AbuseFilter/+/432963/ (this is also the "example 2" link in the task description).

I wonder why we can't just delete the sqlite specific file and rely on the MW SQL munging... It works fine for MW core etc

If it works, I'd be happy to do that. Perhaps it didn't work when the SQLite issues were reported?

I think the patch above should be enough for indexes. Merging the MySQL schema with SQLite's is not currently possible, I think, because of incompatible types. In fact, the SQLite schema file used to be a copy of the MySQL file, until someone noticed that it didn't work at all: https://gerrit.wikimedia.org/r/#/c/mediawiki/extensions/AbuseFilter/+/432963/ (this is also the "example 2" link in the task description).

I wonder why we can't just delete the sqlite specific file and rely on the MW SQL munging... It works fine for MW core etc

If it works, I'd be happy to do that. Perhaps it didn't work when the SQLite issues were reported?

Potentially... Should be easy enough to test :)

Potentially... Should be easy enough to test :)

Hmmm I don't have SQLite installed, and too lazy to install it right now. Do we have any CI job running on SQLite? I seem to recall we had at least one...

There's a CI test in core which gets ran in gate-and-submit but I don't know if it's being run in Abusefilter

Change 593912 had a related patch set uploaded (by Reedy; owner: Reedy):
[mediawiki/extensions/AbuseFilter@master] Remove SQLite specific files

https://gerrit.wikimedia.org/r/593912

^ One way to find out. Though, that really needs to be ontop of Daimona's patch to have any chance of passing... :D

It looks like db_patches/patch-fix-indexes.sql is doing the reverse of the request of this task

It looks like db_patches/patch-fix-indexes.sql is doing the reverse of the request of this task

I think it actually tried to do the same as this request, at least for MySQL.

It looks like db_patches/patch-fix-indexes.sql is doing the reverse of the request of this task

I think it actually tried to do the same as this request, at least for MySQL.

It's making all the indexes drop the afl_ prefixes...

It looks like db_patches/patch-fix-indexes.sql is doing the reverse of the request of this task

I think it actually tried to do the same as this request, at least for MySQL.

It's making all the indexes drop the afl_ prefixes...

Ah, got your point. While that's true, the intent of the patch was to use explicit names. It did miss the 'afl_' prefix, but I guess that's something which was decided afterwards.

For reference.... enwiki!

MariaDB [enwiki]> show create table abuse_filter\G
*************************** 1. row ***************************
       Table: abuse_filter
Create Table: CREATE TABLE `abuse_filter` (
  `af_id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
  `af_pattern` blob NOT NULL,
  `af_user` bigint(20) unsigned NOT NULL DEFAULT '0',
  `af_user_text` varbinary(255) NOT NULL DEFAULT '',
  `af_timestamp` varbinary(14) NOT NULL DEFAULT '',
  `af_enabled` tinyint(1) NOT NULL DEFAULT '1',
  `af_comments` blob,
  `af_public_comments` tinyblob,
  `af_hidden` tinyint(1) NOT NULL DEFAULT '0',
  `af_hit_count` bigint(20) NOT NULL DEFAULT '0',
  `af_throttled` tinyint(1) NOT NULL DEFAULT '0',
  `af_deleted` tinyint(1) NOT NULL DEFAULT '0',
  `af_actions` varbinary(255) NOT NULL DEFAULT '',
  `af_global` tinyint(1) NOT NULL DEFAULT '0',
  `af_group` varbinary(64) NOT NULL DEFAULT 'default',
  PRIMARY KEY (`af_id`),
  KEY `af_user` (`af_user`),
  KEY `af_group` (`af_group`,`af_enabled`,`af_id`)
) ENGINE=InnoDB AUTO_INCREMENT=1054 DEFAULT CHARSET=binary
1 row in set (0.00 sec)

MariaDB [enwiki]> show create table abuse_filter_action\G
*************************** 1. row ***************************
       Table: abuse_filter_action
Create Table: CREATE TABLE `abuse_filter_action` (
  `afa_filter` bigint(20) unsigned NOT NULL DEFAULT '0',
  `afa_consequence` varbinary(255) NOT NULL DEFAULT '',
  `afa_parameters` tinyblob NOT NULL,
  PRIMARY KEY (`afa_filter`,`afa_consequence`),
  KEY `afa_consequence` (`afa_consequence`)
) ENGINE=InnoDB DEFAULT CHARSET=binary
1 row in set (0.00 sec)

MariaDB [enwiki]> show create table abuse_filter_history\G
*************************** 1. row ***************************
       Table: abuse_filter_history
Create Table: CREATE TABLE `abuse_filter_history` (
  `afh_id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
  `afh_filter` bigint(20) unsigned NOT NULL DEFAULT '0',
  `afh_user` bigint(20) unsigned NOT NULL DEFAULT '0',
  `afh_user_text` varbinary(255) NOT NULL DEFAULT '',
  `afh_timestamp` varbinary(14) NOT NULL DEFAULT '',
  `afh_pattern` blob NOT NULL,
  `afh_comments` blob NOT NULL,
  `afh_flags` tinyblob NOT NULL,
  `afh_public_comments` tinyblob,
  `afh_actions` blob,
  `afh_deleted` tinyint(1) NOT NULL DEFAULT '0',
  `afh_changed_fields` varbinary(255) NOT NULL DEFAULT '',
  `afh_group` varbinary(64) DEFAULT NULL,
  PRIMARY KEY (`afh_id`),
  KEY `afh_filter` (`afh_filter`),
  KEY `afh_user` (`afh_user`),
  KEY `afh_user_text` (`afh_user_text`),
  KEY `afh_timestamp` (`afh_timestamp`)
) ENGINE=InnoDB AUTO_INCREMENT=23438 DEFAULT CHARSET=binary
1 row in set (0.00 sec)

MariaDB [enwiki]> show create table abuse_filter_log\G
*************************** 1. row ***************************
       Table: abuse_filter_log
Create Table: CREATE TABLE `abuse_filter_log` (
  `afl_id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
  `afl_filter` varbinary(64) NOT NULL DEFAULT '',
  `afl_global` tinyint(1) NOT NULL DEFAULT '0',
  `afl_filter_id` bigint(20) unsigned NOT NULL DEFAULT '0',
  `afl_user` bigint(20) unsigned NOT NULL DEFAULT '0',
  `afl_user_text` varbinary(255) NOT NULL DEFAULT '',
  `afl_ip` varbinary(255) NOT NULL DEFAULT '',
  `afl_action` varbinary(255) NOT NULL DEFAULT '',
  `afl_actions` varbinary(255) NOT NULL DEFAULT '',
  `afl_var_dump` blob NOT NULL,
  `afl_timestamp` varbinary(14) NOT NULL DEFAULT '',
  `afl_namespace` int(11) NOT NULL,
  `afl_title` varbinary(255) NOT NULL DEFAULT '',
  `afl_wiki` varbinary(64) DEFAULT NULL,
  `afl_deleted` tinyint(1) NOT NULL DEFAULT '0',
  `afl_patrolled_by` int(10) unsigned NOT NULL DEFAULT '0',
  `afl_rev_id` int(10) unsigned DEFAULT NULL,
  PRIMARY KEY (`afl_id`),
  KEY `afl_timestamp` (`afl_timestamp`),
  KEY `afl_rev_id` (`afl_rev_id`),
  KEY `user_timestamp` (`afl_user`,`afl_user_text`,`afl_timestamp`),
  KEY `filter_timestamp` (`afl_filter`,`afl_timestamp`),
  KEY `page_timestamp` (`afl_namespace`,`afl_title`,`afl_timestamp`),
  KEY `ip_timestamp` (`afl_ip`,`afl_timestamp`),
  KEY `wiki_timestamp` (`afl_wiki`,`afl_timestamp`),
  KEY `filter_timestamp_full` (`afl_global`,`afl_filter_id`,`afl_timestamp`)
) ENGINE=InnoDB AUTO_INCREMENT=26639128 DEFAULT CHARSET=binary
1 row in set (0.00 sec)

MariaDB [enwiki]> show indexes from abuse_filter;
+--------------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table        | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+--------------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| abuse_filter |          0 | PRIMARY  |            1 | af_id       | A         |         965 |     NULL | NULL   |      | BTREE      |         |               |
| abuse_filter |          1 | af_user  |            1 | af_user     | A         |         193 |     NULL | NULL   |      | BTREE      |         |               |
| abuse_filter |          1 | af_group |            1 | af_group    | A         |           4 |     NULL | NULL   |      | BTREE      |         |               |
| abuse_filter |          1 | af_group |            2 | af_enabled  | A         |           6 |     NULL | NULL   |      | BTREE      |         |               |
| abuse_filter |          1 | af_group |            3 | af_id       | A         |         965 |     NULL | NULL   |      | BTREE      |         |               |
+--------------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
5 rows in set (0.00 sec)

MariaDB [enwiki]> show indexes from abuse_filter_action;
+---------------------+------------+-----------------+--------------+-----------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table               | Non_unique | Key_name        | Seq_in_index | Column_name     | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+---------------------+------------+-----------------+--------------+-----------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| abuse_filter_action |          0 | PRIMARY         |            1 | afa_filter      | A         |         695 |     NULL | NULL   |      | BTREE      |         |               |
| abuse_filter_action |          0 | PRIMARY         |            2 | afa_consequence | A         |         695 |     NULL | NULL   |      | BTREE      |         |               |
| abuse_filter_action |          1 | afa_consequence |            1 | afa_consequence | A         |          10 |     NULL | NULL   |      | BTREE      |         |               |
+---------------------+------------+-----------------+--------------+-----------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
3 rows in set (0.00 sec)

MariaDB [enwiki]> show indexes from abuse_filter_history;
+----------------------+------------+---------------+--------------+---------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table                | Non_unique | Key_name      | Seq_in_index | Column_name   | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+----------------------+------------+---------------+--------------+---------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| abuse_filter_history |          0 | PRIMARY       |            1 | afh_id        | A         |       15900 |     NULL | NULL   |      | BTREE      |         |               |
| abuse_filter_history |          1 | afh_filter    |            1 | afh_filter    | A         |        1766 |     NULL | NULL   |      | BTREE      |         |               |
| abuse_filter_history |          1 | afh_user      |            1 | afh_user      | A         |         378 |     NULL | NULL   |      | BTREE      |         |               |
| abuse_filter_history |          1 | afh_user_text |            1 | afh_user_text | A         |         387 |     NULL | NULL   |      | BTREE      |         |               |
| abuse_filter_history |          1 | afh_timestamp |            1 | afh_timestamp | A         |       15900 |     NULL | NULL   |      | BTREE      |         |               |
+----------------------+------------+---------------+--------------+---------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
5 rows in set (0.00 sec)

MariaDB [enwiki]> show indexes from abuse_filter_log;
+------------------+------------+-----------------------+--------------+---------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table            | Non_unique | Key_name              | Seq_in_index | Column_name   | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+------------------+------------+-----------------------+--------------+---------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| abuse_filter_log |          0 | PRIMARY               |            1 | afl_id        | A         |    26494296 |     NULL | NULL   |      | BTREE      |         |               |
| abuse_filter_log |          1 | afl_timestamp         |            1 | afl_timestamp | A         |    26494296 |     NULL | NULL   |      | BTREE      |         |               |
| abuse_filter_log |          1 | afl_rev_id            |            1 | afl_rev_id    | A         |    13247148 |     NULL | NULL   | YES  | BTREE      |         |               |
| abuse_filter_log |          1 | user_timestamp        |            1 | afl_user      | A         |     5298859 |     NULL | NULL   |      | BTREE      |         |               |
| abuse_filter_log |          1 | user_timestamp        |            2 | afl_user_text | A         |    26494296 |     NULL | NULL   |      | BTREE      |         |               |
| abuse_filter_log |          1 | user_timestamp        |            3 | afl_timestamp | A         |    26494296 |     NULL | NULL   |      | BTREE      |         |               |
| abuse_filter_log |          1 | filter_timestamp      |            1 | afl_filter    | A         |       93289 |     NULL | NULL   |      | BTREE      |         |               |
| abuse_filter_log |          1 | filter_timestamp      |            2 | afl_timestamp | A         |    26494296 |     NULL | NULL   |      | BTREE      |         |               |
| abuse_filter_log |          1 | page_timestamp        |            1 | afl_namespace | A         |        7644 |     NULL | NULL   |      | BTREE      |         |               |
| abuse_filter_log |          1 | page_timestamp        |            2 | afl_title     | A         |     8831432 |     NULL | NULL   |      | BTREE      |         |               |
| abuse_filter_log |          1 | page_timestamp        |            3 | afl_timestamp | A         |    26494296 |     NULL | NULL   |      | BTREE      |         |               |
| abuse_filter_log |          1 | ip_timestamp          |            1 | afl_ip        | A         |     1655893 |     NULL | NULL   |      | BTREE      |         |               |
| abuse_filter_log |          1 | ip_timestamp          |            2 | afl_timestamp | A         |    26494296 |     NULL | NULL   |      | BTREE      |         |               |
| abuse_filter_log |          1 | wiki_timestamp        |            1 | afl_wiki      | A         |           2 |     NULL | NULL   | YES  | BTREE      |         |               |
| abuse_filter_log |          1 | wiki_timestamp        |            2 | afl_timestamp | A         |    26494296 |     NULL | NULL   |      | BTREE      |         |               |
| abuse_filter_log |          1 | filter_timestamp_full |            1 | afl_global    | A         |           2 |     NULL | NULL   |      | BTREE      |         |               |
| abuse_filter_log |          1 | filter_timestamp_full |            2 | afl_filter_id | A         |           2 |     NULL | NULL   |      | BTREE      |         |               |
| abuse_filter_log |          1 | filter_timestamp_full |            3 | afl_timestamp | A         |    26494296 |     NULL | NULL   |      | BTREE      |         |               |
+------------------+------------+-----------------------+--------------+---------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
18 rows in set (0.01 sec)

MariaDB [enwiki]>

My abuse filter tables as is:

Applying https://gerrit.wikimedia.org/r/#/c/mediawiki/extensions/AbuseFilter/+/593906/ and I get this:

...abuse_filter table already exists.
...abuse_filter_history table already exists.
...have afh_changed_fields field in abuse_filter_history table.
...have af_deleted field in abuse_filter table.
...have af_actions field in abuse_filter table.
...have af_global field in abuse_filter table.
...have afl_rev_id field in abuse_filter_log table.
...have af_group field in abuse_filter table.
Adding index afl_wiki_timestamp to table abuse_filter_log ...done.
...afl_namespace in table abuse_filter_log already modified by patch /var/www/wiki/mediawiki/extensions/AbuseFilter/db_patches/patch-afl-namespace_int.sql.
...abuse_filter_log table does not contain afl_log_id field.
Adding index afl_filter_timestamp_full to table abuse_filter_log ...Wikimedia\Rdbms\DBQueryError from line 1679 of /var/www/wiki/mediawiki/core/includes/libs/rdbms/database/Database.php: A database query error has occurred. Did you forget to run your application's database schema updater after upgrading? 
Query: ALTER TABLE `mw_abuse_filter_log`
 ADD COLUMN afl_global tinyint(1) NOT NULL DEFAULT 0 AFTER afl_filter,
 ADD COLUMN afl_filter_id BIGINT unsigned NOT NULL DEFAULT 0 AFTER afl_global,
 ALTER COLUMN afl_filter SET DEFAULT ''

Function: Wikimedia\Rdbms\Database::sourceFile( /var/www/wiki/mediawiki/extensions/AbuseFilter/db_patches/patch-split-afl_filter.sql )
Error: 1060 Duplicate column name 'afl_global' (10.13.37.212:3306)

#0 /var/www/wiki/mediawiki/core/includes/libs/rdbms/database/Database.php(1663): Wikimedia\Rdbms\Database->getQueryException('Duplicate colum...', 1060, 'ALTER TABLE `mw...', 'Wikimedia\\Rdbms...')
#1 /var/www/wiki/mediawiki/core/includes/libs/rdbms/database/Database.php(1640): Wikimedia\Rdbms\Database->getQueryExceptionAndLog('Duplicate colum...', 1060, 'ALTER TABLE `mw...', 'Wikimedia\\Rdbms...')
#2 /var/www/wiki/mediawiki/core/includes/libs/rdbms/database/Database.php(1215): Wikimedia\Rdbms\Database->reportQueryError('Duplicate colum...', 1060, 'ALTER TABLE `mw...', 'Wikimedia\\Rdbms...', false)
#3 /var/www/wiki/mediawiki/core/includes/libs/rdbms/database/Database.php(4812): Wikimedia\Rdbms\Database->query('ALTER TABLE `mw...', 'Wikimedia\\Rdbms...')
#4 /var/www/wiki/mediawiki/core/includes/libs/rdbms/database/Database.php(4747): Wikimedia\Rdbms\Database->sourceStream(Resource id #1517, NULL, NULL, 'Wikimedia\\Rdbms...', NULL)
#5 /var/www/wiki/mediawiki/core/includes/libs/rdbms/database/DBConnRef.php(68): Wikimedia\Rdbms\Database->sourceFile('/var/www/wiki/m...')
#6 /var/www/wiki/mediawiki/core/includes/libs/rdbms/database/MaintainableDBConnRef.php(35): Wikimedia\Rdbms\DBConnRef->__call('sourceFile', Array)
#7 /var/www/wiki/mediawiki/core/includes/installer/DatabaseUpdater.php(723): Wikimedia\Rdbms\MaintainableDBConnRef->sourceFile('/var/www/wiki/m...')
#8 /var/www/wiki/mediawiki/core/includes/installer/DatabaseUpdater.php(825): DatabaseUpdater->applyPatch('/var/www/wiki/m...', true, 'Adding index af...')
#9 /var/www/wiki/mediawiki/core/includes/installer/DatabaseUpdater.php(509): DatabaseUpdater->addIndex('abuse_filter_lo...', 'afl_filter_time...', '/var/www/wiki/m...', true)
#10 /var/www/wiki/mediawiki/core/includes/installer/DatabaseUpdater.php(477): DatabaseUpdater->runUpdates(Array, true)
#11 /var/www/wiki/mediawiki/core/maintenance/update.php(179): DatabaseUpdater->doUpdates(Array)
#12 /var/www/wiki/mediawiki/core/maintenance/doMaintenance.php(105): UpdateMediaWiki->execute()
#13 /var/www/wiki/mediawiki/core/maintenance/update.php(252): require_once('/var/www/wiki/m...')
#14 {main}

Change 593912 merged by jenkins-bot:
[mediawiki/extensions/AbuseFilter@master] Remove some SQLite specific files

https://gerrit.wikimedia.org/r/593912

Change 593912 merged by jenkins-bot:
[mediawiki/extensions/AbuseFilter@master] Remove some SQLite specific files

https://gerrit.wikimedia.org/r/593912

@Daimona
patch-drop-wiki_timestamp-index.sql removes the wiki_timestamp index from the abuse_filter_log table, but abusefilter.tables.sql still includes that index and doesn't include afl_wiki_timestamp

From recent (unrelated) patchdemo logs:

Creating abuse_filter table ...done.
...abuse_filter_history table already exists.
...[...]
...have af_group field in abuse_filter table.
Dropping wiki_timestamp index from table abuse_filter_log ...done.
Adding index afl_wiki_timestamp to table abuse_filter_log ...done.

those last two shouldn't be needed, right? They should be part of the original table?

Crap... The commit message of my patch says

For MySQL it was renamed in Ic1252efe9f96743d9402fa31a7b2dca1f57ff6ae

but this was only true for PS2, as the rename was removed in PS3. So that patch has removed an index from the table without replacement. First, I'm going to immediately revert that patch on the 1.35 branch. Second, on master this doesn't have to be reverted, as it can be fixed with https://gerrit.wikimedia.org/r/c/mediawiki/extensions/AbuseFilter/+/593906

Second, on master this doesn't have to be reverted, as it can be fixed with https://gerrit.wikimedia.org/r/c/mediawiki/extensions/AbuseFilter/+/593906

Correction, r593906 already deletes renames the wiki_timestamp index together with others, so it can (and should) be reverted on master, too.

Change 593906 merged by jenkins-bot:

[mediawiki/extensions/AbuseFilter@master] Give MySQL indexes explicit names, align MySQL and SQLite

https://gerrit.wikimedia.org/r/593906