Page MenuHomePhabricator

cl_collation index redundant
Closed, ResolvedPublic

Description

mysql wmf db1051 root enwiki> SELECT s.table_name,s.index_name,rows_read FROM information_schema.statistics s LEFT JOIN information_schema. index_statistics  i ON (i.table_schema=s.table_schema AND i.table_name=s.table_name AND i.index_name=s.index_name) WHERE s.table_schema='enwiki' and s.table_name = 'categorylinks' AND  seq_in_index=1;
+---------------+--------------+------------+
| table_name    | index_name   | rows_read  |
+---------------+--------------+------------+
| categorylinks | cl_sortkey   |  780279158 |
| categorylinks | cl_from      | 2135832387 |
| categorylinks | cl_timestamp |  150744319 |
| categorylinks | cl_collation |       NULL | <-- never used
+---------------+--------------+------------+
4 rows in set (0.26 sec)

Same result appeared for various slaves and wikis across all WMF clusters.

cl_collation could be dropped.

Or could it? Does the updateCollation batch job need it?

Details

Reference
bz57184

Event Timeline

bzimport raised the priority of this task from to Low.Nov 22 2014, 2:19 AM
bzimport added a project: Wikimedia-Rdbms.
bzimport set Reference to bz57184.
Springle created this task.Nov 18 2013, 2:12 PM
matmarex updated the task description. (Show Details)Feb 7 2015, 10:32 PM
matmarex set Security to None.

@tstarling had said in rMWa43f751cf6b6849e:

Mark the cl_collation index as needing deletion, it was always pretty
pointless. You can't do much better than a full table scan when you're
changing the collation value on a wiki.

That commit did not actually drop the index though.

Krenair added a subscriber: Krenair.
Restricted Application added a subscriber: Aklapper. · View Herald TranscriptSep 2 2015, 2:40 PM
This comment was removed by Krenair.
Krenair removed Springle as the assignee of this task.Oct 4 2015, 3:03 AM
Danny_B moved this task from Unsorted to Cleanup on the Schema-change board.May 4 2016, 3:09 AM

Per T130692, the cl_collation_ext index was added to Wikimedia databases for use by updateCollation.php. @jcrespo, has the old cl_collation index been removed everywhere, including on the sites that the new index had already been added to? If so, this task can be closed.

except on one server where I forgot to run this query (and I am running it now) I executed on all of them: "alter table categorylinks DROP INDEX cl_collation, ADD INDEX cl_collation_ext (cl_collation, cl_to, cl_type, cl_from)", with this result:

CREATE TABLE `categorylinks` (
  `cl_from` int(8) unsigned NOT NULL DEFAULT '0',
  `cl_to` varbinary(255) NOT NULL DEFAULT '',
  `cl_sortkey` varbinary(230) NOT NULL DEFAULT '',
  `cl_timestamp` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  `cl_sortkey_prefix` varbinary(255) NOT NULL DEFAULT '',
  `cl_collation` varbinary(32) NOT NULL DEFAULT '',
  `cl_type` enum('page','subcat','file') NOT NULL DEFAULT 'page',
  UNIQUE KEY `cl_from` (`cl_from`,`cl_to`),
  KEY `cl_timestamp` (`cl_to`,`cl_timestamp`),
  KEY `cl_sortkey` (`cl_to`,`cl_type`,`cl_sortkey`,`cl_from`),
  KEY `cl_collation_ext` (`cl_collation`,`cl_to`,`cl_type`,`cl_from`)
) ENGINE=InnoDB DEFAULT CHARSET=binary
matmarex closed this task as Resolved.Jun 2 2016, 7:52 PM
matmarex assigned this task to jcrespo.

Per above.