Page MenuHomePhabricator

page_restrictions indexes have been majestically drifting from code
Closed, ResolvedPublic

Description

The report gives a long list of drifts (raw report below).

A db in master has indexes like this (my localhost):

MariaDB [client]> show indexes from page_restrictions;
+-------------------+------------+--------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table             | Non_unique | Key_name     | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-------------------+------------+--------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| page_restrictions |          0 | PRIMARY      |            1 | pr_id       | A         |           0 |     NULL | NULL   |      | BTREE      |         |               |
| page_restrictions |          0 | pr_pagetype  |            1 | pr_page     | A         |           0 |     NULL | NULL   |      | BTREE      |         |               |
| page_restrictions |          0 | pr_pagetype  |            2 | pr_type     | A         |           0 |     NULL | NULL   |      | BTREE      |         |               |
| page_restrictions |          1 | pr_typelevel |            1 | pr_type     | A         |           0 |     NULL | NULL   |      | BTREE      |         |               |
| page_restrictions |          1 | pr_typelevel |            2 | pr_level    | A         |           0 |     NULL | NULL   |      | BTREE      |         |               |
| page_restrictions |          1 | pr_level     |            1 | pr_level    | A         |           0 |     NULL | NULL   |      | BTREE      |         |               |
| page_restrictions |          1 | pr_cascade   |            1 | pr_cascade  | A         |           0 |     NULL | NULL   |      | BTREE      |         |               |
+-------------------+------------+--------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
7 rows in set (0.001 sec)

In production, it's like this:

wikiadmin@10.64.0.214(enwiki)> show indexes from page_restrictions;
+-------------------+------------+--------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table             | Non_unique | Key_name     | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-------------------+------------+--------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| page_restrictions |          0 | PRIMARY      |            1 | pr_page     | A         |      142725 |     NULL | NULL   |      | BTREE      |         |               |
| page_restrictions |          0 | PRIMARY      |            2 | pr_type     | A         |      142725 |     NULL | NULL   |      | BTREE      |         |               |
| page_restrictions |          0 | pr_id        |            1 | pr_id       | A         |      142725 |     NULL | NULL   |      | BTREE      |         |               |
| page_restrictions |          1 | pr_page      |            1 | pr_page     | A         |      142725 |     NULL | NULL   |      | BTREE      |         |               |
| page_restrictions |          1 | pr_typelevel |            1 | pr_type     | A         |           4 |     NULL | NULL   |      | BTREE      |         |               |
| page_restrictions |          1 | pr_typelevel |            2 | pr_level    | A         |          18 |     NULL | NULL   |      | BTREE      |         |               |
| page_restrictions |          1 | pr_level     |            1 | pr_level    | A         |           6 |     NULL | NULL   |      | BTREE      |         |               |
| page_restrictions |          1 | pr_cascade   |            1 | pr_cascade  | A         |           2 |     NULL | NULL   |      | BTREE      |         |               |
+-------------------+------------+--------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
8 rows in set (0.00 sec)

(It's not like this everywhere, s8 doesn't have this issue)

From what I understand it's the issue of PK not being pr_id but being pr_page, pr_type instead (and pr_page,pr_type should have its own unique index named pr_pagetype)


The raw report:

	    "page_restrictions pr_id index-mismatch-prod-extra": {
	        "s1": [
	            "db1083.eqiad.wmnet",
	            "db1106.eqiad.wmnet",
	            "db1105.eqiad.wmnet",
	            "db1119.eqiad.wmnet",
	            "db1089.eqiad.wmnet",
	            "db1107.eqiad.wmnet",
	            "db1099.eqiad.wmnet",
	            "db1118.eqiad.wmnet",
	            "db1134.eqiad.wmnet",
	            "db1091.eqiad.wmnet"
	        ],
	        "s2": [
	            "db1122.eqiad.wmnet",
	            "db1129.eqiad.wmnet",
	            "db1090.eqiad.wmnet",
	            "db1074.eqiad.wmnet",
	            "db1146.eqiad.wmnet",
	            "db1076.eqiad.wmnet",
	            "db1105.eqiad.wmnet"
	        ],
	        "s3": [
	            "db1123.eqiad.wmnet",
	            "db1078.eqiad.wmnet",
	            "db1112.eqiad.wmnet",
	            "db1075.eqiad.wmnet"
	        ],
	        "s4": [
	            "db1081.eqiad.wmnet",
	            "db1084.eqiad.wmnet",
	            "db1141.eqiad.wmnet",
	            "db1146.eqiad.wmnet",
	            "db1121.eqiad.wmnet",
	            "db1138.eqiad.wmnet",
	            "db1143.eqiad.wmnet",
	            "db1142.eqiad.wmnet",
	            "db1144.eqiad.wmnet",
	            "db1147.eqiad.wmnet",
	            "db1148.eqiad.wmnet",
	            "db1149.eqiad.wmnet"
	        ],
	        "s5": [
	            "db1100.eqiad.wmnet",
	            "db1144.eqiad.wmnet",
	            "db1113.eqiad.wmnet",
	            "db1096.eqiad.wmnet",
	            "db1130.eqiad.wmnet",
	            "db1082.eqiad.wmnet",
	            "db1110.eqiad.wmnet"
	        ],
	        "s6": [
	            "db1131.eqiad.wmnet",
	            "db1088.eqiad.wmnet",
	            "db1098.eqiad.wmnet",
	            "db1096.eqiad.wmnet",
	            "db1113.eqiad.wmnet",
	            "db1093.eqiad.wmnet"
	        ],
	        "s7": [
	            "db1086.eqiad.wmnet",
	            "db1094.eqiad.wmnet",
	            "db1079.eqiad.wmnet",
	            "db1090.eqiad.wmnet",
	            "db1136.eqiad.wmnet",
	            "db1127.eqiad.wmnet",
	            "db1101.eqiad.wmnet",
	            "db1098.eqiad.wmnet"
	        ]
	    },
	    "page_restrictions pr_page index-mismatch-prod-extra": {
	        "s1": [
	            "db1083.eqiad.wmnet",
	            "db1106.eqiad.wmnet",
	            "db1105.eqiad.wmnet",
	            "db1119.eqiad.wmnet",
	            "db1089.eqiad.wmnet",
	            "db1107.eqiad.wmnet",
	            "db1099.eqiad.wmnet",
	            "db1118.eqiad.wmnet",
	            "db1134.eqiad.wmnet",
	            "db1091.eqiad.wmnet"
	        ],
	        "s2": [
	            "db1122.eqiad.wmnet",
	            "db1129.eqiad.wmnet",
	            "db1090.eqiad.wmnet",
	            "db1074.eqiad.wmnet",
	            "db1146.eqiad.wmnet",
	            "db1076.eqiad.wmnet",
	            "db1105.eqiad.wmnet"
	        ],
	        "s3": [
	            "db1123.eqiad.wmnet",
	            "db1078.eqiad.wmnet",
	            "db1112.eqiad.wmnet",
	            "db1075.eqiad.wmnet"
	        ],
	        "s4": [
	            "db1081.eqiad.wmnet",
	            "db1084.eqiad.wmnet",
	            "db1141.eqiad.wmnet",
	            "db1146.eqiad.wmnet",
	            "db1121.eqiad.wmnet",
	            "db1138.eqiad.wmnet",
	            "db1143.eqiad.wmnet",
	            "db1142.eqiad.wmnet",
	            "db1144.eqiad.wmnet",
	            "db1147.eqiad.wmnet",
	            "db1148.eqiad.wmnet",
	            "db1149.eqiad.wmnet"
	        ],
	        "s5": [
	            "db1100.eqiad.wmnet",
	            "db1144.eqiad.wmnet",
	            "db1113.eqiad.wmnet",
	            "db1096.eqiad.wmnet",
	            "db1130.eqiad.wmnet",
	            "db1082.eqiad.wmnet",
	            "db1110.eqiad.wmnet"
	        ],
	        "s6": [
	            "db1131.eqiad.wmnet",
	            "db1088.eqiad.wmnet",
	            "db1098.eqiad.wmnet",
	            "db1096.eqiad.wmnet",
	            "db1113.eqiad.wmnet",
	            "db1093.eqiad.wmnet"
	        ],
	        "s7": [
	            "db1086.eqiad.wmnet",
	            "db1094.eqiad.wmnet",
	            "db1079.eqiad.wmnet",
	            "db1090.eqiad.wmnet",
	            "db1136.eqiad.wmnet",
	            "db1127.eqiad.wmnet",
	            "db1101.eqiad.wmnet",
	            "db1098.eqiad.wmnet"
	        ]
	    },
	    "page_restrictions pr_pagetype index-mismatch-code-extra": {
	        "s1": [
	            "db1083.eqiad.wmnet",
	            "db1106.eqiad.wmnet",
	            "db1105.eqiad.wmnet",
	            "db1119.eqiad.wmnet",
	            "db1089.eqiad.wmnet",
	            "db1107.eqiad.wmnet",
	            "db1099.eqiad.wmnet",
	            "db1118.eqiad.wmnet",
	            "db1134.eqiad.wmnet",
	            "db1091.eqiad.wmnet"
	        ],
	        "s2": [
	            "db1122.eqiad.wmnet",
	            "db1129.eqiad.wmnet",
	            "db1090.eqiad.wmnet",
	            "db1074.eqiad.wmnet",
	            "db1146.eqiad.wmnet",
	            "db1076.eqiad.wmnet",
	            "db1105.eqiad.wmnet"
	        ],
	        "s3": [
	            "db1123.eqiad.wmnet",
	            "db1078.eqiad.wmnet",
	            "db1112.eqiad.wmnet",
	            "db1075.eqiad.wmnet"
	        ],
	        "s4": [
	            "db1081.eqiad.wmnet",
	            "db1084.eqiad.wmnet",
	            "db1141.eqiad.wmnet",
	            "db1146.eqiad.wmnet",
	            "db1121.eqiad.wmnet",
	            "db1138.eqiad.wmnet",
	            "db1143.eqiad.wmnet",
	            "db1142.eqiad.wmnet",
	            "db1144.eqiad.wmnet",
	            "db1147.eqiad.wmnet",
	            "db1148.eqiad.wmnet",
	            "db1149.eqiad.wmnet"
	        ],
	        "s5": [
	            "db1100.eqiad.wmnet",
	            "db1144.eqiad.wmnet",
	            "db1113.eqiad.wmnet",
	            "db1096.eqiad.wmnet",
	            "db1130.eqiad.wmnet",
	            "db1082.eqiad.wmnet",
	            "db1110.eqiad.wmnet"
	        ],
	        "s6": [
	            "db1131.eqiad.wmnet",
	            "db1088.eqiad.wmnet",
	            "db1098.eqiad.wmnet",
	            "db1096.eqiad.wmnet",
	            "db1113.eqiad.wmnet",
	            "db1093.eqiad.wmnet"
	        ],
	        "s7": [
	            "db1086.eqiad.wmnet",
	            "db1094.eqiad.wmnet",
	            "db1079.eqiad.wmnet",
	            "db1090.eqiad.wmnet",
	            "db1136.eqiad.wmnet",
	            "db1127.eqiad.wmnet",
	            "db1101.eqiad.wmnet",
	            "db1098.eqiad.wmnet"
	        ]
	    },

ALTER TABLE:

alter table page_restrictions drop primary key, add primary key (pr_id), drop key pr_id, add UNIQUE KEY pr_pagetype (pr_page,pr_type), drop key pr_page;

Schema change progress:

  • s8 (not needed, schema was correct)
  • labswiki (not needed, schema was correct)
  • labtestwiki (not needed, schema was correct)

Related Objects

View Standalone Graph
This task is connected to more than 200 other tasks. Only direct parents and subtasks are shown here. Use View Standalone Graph to show more of the graph.

Event Timeline

Marostegui moved this task from Triage to Backlog on the DBA board.

Wow, this is a big mess.
I have compared a few wikis and the issues are indeed the PK being the UNIQUE and the UNIQUE being the PK on most wikis except wikidatawiki and the new ones.
There's also an extra index there that isn't on code, which is pr_page.

# mysql.py -hdb1089 -A enwiki -e "show create table page_restrictions\G"
*************************** 1. row ***************************
       Table: page_restrictions
Create Table: CREATE TABLE `page_restrictions` (
  `pr_page` int(8) NOT NULL DEFAULT '0',
  `pr_type` varbinary(255) NOT NULL DEFAULT '',
  `pr_level` varbinary(255) NOT NULL DEFAULT '',
  `pr_cascade` tinyint(4) NOT NULL DEFAULT '0',
  `pr_user` int(10) unsigned DEFAULT NULL,
  `pr_expiry` varbinary(14) DEFAULT NULL,
  `pr_id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  PRIMARY KEY (`pr_page`,`pr_type`),
  UNIQUE KEY `pr_id` (`pr_id`),
  KEY `pr_page` (`pr_page`),
  KEY `pr_typelevel` (`pr_type`,`pr_level`),
  KEY `pr_level` (`pr_level`),
  KEY `pr_cascade` (`pr_cascade`)
) ENGINE=InnoDB AUTO_INCREMENT=858962 DEFAULT CHARSET=binary ROW_FORMAT=COMPRESSED

And on a recently created wiki (or wikidata):

mysql.py -hdb1109 -A wikidatawiki -e "show create table page_restrictions\G"
*************************** 1. row ***************************
       Table: page_restrictions
Create Table: CREATE TABLE `page_restrictions` (
  `pr_page` int(11) NOT NULL,
  `pr_type` varbinary(60) NOT NULL,
  `pr_level` varbinary(60) NOT NULL,
  `pr_cascade` tinyint(4) NOT NULL,
  `pr_user` int(10) unsigned DEFAULT NULL,
  `pr_expiry` varbinary(14) DEFAULT NULL,
  `pr_id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  PRIMARY KEY (`pr_id`),
  UNIQUE KEY `pr_pagetype` (`pr_page`,`pr_type`),
  KEY `pr_typelevel` (`pr_type`,`pr_level`),
  KEY `pr_level` (`pr_level`),
  KEY `pr_cascade` (`pr_cascade`)
) ENGINE=InnoDB AUTO_INCREMENT=8292 DEFAULT CHARSET=binary ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=8

This is the alter that would fix all the issues:

alter table page_restrictions drop primary key, add primary key (pr_id), drop key pr_id, add UNIQUE KEY pr_pagetype (pr_page,pr_type), drop key pr_page;

Basically it exchanges the PK and the UNIQUE and drops the pr_page index.
That table is tiny, so I am not too concerned about bad query plans. The table max size is on enwiki and it is just 146k rows.

s6 eqiad progress

  • labsdb1012
  • labsdb1011
  • labsdb1010
  • labsdb1009
  • dbstore1005
  • db1140
  • db1139
  • db1131
  • db1125
  • db1113
  • db1098
  • db1096
  • db1093
  • db1088
  • db1085

Mentioned in SAL (#wikimedia-operations) [2020-07-27T07:00:40Z] <marostegui> Deploy schema change on s5 codfw T256682

s5 eqiad progress

  • labsdb1012
  • labsdb1011
  • labsdb1010
  • labsdb1009
  • dbstore1003
  • db1145
  • db1144
  • db1130
  • db1124
  • db1113
  • db1110
  • db1100
  • db1096
  • db1082

s2 eqiad progress

  • labsdb1012
  • labsdb1011
  • labsdb1010
  • labsdb1009
  • dbstore1004
  • db1146
  • db1129
  • db1125
  • db1122
  • db1105
  • db1095
  • db1090
  • db1076
  • db1074

s4 eqiad progress

  • labsdb1012
  • labsdb1011
  • labsdb1010
  • labsdb1009
  • dbstore1004
  • db1149
  • db1148
  • db1147
  • db1146
  • db1145
  • db1144
  • db1143
  • db1142
  • db1141
  • db1138
  • db1125
  • db1121
  • db1081

s7 eqiad progress

  • labsdb1012
  • labsdb1011
  • labsdb1010
  • labsdb1009
  • dbstore1003
  • db1136
  • db1127
  • db1125
  • db1116
  • db1101
  • db1098
  • db1094
  • db1090
  • db1086
  • db1079

s1 eqiad progress

  • labsdb1012
  • labsdb1011
  • labsdb1010
  • labsdb1009
  • dbstore1003
  • db1140
  • db1139
  • db1135
  • db1134
  • db1124
  • db1119
  • db1118
  • db1106
  • db1105
  • db1099
  • db1091
  • db1089
  • db1084
  • db1083

Mentioned in SAL (#wikimedia-operations) [2020-07-28T11:38:24Z] <marostegui> Deploy schema change on s3 codfw, this will generate lag on codfw T256682

s3 eqiad progress

  • labsdb1012
  • labsdb1011
  • labsdb1010
  • labsdb1009
  • dbstore1004
  • db1124
  • db1123
  • db1112
  • db1095
  • db1078
  • db1075
Marostegui claimed this task.
Marostegui updated the task description. (Show Details)

All done