Page MenuHomePhabricator

page_restrictions indexes have been majestically drifting from code
Closed, ResolvedPublic

Description

The report gives a long list of drifts (raw report below).

A db in master has indexes like this (my localhost):

MariaDB [client]> show indexes from page_restrictions;
+-------------------+------------+--------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table             | Non_unique | Key_name     | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-------------------+------------+--------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| page_restrictions |          0 | PRIMARY      |            1 | pr_id       | A         |           0 |     NULL | NULL   |      | BTREE      |         |               |
| page_restrictions |          0 | pr_pagetype  |            1 | pr_page     | A         |           0 |     NULL | NULL   |      | BTREE      |         |               |
| page_restrictions |          0 | pr_pagetype  |            2 | pr_type     | A         |           0 |     NULL | NULL   |      | BTREE      |         |               |
| page_restrictions |          1 | pr_typelevel |            1 | pr_type     | A         |           0 |     NULL | NULL   |      | BTREE      |         |               |
| page_restrictions |          1 | pr_typelevel |            2 | pr_level    | A         |           0 |     NULL | NULL   |      | BTREE      |         |               |
| page_restrictions |          1 | pr_level     |            1 | pr_level    | A         |           0 |     NULL | NULL   |      | BTREE      |         |               |
| page_restrictions |          1 | pr_cascade   |            1 | pr_cascade  | A         |           0 |     NULL | NULL   |      | BTREE      |         |               |
+-------------------+------------+--------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
7 rows in set (0.001 sec)

In production, it's like this:

wikiadmin@10.64.0.214(enwiki)> show indexes from page_restrictions;
+-------------------+------------+--------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table             | Non_unique | Key_name     | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-------------------+------------+--------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| page_restrictions |          0 | PRIMARY      |            1 | pr_page     | A         |      142725 |     NULL | NULL   |      | BTREE      |         |               |
| page_restrictions |          0 | PRIMARY      |            2 | pr_type     | A         |      142725 |     NULL | NULL   |      | BTREE      |         |               |
| page_restrictions |          0 | pr_id        |            1 | pr_id       | A         |      142725 |     NULL | NULL   |      | BTREE      |         |               |
| page_restrictions |          1 | pr_page      |            1 | pr_page     | A         |      142725 |     NULL | NULL   |      | BTREE      |         |               |
| page_restrictions |          1 | pr_typelevel |            1 | pr_type     | A         |           4 |     NULL | NULL   |      | BTREE      |         |               |
| page_restrictions |          1 | pr_typelevel |            2 | pr_level    | A         |          18 |     NULL | NULL   |      | BTREE      |         |               |
| page_restrictions |          1 | pr_level     |            1 | pr_level    | A         |           6 |     NULL | NULL   |      | BTREE      |         |               |
| page_restrictions |          1 | pr_cascade   |            1 | pr_cascade  | A         |           2 |     NULL | NULL   |      | BTREE      |         |               |
+-------------------+------------+--------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
8 rows in set (0.00 sec)

(It's not like this everywhere, s8 doesn't have this issue)

From what I understand it's the issue of PK not being pr_id but being pr_page, pr_type instead (and pr_page,pr_type should have its own unique index named pr_pagetype)


The raw report:

	    "page_restrictions pr_id index-mismatch-prod-extra": {
	        "s1": [
	            "db1083.eqiad.wmnet",
	            "db1106.eqiad.wmnet",
	            "db1105.eqiad.wmnet",
	            "db1119.eqiad.wmnet",
	            "db1089.eqiad.wmnet",
	            "db1107.eqiad.wmnet",
	            "db1099.eqiad.wmnet",
	            "db1118.eqiad.wmnet",
	            "db1134.eqiad.wmnet",
	            "db1091.eqiad.wmnet"
	        ],
	        "s2": [
	            "db1122.eqiad.wmnet",
	            "db1129.eqiad.wmnet",
	            "db1090.eqiad.wmnet",
	            "db1074.eqiad.wmnet",
	            "db1146.eqiad.wmnet",
	            "db1076.eqiad.wmnet",
	            "db1105.eqiad.wmnet"
	        ],
	        "s3": [
	            "db1123.eqiad.wmnet",
	            "db1078.eqiad.wmnet",
	            "db1112.eqiad.wmnet",
	            "db1075.eqiad.wmnet"
	        ],
	        "s4": [
	            "db1081.eqiad.wmnet",
	            "db1084.eqiad.wmnet",
	            "db1141.eqiad.wmnet",
	            "db1146.eqiad.wmnet",
	            "db1121.eqiad.wmnet",
	            "db1138.eqiad.wmnet",
	            "db1143.eqiad.wmnet",
	            "db1142.eqiad.wmnet",
	            "db1144.eqiad.wmnet",
	            "db1147.eqiad.wmnet",
	            "db1148.eqiad.wmnet",
	            "db1149.eqiad.wmnet"
	        ],
	        "s5": [
	            "db1100.eqiad.wmnet",
	            "db1144.eqiad.wmnet",
	            "db1113.eqiad.wmnet",
	            "db1096.eqiad.wmnet",
	            "db1130.eqiad.wmnet",
	            "db1082.eqiad.wmnet",
	            "db1110.eqiad.wmnet"
	        ],
	        "s6": [
	            "db1131.eqiad.wmnet",
	            "db1088.eqiad.wmnet",
	            "db1098.eqiad.wmnet",
	            "db1096.eqiad.wmnet",
	            "db1113.eqiad.wmnet",
	            "db1093.eqiad.wmnet"
	        ],
	        "s7": [
	            "db1086.eqiad.wmnet",
	            "db1094.eqiad.wmnet",
	            "db1079.eqiad.wmnet",
	            "db1090.eqiad.wmnet",
	            "db1136.eqiad.wmnet",
	            "db1127.eqiad.wmnet",
	            "db1101.eqiad.wmnet",
	            "db1098.eqiad.wmnet"
	        ]
	    },
	    "page_restrictions pr_page index-mismatch-prod-extra": {
	        "s1": [
	            "db1083.eqiad.wmnet",
	            "db1106.eqiad.wmnet",
	            "db1105.eqiad.wmnet",
	            "db1119.eqiad.wmnet",
	            "db1089.eqiad.wmnet",
	            "db1107.eqiad.wmnet",
	            "db1099.eqiad.wmnet",
	            "db1118.eqiad.wmnet",
	            "db1134.eqiad.wmnet",
	            "db1091.eqiad.wmnet"
	        ],
	        "s2": [
	            "db1122.eqiad.wmnet",
	            "db1129.eqiad.wmnet",
	            "db1090.eqiad.wmnet",
	            "db1074.eqiad.wmnet",
	            "db1146.eqiad.wmnet",
	            "db1076.eqiad.wmnet",
	            "db1105.eqiad.wmnet"
	        ],
	        "s3": [
	            "db1123.eqiad.wmnet",
	            "db1078.eqiad.wmnet",
	            "db1112.eqiad.wmnet",
	            "db1075.eqiad.wmnet"
	        ],
	        "s4": [
	            "db1081.eqiad.wmnet",
	            "db1084.eqiad.wmnet",
	            "db1141.eqiad.wmnet",
	            "db1146.eqiad.wmnet",
	            "db1121.eqiad.wmnet",
	            "db1138.eqiad.wmnet",
	            "db1143.eqiad.wmnet",
	            "db1142.eqiad.wmnet",
	            "db1144.eqiad.wmnet",
	            "db1147.eqiad.wmnet",
	            "db1148.eqiad.wmnet",
	            "db1149.eqiad.wmnet"
	        ],
	        "s5": [
	            "db1100.eqiad.wmnet",
	            "db1144.eqiad.wmnet",
	            "db1113.eqiad.wmnet",
	            "db1096.eqiad.wmnet",
	            "db1130.eqiad.wmnet",
	            "db1082.eqiad.wmnet",
	            "db1110.eqiad.wmnet"
	        ],
	        "s6": [
	            "db1131.eqiad.wmnet",
	            "db1088.eqiad.wmnet",
	            "db1098.eqiad.wmnet",
	            "db1096.eqiad.wmnet",
	            "db1113.eqiad.wmnet",
	            "db1093.eqiad.wmnet"
	        ],
	        "s7": [
	            "db1086.eqiad.wmnet",
	            "db1094.eqiad.wmnet",
	            "db1079.eqiad.wmnet",
	            "db1090.eqiad.wmnet",
	            "db1136.eqiad.wmnet",
	            "db1127.eqiad.wmnet",
	            "db1101.eqiad.wmnet",
	            "db1098.eqiad.wmnet"
	        ]
	    },
	    "page_restrictions pr_pagetype index-mismatch-code-extra": {
	        "s1": [
	            "db1083.eqiad.wmnet",
	            "db1106.eqiad.wmnet",
	            "db1105.eqiad.wmnet",
	            "db1119.eqiad.wmnet",
	            "db1089.eqiad.wmnet",
	            "db1107.eqiad.wmnet",
	            "db1099.eqiad.wmnet",
	            "db1118.eqiad.wmnet",
	            "db1134.eqiad.wmnet",
	            "db1091.eqiad.wmnet"
	        ],
	        "s2": [
	            "db1122.eqiad.wmnet",
	            "db1129.eqiad.wmnet",
	            "db1090.eqiad.wmnet",
	            "db1074.eqiad.wmnet",
	            "db1146.eqiad.wmnet",
	            "db1076.eqiad.wmnet",
	            "db1105.eqiad.wmnet"
	        ],
	        "s3": [
	            "db1123.eqiad.wmnet",
	            "db1078.eqiad.wmnet",
	            "db1112.eqiad.wmnet",
	            "db1075.eqiad.wmnet"
	        ],
	        "s4": [
	            "db1081.eqiad.wmnet",
	            "db1084.eqiad.wmnet",
	            "db1141.eqiad.wmnet",
	            "db1146.eqiad.wmnet",
	            "db1121.eqiad.wmnet",
	            "db1138.eqiad.wmnet",
	            "db1143.eqiad.wmnet",
	            "db1142.eqiad.wmnet",
	            "db1144.eqiad.wmnet",
	            "db1147.eqiad.wmnet",
	            "db1148.eqiad.wmnet",
	            "db1149.eqiad.wmnet"
	        ],
	        "s5": [
	            "db1100.eqiad.wmnet",
	            "db1144.eqiad.wmnet",
	            "db1113.eqiad.wmnet",
	            "db1096.eqiad.wmnet",
	            "db1130.eqiad.wmnet",
	            "db1082.eqiad.wmnet",
	            "db1110.eqiad.wmnet"
	        ],
	        "s6": [
	            "db1131.eqiad.wmnet",
	            "db1088.eqiad.wmnet",
	            "db1098.eqiad.wmnet",
	            "db1096.eqiad.wmnet",
	            "db1113.eqiad.wmnet",
	            "db1093.eqiad.wmnet"
	        ],
	        "s7": [
	            "db1086.eqiad.wmnet",
	            "db1094.eqiad.wmnet",
	            "db1079.eqiad.wmnet",
	            "db1090.eqiad.wmnet",
	            "db1136.eqiad.wmnet",
	            "db1127.eqiad.wmnet",
	            "db1101.eqiad.wmnet",
	            "db1098.eqiad.wmnet"
	        ]
	    },

ALTER TABLE:

alter table page_restrictions drop primary key, add primary key (pr_id), drop key pr_id, add UNIQUE KEY pr_pagetype (pr_page,pr_type), drop key pr_page;

Schema change progress:

  • s8 (not needed, schema was correct)
  • labswiki (not needed, schema was correct)
  • labtestwiki (not needed, schema was correct)

Related Objects

View Standalone Graph
This task is connected to more than 200 other tasks. Only direct parents and subtasks are shown here. Use View Standalone Graph to show more of the graph.

Event Timeline

Restricted Application added a subscriber: Aklapper. · View Herald TranscriptJun 29 2020, 8:15 PM
Marostegui triaged this task as Medium priority.Jun 30 2020, 4:43 AM
Marostegui moved this task from Triage to Backlog on the DBA board.
Marostegui moved this task from Backlog to Next on the DBA board.Jul 22 2020, 8:22 AM

Wow, this is a big mess.
I have compared a few wikis and the issues are indeed the PK being the UNIQUE and the UNIQUE being the PK on most wikis except wikidatawiki and the new ones.
There's also an extra index there that isn't on code, which is pr_page.

# mysql.py -hdb1089 -A enwiki -e "show create table page_restrictions\G"
*************************** 1. row ***************************
       Table: page_restrictions
Create Table: CREATE TABLE `page_restrictions` (
  `pr_page` int(8) NOT NULL DEFAULT '0',
  `pr_type` varbinary(255) NOT NULL DEFAULT '',
  `pr_level` varbinary(255) NOT NULL DEFAULT '',
  `pr_cascade` tinyint(4) NOT NULL DEFAULT '0',
  `pr_user` int(10) unsigned DEFAULT NULL,
  `pr_expiry` varbinary(14) DEFAULT NULL,
  `pr_id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  PRIMARY KEY (`pr_page`,`pr_type`),
  UNIQUE KEY `pr_id` (`pr_id`),
  KEY `pr_page` (`pr_page`),
  KEY `pr_typelevel` (`pr_type`,`pr_level`),
  KEY `pr_level` (`pr_level`),
  KEY `pr_cascade` (`pr_cascade`)
) ENGINE=InnoDB AUTO_INCREMENT=858962 DEFAULT CHARSET=binary ROW_FORMAT=COMPRESSED

And on a recently created wiki (or wikidata):

mysql.py -hdb1109 -A wikidatawiki -e "show create table page_restrictions\G"
*************************** 1. row ***************************
       Table: page_restrictions
Create Table: CREATE TABLE `page_restrictions` (
  `pr_page` int(11) NOT NULL,
  `pr_type` varbinary(60) NOT NULL,
  `pr_level` varbinary(60) NOT NULL,
  `pr_cascade` tinyint(4) NOT NULL,
  `pr_user` int(10) unsigned DEFAULT NULL,
  `pr_expiry` varbinary(14) DEFAULT NULL,
  `pr_id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  PRIMARY KEY (`pr_id`),
  UNIQUE KEY `pr_pagetype` (`pr_page`,`pr_type`),
  KEY `pr_typelevel` (`pr_type`,`pr_level`),
  KEY `pr_level` (`pr_level`),
  KEY `pr_cascade` (`pr_cascade`)
) ENGINE=InnoDB AUTO_INCREMENT=8292 DEFAULT CHARSET=binary ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=8

This is the alter that would fix all the issues:

alter table page_restrictions drop primary key, add primary key (pr_id), drop key pr_id, add UNIQUE KEY pr_pagetype (pr_page,pr_type), drop key pr_page;

Basically it exchanges the PK and the UNIQUE and drops the pr_page index.
That table is tiny, so I am not too concerned about bad query plans. The table max size is on enwiki and it is just 146k rows.

Marostegui updated the task description. (Show Details)Jul 23 2020, 1:18 PM
Marostegui updated the task description. (Show Details)Jul 23 2020, 1:21 PM
Marostegui updated the task description. (Show Details)Jul 23 2020, 1:58 PM
Marostegui moved this task from Next to In progress on the DBA board.Jul 24 2020, 8:25 AM

s6 eqiad progress

  • labsdb1012
  • labsdb1011
  • labsdb1010
  • labsdb1009
  • dbstore1005
  • db1140
  • db1139
  • db1131
  • db1125
  • db1113
  • db1098
  • db1096
  • db1093
  • db1088
  • db1085
Marostegui updated the task description. (Show Details)Jul 27 2020, 6:22 AM
Marostegui updated the task description. (Show Details)Jul 27 2020, 7:00 AM

Mentioned in SAL (#wikimedia-operations) [2020-07-27T07:00:40Z] <marostegui> Deploy schema change on s5 codfw T256682

s5 eqiad progress

  • labsdb1012
  • labsdb1011
  • labsdb1010
  • labsdb1009
  • dbstore1003
  • db1145
  • db1144
  • db1130
  • db1124
  • db1113
  • db1110
  • db1100
  • db1096
  • db1082
Marostegui updated the task description. (Show Details)Jul 27 2020, 7:52 AM
Marostegui updated the task description. (Show Details)Jul 27 2020, 8:21 AM

s2 eqiad progress

  • labsdb1012
  • labsdb1011
  • labsdb1010
  • labsdb1009
  • dbstore1004
  • db1146
  • db1129
  • db1125
  • db1122
  • db1105
  • db1095
  • db1090
  • db1076
  • db1074
Marostegui updated the task description. (Show Details)Jul 27 2020, 9:11 AM
Marostegui updated the task description. (Show Details)Jul 27 2020, 9:14 AM

s4 eqiad progress

  • labsdb1012
  • labsdb1011
  • labsdb1010
  • labsdb1009
  • dbstore1004
  • db1149
  • db1148
  • db1147
  • db1146
  • db1145
  • db1144
  • db1143
  • db1142
  • db1141
  • db1138
  • db1125
  • db1121
  • db1081
Marostegui updated the task description. (Show Details)Jul 27 2020, 9:22 AM
Marostegui updated the task description. (Show Details)Jul 27 2020, 12:13 PM

s7 eqiad progress

  • labsdb1012
  • labsdb1011
  • labsdb1010
  • labsdb1009
  • dbstore1003
  • db1136
  • db1127
  • db1125
  • db1116
  • db1101
  • db1098
  • db1094
  • db1090
  • db1086
  • db1079
Marostegui updated the task description. (Show Details)Jul 27 2020, 12:24 PM
Marostegui updated the task description. (Show Details)Jul 28 2020, 8:10 AM
Marostegui added a comment.EditedJul 28 2020, 8:18 AM

s1 eqiad progress

  • labsdb1012
  • labsdb1011
  • labsdb1010
  • labsdb1009
  • dbstore1003
  • db1140
  • db1139
  • db1135
  • db1134
  • db1124
  • db1119
  • db1118
  • db1106
  • db1105
  • db1099
  • db1091
  • db1089
  • db1084
  • db1083
Marostegui updated the task description. (Show Details)Jul 28 2020, 8:27 AM

Mentioned in SAL (#wikimedia-operations) [2020-07-28T11:38:24Z] <marostegui> Deploy schema change on s3 codfw, this will generate lag on codfw T256682

Marostegui updated the task description. (Show Details)Jul 28 2020, 12:02 PM
Marostegui added a comment.EditedJul 28 2020, 12:31 PM

s3 eqiad progress

  • labsdb1012
  • labsdb1011
  • labsdb1010
  • labsdb1009
  • dbstore1004
  • db1124
  • db1123
  • db1112
  • db1095
  • db1078
  • db1075
Marostegui closed this task as Resolved.Jul 30 2020, 5:21 AM
Marostegui claimed this task.
Marostegui updated the task description. (Show Details)

All done