Schema change for adding indexes of ct_tag_id
Closed, ResolvedPublic

Description

After T193874: Add new indexes to change_tag table using ct_tag_id instead of ct_tag It would be great to apply this change

From the checklist at https://wikitech.wikimedia.org/wiki/Schema_changes:

  1. ALTERs to run: https://gerrit.wikimedia.org/r/c/mediawiki/core/+/456027/11/maintenance/archives/patch-change_tag-change_tag_rc_tag_id.sql (with this small fix)
  2. Where to run those changes: all.dblist
  3. When to run those changes: ?
  4. If the schema change is backwards compatible: Yes.
  5. If the schema change has been tested already on some of the test/beta wikis: Tested in beta cluster
  6. if the data should be made available on the labs replicas and/or dumps: Yes, data in this table is public (unless the wiki is private which we don't replicate at all)

Progress:

s1

s2

  • codfw
  • eqiad

s3

  • codfw
  • eqiad

s4

  • codfw
  • eqiad

s5

s6

  • codfw
  • eqiad

s7

  • codfw
  • eqiad

s8

wikitech

  • labsdb
  • labtestwiki

Related Objects

There are a very large number of changes, so older changes are hidden. Show Older Changes

I will try to get this deployed in eqiad before we switch back.

That's awesome, Thank you! Let me know if anything is not working as expected.

I am planning to deploy this first on an active enwiki (codfw) replica, and leave it for a couple of days to make sure nothing gets weird with its reads or writes, before deploying it to eqiad.

Krinkle moved this task from Backlog to Schema on the MediaWiki-Database board.Sep 17 2018, 3:48 PM

Mentioned in SAL (#wikimedia-operations) [2018-09-20T06:09:37Z] <marostegui> Deploy schema change on db2070 - T203709

I have altered db2070 (enwiki) and I will leave it like that for a few days before going ahead on eqiad directly on the master.

Marostegui added a comment.EditedSep 20 2018, 6:38 AM

s1 progress:

  • labsdb1011
  • labsdb1010
  • labsdb1009
  • dbstore2002
  • dbstore1002
  • dbstore1001
  • db2094
  • db2092
  • db2088
  • db2085
  • db2072
  • db2071
  • db2070
  • db2062
  • db2055
  • db2048
  • db1124
  • db1119
  • db1118 (replication broken - test host)
  • db1114
  • db1106
  • db1105
  • db1099
  • db1089
  • db1083
  • db1080
  • db1067
Marostegui updated the task description. (Show Details)Sep 20 2018, 6:39 AM

I have altered db2070 (enwiki) and I will leave it like that for a few days before going ahead on eqiad directly on the master.

I have not seen any issues on this hosts. The graphs looks good. Nothing seems to have changed on its query read patterns. So on Monday I will deploy this change to an API and a Recentchanges host (db2070 is main traffic only) on enwiki too to see if there is any impact and if not, I will start getting this deployed on all eqiad.

Mentioned in SAL (#wikimedia-operations) [2018-09-24T05:28:41Z] <marostegui> Deploy schema change on db2088:3311 - T203709

Marostegui updated the task description. (Show Details)Sep 24 2018, 6:58 AM

I have altered db2070 (enwiki) and I will leave it like that for a few days before going ahead on eqiad directly on the master.

I have not seen any issues on this hosts. The graphs looks good. Nothing seems to have changed on its query read patterns. So on Monday I will deploy this change to an API and a Recentchanges host (db2070 is main traffic only) on enwiki too to see if there is any impact and if not, I will start getting this deployed on all eqiad.

Altered db2088:3311 (recentchanges) and db2062 (API). Let's leave it for a day and see if there is any changes in query pattern before going ahead in eqiad.

Everything looks fine with db2088, db2062 and db2070 query patterns.
I am going to deploy this change on s5 codfw first (leaving eqiad aside) to make sure nothing breaks on eqiad (replication wise) as this is the situation we'll have once we are back on eqiad as active: the alter will be done on the primary master (eqiad) but not on the DC master (codfw).

Marostegui added a comment.EditedSep 24 2018, 1:23 PM

s5 progress:

  • labsdb1011
  • labsdb1010
  • labsdb1009
  • dbstore2001
  • dbstore1002
  • db2094
  • db2089
  • db2084
  • db2075
  • db2066
  • db2059
  • db2052
  • db2038
  • db1124
  • db1113
  • db1110
  • db1102
  • db1100
  • db1097
  • db1096
  • db1082
  • db1070
Marostegui updated the task description. (Show Details)Sep 24 2018, 1:24 PM
Marostegui updated the task description. (Show Details)Sep 25 2018, 6:15 AM

I am going to deploy this change on s5 codfw first (leaving eqiad aside) to make sure nothing breaks on eqiad (replication wise) as this is the situation we'll have once we are back on eqiad as active: the alter will be done on the primary master (eqiad) but not on the DC master (codfw).

This is now done on s5, I am going to leave it running for a few hours to make sure nothing weird, replication-wise, happens on eqiad.

Marostegui added a comment.EditedSep 25 2018, 9:40 AM

So far so good having s5 fully codfw done with the schema change and not eqiad. I will wait a couple of more hours before deploying it on eqiad.

Mentioned in SAL (#wikimedia-operations) [2018-09-25T13:07:07Z] <marostegui> Deploy schema change on s5 eqiad - this might generate lag on s5 eqiad - T203709

Marostegui updated the task description. (Show Details)Sep 25 2018, 1:13 PM

Mentioned in SAL (#wikimedia-operations) [2018-09-25T13:13:26Z] <marostegui> Deploy schema change on s6 eqiad - this might generate lag on s6 eqiad - T203709

Marostegui updated the task description. (Show Details)Sep 25 2018, 1:25 PM

Mentioned in SAL (#wikimedia-operations) [2018-09-25T13:29:51Z] <marostegui> Deploy schema change on s2 eqiad - this might generate lag on s2 eqiad - T203709

Marostegui updated the task description. (Show Details)Sep 25 2018, 1:47 PM

Mentioned in SAL (#wikimedia-operations) [2018-09-26T05:20:11Z] <marostegui> Deploy schema change on s8 eqiad, this will generate lag - T203709

Marostegui updated the task description. (Show Details)Sep 26 2018, 8:14 AM

Mentioned in SAL (#wikimedia-operations) [2018-09-26T09:06:48Z] <marostegui> Deploy schema change on db2092 - T203709

Mentioned in SAL (#wikimedia-operations) [2018-09-26T12:33:14Z] <marostegui> Deploy schema change on s4 eqiad, will generate lag - T203709

Marostegui updated the task description. (Show Details)Sep 26 2018, 12:56 PM

Mentioned in SAL (#wikimedia-operations) [2018-09-26T13:12:34Z] <marostegui> Deploy schema change on db2071 - T203709

Mentioned in SAL (#wikimedia-operations) [2018-09-27T05:41:44Z] <marostegui> Deploy schema change on s1 eqiad master, lag will be generated - T203709

Marostegui updated the task description. (Show Details)Sep 27 2018, 8:09 AM

Mentioned in SAL (#wikimedia-operations) [2018-09-27T08:39:32Z] <marostegui> Deploy schema change on labtestwiki - T203709

Marostegui updated the task description. (Show Details)Sep 27 2018, 8:44 AM

Mentioned in SAL (#wikimedia-operations) [2018-09-27T08:48:49Z] <marostegui> Deploy schema change on labswiki (db1073 master) - T203709

Marostegui updated the task description. (Show Details)Sep 27 2018, 8:51 AM

Mentioned in SAL (#wikimedia-operations) [2018-09-28T05:54:41Z] <marostegui> Deploy schema change on s7 eqiad, this will generate lag - T203709

Mentioned in SAL (#wikimedia-operations) [2018-09-28T06:07:20Z] <marostegui> Deploy schema change on db2055 - T203709

Marostegui updated the task description. (Show Details)Sep 28 2018, 6:13 AM

Mentioned in SAL (#wikimedia-operations) [2018-09-28T08:11:10Z] <marostegui> Deploy schema change on s3 eqiad, this will generate lag - T203709

Marostegui updated the task description. (Show Details)Sep 28 2018, 8:35 AM

This is all done in eqiad. Once we are back in eqiad this will be deployed on codfw.
s1 codfw is done, with the exception of db2048 (master) which I will alter on Monday.

Mentioned in SAL (#wikimedia-operations) [2018-10-01T05:07:23Z] <marostegui> Deploy schema change on s1 codfw msater - T203709

Restricted Application added a project: Wikidata. · View Herald TranscriptOct 1 2018, 5:07 AM
Marostegui updated the task description. (Show Details)Oct 1 2018, 5:35 AM

db2048 (s1 codfw master) done.
Waiting now for eqiad to be back to finish off this.

Mentioned in SAL (#wikimedia-operations) [2018-10-02T10:24:53Z] <marostegui> Deploy schema change on db2092 - T203709

Addshore moved this task from incoming to in progress on the Wikidata board.Oct 9 2018, 9:37 AM

It's not done in s8 eqiad causing this T207313: Some administrative and log actions on Wikidata take longer than 60 seconds and time out on Wikidata:

wikiadmin@10.64.48.172(wikidatawiki)> show index from change_tag;
+------------+------------+----------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table      | Non_unique | Key_name             | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+------------+------------+----------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| change_tag |          0 | PRIMARY              |            1 | ct_id       | A         |   222003497 |     NULL | NULL   |      | BTREE      |         |               |
| change_tag |          0 | change_tag_rc_tag    |            1 | ct_rc_id    | A         |   222003497 |     NULL | NULL   | YES  | BTREE      |         |               |
| change_tag |          0 | change_tag_rc_tag    |            2 | ct_tag      | A         |   222003497 |     NULL | NULL   |      | BTREE      |         |               |
| change_tag |          0 | change_tag_log_tag   |            1 | ct_log_id   | A         |        1060 |     NULL | NULL   | YES  | BTREE      |         |               |
| change_tag |          0 | change_tag_log_tag   |            2 | ct_tag      | A         |       41210 |     NULL | NULL   |      | BTREE      |         |               |
| change_tag |          0 | change_tag_rev_tag   |            1 | ct_rev_id   | A         |   222003497 |     NULL | NULL   | YES  | BTREE      |         |               |
| change_tag |          0 | change_tag_rev_tag   |            2 | ct_tag      | A         |   222003497 |     NULL | NULL   |      | BTREE      |         |               |
| change_tag |          1 | change_tag_tag_id    |            1 | ct_tag      | A         |       30971 |     NULL | NULL   |      | BTREE      |         |               |
| change_tag |          1 | change_tag_tag_id    |            2 | ct_rc_id    | A         |   222003497 |     NULL | NULL   | YES  | BTREE      |         |               |
| change_tag |          1 | change_tag_tag_id    |            3 | ct_rev_id   | A         |   222003497 |     NULL | NULL   | YES  | BTREE      |         |               |
| change_tag |          1 | change_tag_tag_id    |            4 | ct_log_id   | A         |   222003497 |     NULL | NULL   | YES  | BTREE      |         |               |
| change_tag |          1 | change_tag_tag_id_id |            1 | ct_tag_id   | A         |       33591 |     NULL | NULL   | YES  | BTREE      |         |               |
| change_tag |          1 | change_tag_tag_id_id |            2 | ct_rc_id    | A         |   222003497 |     NULL | NULL   | YES  | BTREE      |         |               |
| change_tag |          1 | change_tag_tag_id_id |            3 | ct_rev_id   | A         |   222003497 |     NULL | NULL   | YES  | BTREE      |         |               |
| change_tag |          1 | change_tag_tag_id_id |            4 | ct_log_id   | A         |   222003497 |     NULL | NULL   | YES  | BTREE      |         |               |
+------------+------------+----------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
15 rows in set (0.00 sec)

(Compare it with enwiki):

wikiadmin@10.64.32.115(enwiki)> show index from change_tag;
+------------+------------+----------------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table      | Non_unique | Key_name                   | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+------------+------------+----------------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| change_tag |          0 | PRIMARY                    |            1 | ct_id       | A         |    56055095 |     NULL | NULL   |      | BTREE      |         |               |
| change_tag |          0 | change_tag_rc_tag_id       |            1 | ct_rc_id    | A         |    56055095 |     NULL | NULL   | YES  | BTREE      |         |               |
| change_tag |          0 | change_tag_rc_tag_id       |            2 | ct_tag_id   | A         |    56055095 |     NULL | NULL   | YES  | BTREE      |         |               |
| change_tag |          0 | change_tag_log_tag_id      |            1 | ct_log_id   | A         |     2242203 |     NULL | NULL   | YES  | BTREE      |         |               |
| change_tag |          0 | change_tag_log_tag_id      |            2 | ct_tag_id   | A         |       50363 |     NULL | NULL   | YES  | BTREE      |         |               |
| change_tag |          0 | change_tag_rev_tag_id      |            1 | ct_rev_id   | A         |    56055095 |     NULL | NULL   | YES  | BTREE      |         |               |
| change_tag |          0 | change_tag_rev_tag_id      |            2 | ct_tag_id   | A         |    56055095 |     NULL | NULL   | YES  | BTREE      |         |               |
| change_tag |          1 | change_tag_tag_id          |            1 | ct_tag      | A         |       36423 |     NULL | NULL   |      | BTREE      |         |               |
| change_tag |          1 | change_tag_tag_id          |            2 | ct_rc_id    | A         |    56055095 |     NULL | NULL   | YES  | BTREE      |         |               |
| change_tag |          1 | change_tag_tag_id          |            3 | ct_rev_id   | A         |    56055095 |     NULL | NULL   | YES  | BTREE      |         |               |
| change_tag |          1 | change_tag_tag_id          |            4 | ct_log_id   | A         |    56055095 |     NULL | NULL   | YES  | BTREE      |         |               |
| change_tag |          1 | change_tag_tag_id_id       |            1 | ct_tag_id   | A         |       44629 |     NULL | NULL   | YES  | BTREE      |         |               |
| change_tag |          1 | change_tag_tag_id_id       |            2 | ct_rc_id    | A         |    56055095 |     NULL | NULL   | YES  | BTREE      |         |               |
| change_tag |          1 | change_tag_tag_id_id       |            3 | ct_rev_id   | A         |    56055095 |     NULL | NULL   | YES  | BTREE      |         |               |
| change_tag |          1 | change_tag_tag_id_id       |            4 | ct_log_id   | A         |    56055095 |     NULL | NULL   | YES  | BTREE      |         |               |
| change_tag |          1 | change_tag_rc_tag_nonuniq  |            1 | ct_rc_id    | A         |    56055095 |     NULL | NULL   | YES  | BTREE      |         |               |
| change_tag |          1 | change_tag_rc_tag_nonuniq  |            2 | ct_tag      | A         |    56055095 |     NULL | NULL   |      | BTREE      |         |               |
| change_tag |          1 | change_tag_log_tag_nonuniq |            1 | ct_log_id   | A         |      849319 |     NULL | NULL   | YES  | BTREE      |         |               |
| change_tag |          1 | change_tag_log_tag_nonuniq |            2 | ct_tag      | A         |       39447 |     NULL | NULL   |      | BTREE      |         |               |
| change_tag |          1 | change_tag_rev_tag_nonuniq |            1 | ct_rev_id   | A         |    56055095 |     NULL | NULL   | YES  | BTREE      |         |               |
| change_tag |          1 | change_tag_rev_tag_nonuniq |            2 | ct_tag      | A         |    56055095 |     NULL | NULL   |      | BTREE      |         |               |
+------------+------------+----------------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
21 rows in set (0.00 sec)

FYI. That wasted two days of mine trying to understand what is wrong with the query and trying to find a better query for that

Addshore added a subscriber: Addshore.EditedNov 6 2018, 12:17 PM

@Marostegui Do we have an ETA on these indexes being re added so that we can resolve T208846 ?
I see your away, so I'll ping otherson IRC.

@Marostegui Do we have an ETA on these indexes being re added so that we can resolve T208846 ?
I see your away, so I'll ping otherson IRC.

I am back from holidays today. I will start working on this, hopefully this week once I have caught up

Marostegui added a comment.EditedNov 12 2018, 6:07 AM

This is very weird, I am _pretty_ sure I did s8 eqiad whilst codfw was active.
The fact that this exists on the master, the sanitarium master, sanitarium, and labs confuses me, not sure it this might have been reverted while we were working on the s8 incident (data loss and some tables reconstruction)

s8 progress:

  • labsdb1011
  • labsdb1010
  • labsdb1009
  • dbstore2001
  • dbstore1002
  • db2094
  • db2086
  • db2085
  • db2083
  • db2082
  • db2081
  • db2080
  • db2079
  • db2045
  • db1124
  • db1116
  • db1109
  • db1104
  • db1101
  • db1099
  • db1092
  • db1087
  • db1071
Marostegui updated the task description. (Show Details)Nov 12 2018, 6:08 AM

Change 472944 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/mediawiki-config@master] db-eqiad.php: Depool db1099:3318

https://gerrit.wikimedia.org/r/472944

Change 472944 merged by jenkins-bot:
[operations/mediawiki-config@master] db-eqiad.php: Depool db1099:3318

https://gerrit.wikimedia.org/r/472944

Mentioned in SAL (#wikimedia-operations) [2018-11-12T08:53:46Z] <marostegui> Deploy schema change on db1099:3318 - T203709

Change 472955 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/mediawiki-config@master] db-eqiad.php: Depool db1101:3318

https://gerrit.wikimedia.org/r/472955

Change 472955 merged by jenkins-bot:
[operations/mediawiki-config@master] db-eqiad.php: Depool db1101:3318

https://gerrit.wikimedia.org/r/472955

Mentioned in SAL (#wikimedia-operations) [2018-11-12T13:59:00Z] <marostegui> Deploy schema change on db1101:3318 - T203709

I know realised why the change was "gone" from all the eqiad s8 hosts except db1071, db1087,db1124 and labs. The reason is that the rest of the hosts were recloned from codfw (where the change wasn't deployed) as part of T206743: S8 replication issues leading to rows missing during eqiad -> codfw switch (Was: "A few lexemes disappeared")

Could you check the list of schema changes and maintenance to be ran during switchover to test if they where also undone?

Could you check the list of schema changes and maintenance to be ran during switchover to test if they where also undone?

Yep, I was doing it that now :-)

Change 473152 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/mediawiki-config@master] db-eqiad.php: Depool db1109

https://gerrit.wikimedia.org/r/473152

Change 473152 merged by jenkins-bot:
[operations/mediawiki-config@master] db-eqiad.php: Depool db1109

https://gerrit.wikimedia.org/r/473152

Mentioned in SAL (#wikimedia-operations) [2018-11-13T05:34:53Z] <marostegui> Deploy schema change on db1109 T203709

Change 473158 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/mediawiki-config@master] db-eqiad.php: Depool db1104

https://gerrit.wikimedia.org/r/473158

Change 473158 merged by jenkins-bot:
[operations/mediawiki-config@master] db-eqiad.php: Depool db1104

https://gerrit.wikimedia.org/r/473158

Mentioned in SAL (#wikimedia-operations) [2018-11-13T07:21:15Z] <marostegui> Deploy schema change on db1104 T203709

Change 473165 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/mediawiki-config@master] db-eqiad.php: Depool db1092

https://gerrit.wikimedia.org/r/473165

Change 473165 merged by jenkins-bot:
[operations/mediawiki-config@master] db-eqiad.php: Depool db1092

https://gerrit.wikimedia.org/r/473165

Mentioned in SAL (#wikimedia-operations) [2018-11-13T09:23:49Z] <marostegui> Deploy schema change on db1092 T203709

Mentioned in SAL (#wikimedia-operations) [2018-11-13T10:47:11Z] <marostegui> Deploy schema change on db1116:3318 T203709

@Ladsgroup All done in s8 eqiad core hosts (T203709#4738599). You shouldn't be seeing what you were seeing before

Mentioned in SAL (#wikimedia-operations) [2018-11-13T13:41:14Z] <marostegui> Deploy schema change on s8 codfw master (db2045) this will generate lag on s8 codfw - T203709

Marostegui updated the task description. (Show Details)Nov 13 2018, 2:11 PM
Marostegui updated the task description. (Show Details)Nov 13 2018, 3:19 PM

Mentioned in SAL (#wikimedia-operations) [2018-11-14T06:40:26Z] <marostegui> Deploy schema change on s6 codfw master, this will generate lag on s6 codfw -T203709

Marostegui updated the task description. (Show Details)Wed, Nov 14, 6:49 AM

Mentioned in SAL (#wikimedia-operations) [2018-11-14T06:52:35Z] <marostegui> Deploy schema change on s4 codfw master, this will generate lag on s4 codfw - T203709

Marostegui updated the task description. (Show Details)Wed, Nov 14, 7:06 AM

Mentioned in SAL (#wikimedia-operations) [2018-11-14T07:07:26Z] <marostegui> Deploy schema change on s2 codfw master, this will generate lag on s2 codfw - T203709

Marostegui updated the task description. (Show Details)Wed, Nov 14, 7:18 AM

Mentioned in SAL (#wikimedia-operations) [2018-11-14T07:18:59Z] <marostegui> Deploy schema change on s7 codfw master, this will generate lag on s7 codfw - T203709

Marostegui updated the task description. (Show Details)Wed, Nov 14, 7:41 AM

Mentioned in SAL (#wikimedia-operations) [2018-11-14T07:42:45Z] <marostegui> Deploy schema change on s3 codfw master, this will generate lag on s3 codfw - T203709

Marostegui updated the task description. (Show Details)Wed, Nov 14, 8:02 AM
Marostegui closed this task as Resolved.

This is all done