Page MenuHomePhabricator

wikibase: synchronize schema on production with what is created on install
Closed, ResolvedPublic

Description

Synchronize schema on production with what is created on install.

Differences outlined in T85414#4597677
All differences tracked with sub tasks

Event Timeline

JanZerebecki raised the priority of this task from to Needs Triage.
JanZerebecki updated the task description. (Show Details)
JanZerebecki added a subscriber: JanZerebecki.
Lydia_Pintscher triaged this task as Normal priority.Dec 29 2014, 10:03 AM
Lydia_Pintscher set Security to None.
Lydia_Pintscher added subscribers: aude, daniel.
thiemowmde closed this task as Invalid.Dec 7 2017, 3:49 PM
mark reopened this task as Open.May 25 2018, 9:08 AM
mark raised the priority of this task from Normal to High.
mark added a project: DBA.

The only sub task is currently for tmp1, the next step for this task would be to go through all of the other schemas and check for other differences.

Addshore updated the task description. (Show Details)Aug 20 2018, 9:59 AM
Marostegui renamed this task from synchronize schema on production with what is created on install to wikibase: synchronize schema on production with what is created on install.Sep 10 2018, 3:13 PM

So I just diffed the tables and indexes on production wikidatawiki vs my local fresh install, the result can be found @ https://www.diffchecker.com/BGjXnosI.
The only differences are on the wb_terms indexes:

On prod:

+----------+------------+---------------------+--------------+---------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table    | Non_unique | Key_name            | Seq_in_index | Column_name         | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+----------+------------+---------------------+--------------+---------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| wb_terms |          0 | PRIMARY             |            1 | term_row_id         | A         |  1717619563 |     NULL | NULL   |      | BTREE      |         |               |
| wb_terms |          1 | wb_terms_entity_id  |            1 | term_entity_id      | A         |   156147233 |     NULL | NULL   |      | BTREE      |         |               |
| wb_terms |          1 | wb_terms_text       |            1 | term_text           | A         |   429404890 |     NULL | NULL   |      | BTREE      |         |               |
| wb_terms |          1 | wb_terms_search_key |            1 | term_search_key     | A         |   156147233 |     NULL | NULL   |      | BTREE      |         |               |
| wb_terms |          1 | term_full_entity    |            1 | term_full_entity_id | A         |    85880978 |     NULL | NULL   | YES  | BTREE      |         |               |
| wb_terms |          1 | term_search_full    |            1 | term_language       | A         |     6580917 |     NULL | NULL   |      | BTREE      |         |               |
| wb_terms |          1 | term_search_full    |            2 | term_full_entity_id | A         |  1717619563 |     NULL | NULL   | YES  | BTREE      |         |               |
| wb_terms |          1 | term_search_full    |            3 | term_type           | A         |  1717619563 |     NULL | NULL   |      | BTREE      |         |               |
| wb_terms |          1 | term_search_full    |            4 | term_search_key     | A         |  1717619563 |       16 | NULL   |      | BTREE      |         |               |
| wb_terms |          1 | tmp1                |            1 | term_language       | A         |     6982193 |     NULL | NULL   |      | BTREE      |         |               |
| wb_terms |          1 | tmp1                |            2 | term_type           | A         |     6709451 |     NULL | NULL   |      | BTREE      |         |               |
| wb_terms |          1 | tmp1                |            3 | term_entity_type    | A         |     8718881 |     NULL | NULL   |      | BTREE      |         |               |
| wb_terms |          1 | tmp1                |            4 | term_search_key     | A         |   429404890 |     NULL | NULL   |      | BTREE      |         |               |
+----------+------------+---------------------+--------------+---------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
13 rows in set (0.00 sec)

On my local install:

MariaDB [default]> show index from wb_terms;
+----------+------------+------------------+--------------+---------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table    | Non_unique | Key_name         | Seq_in_index | Column_name         | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+----------+------------+------------------+--------------+---------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| wb_terms |          0 | PRIMARY          |            1 | term_row_id         | A         |           4 |     NULL | NULL   |      | BTREE      |         |               |
| wb_terms |          1 | term_full_entity |            1 | term_full_entity_id | A         |           4 |     NULL | NULL   | YES  | BTREE      |         |               |
| wb_terms |          1 | term_text        |            1 | term_text           | A         |           4 |     NULL | NULL   |      | BTREE      |         |               |
| wb_terms |          1 | term_text        |            2 | term_language       | A         |           4 |     NULL | NULL   |      | BTREE      |         |               |
| wb_terms |          1 | term_search_key  |            1 | term_search_key     | A         |           4 |     NULL | NULL   |      | BTREE      |         |               |
| wb_terms |          1 | term_search_key  |            2 | term_language       | A         |           4 |     NULL | NULL   |      | BTREE      |         |               |
| wb_terms |          1 | term_search_full |            1 | term_language       | A         |           4 |     NULL | NULL   |      | BTREE      |         |               |
| wb_terms |          1 | term_search_full |            2 | term_full_entity_id | A         |           4 |     NULL | NULL   | YES  | BTREE      |         |               |
| wb_terms |          1 | term_search_full |            3 | term_type           | A         |           4 |     NULL | NULL   |      | BTREE      |         |               |
| wb_terms |          1 | term_search_full |            4 | term_search_key     | A         |           4 |       16 | NULL   |      | BTREE      |         |               |
+----------+------------+------------------+--------------+---------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
10 rows in set (0.001 sec)

Summary:

  • Things that are the same:
    • PRIMARY is the same :)
    • term_full_entity is the same :)
    • term_search_full is the same :)
  • Things only in prod:
    • wb_terms_entity_id is only in prod
    • wb_terms_text is only in prod
    • wb_terms_search_key is only in prod
    • tmp1 is only in prod (already tracked as T202265)
  • Things that are only local:
    • term_text is only on my local install
    • term_search_key is only on my local install

So the differences:

  • tmp1 we are already tracking as T202265
  • indexes that sound similar but appear different, locally they both also cover term_language, on prod they do not:
    • term_text and wb_terms_text sound similar but differ
    • term_search_key and wb_terms_search_key sound similar but differ
  • wb_terms_entity_id is only on prod, no sign of anything resembling it locally.

A quick check about unused indexes on that table reports that only wb_terms_entity_id and wb_terms_search_key aren't in use:

root@neodymium:/home/marostegui# ./section s8 | grep codfw | egrep -v "dbstore|db2094" | while read host port; do echo "$host:$port"; mysql.py -h$host:$port sys -e "select * from schema_unused_indexes where object_name='wb_terms';";done
db2086.codfw.wmnet:3318
object_schema	object_name	index_name
wikidatawiki	wb_terms	wb_terms_entity_id
wikidatawiki	wb_terms	wb_terms_search_key
db2085.codfw.wmnet:3318
object_schema	object_name	index_name
wikidatawiki	wb_terms	wb_terms_entity_id
wikidatawiki	wb_terms	wb_terms_search_key
db2083.codfw.wmnet:3306
object_schema	object_name	index_name
wikidatawiki	wb_terms	wb_terms_entity_id
wikidatawiki	wb_terms	wb_terms_search_key
db2082.codfw.wmnet:3306
object_schema	object_name	index_name
wikidatawiki	wb_terms	wb_terms_entity_id
wikidatawiki	wb_terms	wb_terms_search_key
db2081.codfw.wmnet:3306
object_schema	object_name	index_name
wikidatawiki	wb_terms	wb_terms_entity_id
wikidatawiki	wb_terms	wb_terms_search_key
db2080.codfw.wmnet:3306
object_schema	object_name	index_name
wikidatawiki	wb_terms	wb_terms_entity_id
wikidatawiki	wb_terms	wb_terms_search_key
db2079.codfw.wmnet:3306
object_schema	object_name	index_name
wikidatawiki	wb_terms	wb_terms_entity_id
wikidatawiki	wb_terms	wb_terms_search_key
db2045.codfw.wmnet:3306
object_schema	object_name	index_name
wikidatawiki	wb_terms	wb_terms_entity_id
wikidatawiki	wb_terms	wb_terms_search_key

All of the sub tasks have been created.
Once all complete we should compare the production vs install schema again and make sure we didn't miss anything before closing this.

wb_terms_entity_id only uses the "old", numeric-only term_entity_id column, hence it probably has been kind of replaced by term_full_entity index.

As mentioned above by @Marostegui it dos not seem used, so I would guess it would be fine to just drop it from production.
But sure, let's re-add to the git repo, and have it removed once we've dropped from the production DBs. This seems a sane way of proceeding to me.

wb_terms_entity_id only uses the "old", numeric-only term_entity_id column, hence it probably has been kind of replaced by term_full_entity index.
As mentioned above by @Marostegui it dos not seem used, so I would guess it would be fine to just drop it from production.
But sure, let's re-add to the git repo, and have it removed once we've dropped from the production DBs. This seems a sane way of proceeding to me.

I am not sure if it is worth the time and specially the risk, if wb_terms is going to disappear at some point

Restricted Application added a project: User-Addshore. · View Herald TranscriptNov 14 2018, 1:22 PM