Synchronize schema on production with what is created on install.
Differences outlined in T85414#4597677
All differences tracked with sub tasks
Synchronize schema on production with what is created on install.
Differences outlined in T85414#4597677
All differences tracked with sub tasks
This being out of whack may eventually have caused a site outage T195520: Multiple projects reporting Cannot access the database: No working replica DB server, see report at https://wikitech.wikimedia.org/wiki/Incident_documentation/20180524-wikidata
The only sub task is currently for tmp1, the next step for this task would be to go through all of the other schemas and check for other differences.
So I just diffed the tables and indexes on production wikidatawiki vs my local fresh install, the result can be found @ https://www.diffchecker.com/BGjXnosI.
The only differences are on the wb_terms indexes:
On prod:
+----------+------------+---------------------+--------------+---------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+ | Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment | +----------+------------+---------------------+--------------+---------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+ | wb_terms | 0 | PRIMARY | 1 | term_row_id | A | 1717619563 | NULL | NULL | | BTREE | | | | wb_terms | 1 | wb_terms_entity_id | 1 | term_entity_id | A | 156147233 | NULL | NULL | | BTREE | | | | wb_terms | 1 | wb_terms_text | 1 | term_text | A | 429404890 | NULL | NULL | | BTREE | | | | wb_terms | 1 | wb_terms_search_key | 1 | term_search_key | A | 156147233 | NULL | NULL | | BTREE | | | | wb_terms | 1 | term_full_entity | 1 | term_full_entity_id | A | 85880978 | NULL | NULL | YES | BTREE | | | | wb_terms | 1 | term_search_full | 1 | term_language | A | 6580917 | NULL | NULL | | BTREE | | | | wb_terms | 1 | term_search_full | 2 | term_full_entity_id | A | 1717619563 | NULL | NULL | YES | BTREE | | | | wb_terms | 1 | term_search_full | 3 | term_type | A | 1717619563 | NULL | NULL | | BTREE | | | | wb_terms | 1 | term_search_full | 4 | term_search_key | A | 1717619563 | 16 | NULL | | BTREE | | | | wb_terms | 1 | tmp1 | 1 | term_language | A | 6982193 | NULL | NULL | | BTREE | | | | wb_terms | 1 | tmp1 | 2 | term_type | A | 6709451 | NULL | NULL | | BTREE | | | | wb_terms | 1 | tmp1 | 3 | term_entity_type | A | 8718881 | NULL | NULL | | BTREE | | | | wb_terms | 1 | tmp1 | 4 | term_search_key | A | 429404890 | NULL | NULL | | BTREE | | | +----------+------------+---------------------+--------------+---------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+ 13 rows in set (0.00 sec)
On my local install:
MariaDB [default]> show index from wb_terms; +----------+------------+------------------+--------------+---------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+ | Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment | +----------+------------+------------------+--------------+---------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+ | wb_terms | 0 | PRIMARY | 1 | term_row_id | A | 4 | NULL | NULL | | BTREE | | | | wb_terms | 1 | term_full_entity | 1 | term_full_entity_id | A | 4 | NULL | NULL | YES | BTREE | | | | wb_terms | 1 | term_text | 1 | term_text | A | 4 | NULL | NULL | | BTREE | | | | wb_terms | 1 | term_text | 2 | term_language | A | 4 | NULL | NULL | | BTREE | | | | wb_terms | 1 | term_search_key | 1 | term_search_key | A | 4 | NULL | NULL | | BTREE | | | | wb_terms | 1 | term_search_key | 2 | term_language | A | 4 | NULL | NULL | | BTREE | | | | wb_terms | 1 | term_search_full | 1 | term_language | A | 4 | NULL | NULL | | BTREE | | | | wb_terms | 1 | term_search_full | 2 | term_full_entity_id | A | 4 | NULL | NULL | YES | BTREE | | | | wb_terms | 1 | term_search_full | 3 | term_type | A | 4 | NULL | NULL | | BTREE | | | | wb_terms | 1 | term_search_full | 4 | term_search_key | A | 4 | 16 | NULL | | BTREE | | | +----------+------------+------------------+--------------+---------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+ 10 rows in set (0.001 sec)
Summary:
So the differences:
A quick check about unused indexes on that table reports that only wb_terms_entity_id and wb_terms_search_key aren't in use:
root@neodymium:/home/marostegui# ./section s8 | grep codfw | egrep -v "dbstore|db2094" | while read host port; do echo "$host:$port"; mysql.py -h$host:$port sys -e "select * from schema_unused_indexes where object_name='wb_terms';";done db2086.codfw.wmnet:3318 object_schema object_name index_name wikidatawiki wb_terms wb_terms_entity_id wikidatawiki wb_terms wb_terms_search_key db2085.codfw.wmnet:3318 object_schema object_name index_name wikidatawiki wb_terms wb_terms_entity_id wikidatawiki wb_terms wb_terms_search_key db2083.codfw.wmnet:3306 object_schema object_name index_name wikidatawiki wb_terms wb_terms_entity_id wikidatawiki wb_terms wb_terms_search_key db2082.codfw.wmnet:3306 object_schema object_name index_name wikidatawiki wb_terms wb_terms_entity_id wikidatawiki wb_terms wb_terms_search_key db2081.codfw.wmnet:3306 object_schema object_name index_name wikidatawiki wb_terms wb_terms_entity_id wikidatawiki wb_terms wb_terms_search_key db2080.codfw.wmnet:3306 object_schema object_name index_name wikidatawiki wb_terms wb_terms_entity_id wikidatawiki wb_terms wb_terms_search_key db2079.codfw.wmnet:3306 object_schema object_name index_name wikidatawiki wb_terms wb_terms_entity_id wikidatawiki wb_terms wb_terms_search_key db2045.codfw.wmnet:3306 object_schema object_name index_name wikidatawiki wb_terms wb_terms_entity_id wikidatawiki wb_terms wb_terms_search_key
All of the sub tasks have been created.
Once all complete we should compare the production vs install schema again and make sure we didn't miss anything before closing this.
wb_terms_entity_id only uses the "old", numeric-only term_entity_id column, hence it probably has been kind of replaced by term_full_entity index.
As mentioned above by @Marostegui it dos not seem used, so I would guess it would be fine to just drop it from production.
But sure, let's re-add to the git repo, and have it removed once we've dropped from the production DBs. This seems a sane way of proceeding to me.
I am not sure if it is worth the time and specially the risk, if wb_terms is going to disappear at some point