One benefit of having an abstract schema is that issues are much easier to spot. One of our database conventions is that all columns and indexes of a table share the table's prefix (e.g. the logging table has the prefix log_, so all of its column and index names should start with log_).
Given that 80% of tables have been migrated to the abstract schema, I quickly wrote this Python script to check:
import json

with open('/var/lib/mediawiki/maintenance/tables.json', 'r') as f:
    tables = json.loads(f.read())

for table in tables:
    prefixes = []
    for column in table.get('columns', []):
        if 'name' in column:
            prefixes.append(column['name'].split('_')[0])
    for index in table.get('indexes', []):
        if 'name' in index:
            prefixes.append(index['name'].split('_')[0])
    if len(set(prefixes)) != 1:
        print(table['name'], prefixes)
And the result is these tables:
- site_identifiers ['si', 'si', 'si', 'site', 'site']
- user_properties ['up', 'up', 'up', 'user']
- sites ['site', 'site', 'site', 'site', 'site', 'site', 'site', 'site', 'site', 'site', 'site', 'sites', 'sites', 'sites', 'sites', 'sites', 'sites', 'sites', 'sites']
- user_newtalk ['user', 'user', 'user', 'un', 'un']
- revision_actor_temp ['revactor', 'revactor', 'revactor', 'revactor', 'revactor', 'actor', 'page']
- change_tag ['ct', 'ct', 'ct', 'ct', 'ct', 'ct', 'change', 'change', 'change', 'change']
- page ['page', 'page', 'page', ' page', 'page', 'page', 'page', 'page', 'page', 'page', 'page', 'page', 'page', 'name', 'page', 'page', 'page']
- objectcache ['keyname', 'value', 'exptime', 'exptime']
We should fix these and then add a unit test to prevent new cases from being introduced in the future.
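As a rough starting point for such a test, the check from the script can be wrapped in a function that returns the offending tables instead of printing them. This is only a sketch: the function name `find_prefix_violations`, the `known_exceptions` parameter, and the sample data (including the index name `user_properties_property`) are made up for illustration, and it assumes the same `tables.json` shape the script above reads. Unlike the script, it skips tables that have no named columns or indexes.

```python
def find_prefix_violations(tables, known_exceptions=()):
    """Return {table name: sorted distinct prefixes} for tables mixing prefixes.

    `known_exceptions` is a hypothetical allowlist of table names to skip,
    e.g. tables that cannot be renamed yet.
    """
    violations = {}
    for table in tables:
        if table['name'] in known_exceptions:
            continue
        # Collect column and index names, ignoring entries without a name.
        names = [
            item['name']
            for key in ('columns', 'indexes')
            for item in table.get(key, [])
            if 'name' in item
        ]
        # The prefix is everything before the first underscore.
        prefixes = {name.split('_')[0] for name in names}
        if len(prefixes) > 1:
            violations[table['name']] = sorted(prefixes)
    return violations


# Made-up sample data, not the real schema.
sample = [
    {
        'name': 'logging',
        'columns': [{'name': 'log_id'}],
        'indexes': [{'name': 'log_type_time'}],
    },
    {
        'name': 'user_properties',
        'columns': [{'name': 'up_user'}],
        'indexes': [{'name': 'user_properties_property'}],
    },
]

assert find_prefix_violations(sample) == {'user_properties': ['up', 'user']}
```

A test could load the real tables.json, pass the tables listed above as `known_exceptions` until they are fixed, and assert that the result is empty.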