Page MenuHomePhabricator

Fix and enforce table prefix usage in columns and indexes in core
Open, Needs TriagePublic

Description

One good thing about having abstract schema is that you can spot issues much easier. One of the database conventions is to have table prefix for all columns and indexes (e.g. logging table has prefix of log_, all columns and index names should start with log_).

Given that 80% of tables are migrated to abstract schema, I quickly wrote this python script to check:

import json
with open('/var/lib/mediawiki/maintenance/tables.json', 'r') as f:
    tables = json.loads(f.read())
for table in tables:
    prefixes = []
    for column in table.get('columns', []):
        if 'name' in column:
            prefixes.append(column['name'].split('_')[0])
    for index in table.get('indexes', []):
        if 'name' in index:
            prefixes.append(index['name'].split('_')[0])
    if len(set(prefixes)) != 1:
        print(table['name'], prefixes)

And the result is these tables:

  • site_identifiers ['si', 'si', 'si', 'site', 'site']
  • user_properties ['up', 'up', 'up', 'user']
  • change_tag ['ct', 'ct', 'ct', 'ct', 'ct', 'ct', 'change', 'change', 'change', 'change']
  • sites ['site', 'site', 'site', 'site', 'site', 'site', 'site', 'site', 'site', 'site', 'site', 'sites', 'sites', 'sites', 'sites', 'sites', 'sites', 'sites', 'sites']
  • user_newtalk ['user', 'user', 'user', 'un', 'un']
  • revision_actor_temp ['revactor', 'revactor', 'revactor', 'revactor', 'revactor', 'actor', 'page']
  • page ['page', 'page', 'page', ' page', 'page', 'page', 'page', 'page', 'page', 'page', 'page', 'page', 'page', 'name', 'page', 'page', 'page']
  • objectcache ['keyname', 'value', 'exptime', 'exptime']

We should fix these and then easily add a unit test to avoid new cases being introduced in future

Event Timeline

Reedy renamed this task from Fix and enforce table prefix usage in columns and idnexes in core to Fix and enforce table prefix usage in columns and indexes in core.Dec 13 2020, 3:48 PM

Change 649414 had a related patch set uploaded (by Ammarpad; owner: Ammarpad):
[mediawiki/core@master] Rename user_properties index

https://gerrit.wikimedia.org/r/649414

Change 649487 had a related patch set uploaded (by Ammarpad; owner: Ammarpad):
[mediawiki/core@master] Rename revision_actor_temp indexes

https://gerrit.wikimedia.org/r/649487

Change 649488 had a related patch set uploaded (by Ammarpad; owner: Ammarpad):
[mediawiki/core@master] Rename all sites indexes

https://gerrit.wikimedia.org/r/649488

Change 649414 merged by jenkins-bot:
[mediawiki/core@master] Rename user_properties index

https://gerrit.wikimedia.org/r/649414

Change 649488 merged by jenkins-bot:
[mediawiki/core@master] Rename all sites indexes

https://gerrit.wikimedia.org/r/649488

Change 648817 had a related patch set uploaded (by Ammarpad; owner: Ladsgroup):
[mediawiki/core@master] Rename four logging indexes to have log_ prefix

https://gerrit.wikimedia.org/r/648817

Change 648817 merged by jenkins-bot:
[mediawiki/core@master] Rename four logging indexes to have log_ prefix

https://gerrit.wikimedia.org/r/648817

Change 651160 had a related patch set uploaded (by Ammarpad; owner: Ammarpad):
[mediawiki/core@master] Rename change_tag indexes to have ct_ prefix

https://gerrit.wikimedia.org/r/651160

I wonder whether for user_newtalk, we should rename the columns instead to have the un_ prefix. Prefixing the indexes with user_ will give possibility of collision with the user table indexes in rdbms where the indexes are global. user table of course has much greater claim to this prefix.

Yeah, it's complicated, user prefix should be for user table. OTOH, renaming columns is a lot of work (migration of data, places it's used in core, etc.) but not impossible. My suggestion would be to add it to list of exceptions and create a ticket for the long term work (and link it in the exception list). How does that sound?

Looks OK. That would be better.

Change 651176 had a related patch set uploaded (by Ammarpad; owner: Ammarpad):
[mediawiki/core@master] Rename site_identifiers indexes to have si_ prefix

https://gerrit.wikimedia.org/r/651176

Change 653086 had a related patch set uploaded (by Umherirrender; owner: Umherirrender):
[mediawiki/extensions/ArticleFeedbackv5@master] Change name of log_page_time index to prepare for 1.36

https://gerrit.wikimedia.org/r/653086

Change 653086 merged by jenkins-bot:
[mediawiki/extensions/ArticleFeedbackv5@master] Remove index hint for renamed page_time

https://gerrit.wikimedia.org/r/653086

Change 653432 had a related patch set uploaded (by Ammarpad; owner: Ammarpad):
[mediawiki/core@master] Add test to assert uniform table prefix usage in abstract schema

https://gerrit.wikimedia.org/r/653432

Change 653432 merged by jenkins-bot:
[mediawiki/core@master] Add test to assert uniform table prefix usage in abstract schema

https://gerrit.wikimedia.org/r/653432

Change 651176 merged by jenkins-bot:
[mediawiki/core@master] Rename site_identifiers indexes to have si_ prefix

https://gerrit.wikimedia.org/r/651176

Change 660450 had a related patch set uploaded (by Ammarpad; owner: Ammarpad):
[mediawiki/extensions/FlaggedRevs@master] Check use of 'change_tag' index that is being renamed

https://gerrit.wikimedia.org/r/660450

Change 660450 merged by jenkins-bot:
[mediawiki/extensions/FlaggedRevs@master] Check use of 'change_tag' index that is being renamed

https://gerrit.wikimedia.org/r/660450

Change 667358 had a related patch set uploaded (by Ladsgroup; owner: Ladsgroup):
[mediawiki/core@master] Rename new_name_timestamp on recentchanges to rc_new_name_timestamp

https://gerrit.wikimedia.org/r/667358

Change 667358 merged by jenkins-bot:
[mediawiki/core@master] Rename new_name_timestamp on recentchanges to rc_new_name_timestamp

https://gerrit.wikimedia.org/r/667358

Change 673832 had a related patch set uploaded (by Ammarpad; owner: Ammarpad):
[mediawiki/extensions/FlaggedRevs@master] Check page_timestamp index forcing to prepare for renaming

https://gerrit.wikimedia.org/r/673832

Change 673833 had a related patch set uploaded (by Ammarpad; owner: Ammarpad):
[mediawiki/extensions/AbuseFilter@master] Check forcing of page_timestamp revision index

https://gerrit.wikimedia.org/r/673833

Change 673995 had a related patch set uploaded (by Ammarpad; owner: Ammarpad):
[mediawiki/core@master] Prepare for renaming page_timestamp revision index

https://gerrit.wikimedia.org/r/673995

Ammarpad updated the task description. (Show Details)

Change 674138 had a related patch set uploaded (by Ammarpad; owner: Ammarpad):
[mediawiki/core@master] Rename page_timestamp revision index

https://gerrit.wikimedia.org/r/674138

Change 674813 had a related patch set uploaded (by Ammarpad; author: Ammarpad):
[mediawiki/extensions/FlaggedRevs@master] Check name_title index forcing to prepare for renaming

https://gerrit.wikimedia.org/r/674813

Change 674817 had a related patch set uploaded (by Ammarpad; author: Ammarpad):
[mediawiki/core@master] Check forcing of 'name_title' index to prepare for renaming

https://gerrit.wikimedia.org/r/674817

Change 674818 had a related patch set uploaded (by Ammarpad; author: Ammarpad):
[mediawiki/core@master] Rename name_title index to have page_ prefix

https://gerrit.wikimedia.org/r/674818