This is the task for the schema change project documented on the wiki at User:Brion VIBBER/Compacting the revision table round 2. Part of the description is copied below.
Following ongoing discussion in ArchCom and at WikiDev17 about performance, future requirements, and future-proofing for table size, a major overhaul of the revision table is proposed, combining the following improvements:
- Normalization of frequently duplicated data into separate tables, reducing the duplicated strings to integer keys
- Separation of content-specific metadata from general revision metadata, to support:
  - Multi-content revisions, which allow storing multiple content blobs per revision. This is not related to compaction, but is of great interest for the structured data additions planned for multimedia and articles
- General reduction in revision table width and on-disk size, which will make future schema changes easier
- Avoiding inconsistencies in live index deployments
  - Ideally all indexes should fit on all servers, making it easier to switch database backends around in production
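
As a rough illustration of the normalization idea (duplicated strings replaced by integer keys), here is a minimal Python sketch using an in-memory SQLite database. The two-column `revision` table and the `comment` lookup table are simplified assumptions for illustration, not the actual MediaWiki schema, and `normalize_comments` and `demo` are hypothetical helper names.

```python
import sqlite3


def normalize_comments(conn):
    """Move duplicated comment strings into a lookup table keyed by integer.

    Assumes a simplified `revision` table with a `rev_comment` text column.
    """
    cur = conn.cursor()
    # Lookup table: one row per distinct comment string.
    cur.execute(
        "CREATE TABLE comment ("
        " comment_id INTEGER PRIMARY KEY,"
        " comment_text TEXT NOT NULL UNIQUE)"
    )
    # New integer key column on the wide table.
    cur.execute("ALTER TABLE revision ADD COLUMN rev_comment_id INTEGER")
    # Store each distinct string exactly once.
    cur.execute(
        "INSERT INTO comment (comment_text)"
        " SELECT DISTINCT rev_comment FROM revision"
    )
    # Point every revision at the matching lookup row.
    cur.execute(
        "UPDATE revision SET rev_comment_id = ("
        " SELECT comment_id FROM comment"
        " WHERE comment_text = revision.rev_comment)"
    )
    conn.commit()


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE revision ("
        " rev_id INTEGER PRIMARY KEY,"
        " rev_comment TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO revision (rev_comment) VALUES (?)",
        [("fix typo",), ("revert vandalism",), ("fix typo",), ("fix typo",)],
    )
    normalize_comments(conn)
    distinct = conn.execute("SELECT COUNT(*) FROM comment").fetchone()[0]
    rows = conn.execute(
        "SELECT r.rev_id, c.comment_text FROM revision r"
        " JOIN comment c ON c.comment_id = r.rev_comment_id"
        " ORDER BY r.rev_id"
    ).fetchall()
    return distinct, rows
```

In this sketch the old `rev_comment` column is left in place after the backfill; dropping it would be a separate step once all readers use the new key, which mirrors the drop/add pairing in the changes listed for this task.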
The specific changes and associated Wikimedia production tasks involved here are:
- Dropping rev_comment, adding rev_comment_id. (T166733, T215466)
  - Ready to go!
- Dropping rev_user and rev_user_text, adding rev_actor. (T188327, T215466)
  - Ready to go!
- Dropping rev_text_id, rev_content_model, and rev_content_format. (T238958, T238966)
  - Ready to go!
- Fixing the type of rev_timestamp on old wikis to match tables.sql. (T298560, P8433)
  - Ready to go!