We have added a new table to track the status of translatable pages. This is working on Translatewiki.net but the table is still not created or used on Wikimedia sites.
To enable this feature,
- Create the table
- Run the syncTranslatableBundleStatus script to populate the table with the current status of translatable pages.
Creating new tables on Wikimedia sites: https://wikitech.wikimedia.org/wiki/Creating_new_tables
Table information
Information regarding table creation can be found in this Google doc and you can review all the options that we considered on the task here T302729: Spike: Explore options for improving page listing database query for Special:PageTranslation.
Table structure
# Copied from https://github.com/wikimedia/mediawiki-extensions-Translate/blob/master/sql/mysql/translate_translatable_bundles.sql CREATE TABLE /*_*/translate_translatable_bundles ( ttb_page_id INT UNSIGNED NOT NULL, ttb_type SMALLINT UNSIGNED NOT NULL, ttb_status SMALLINT UNSIGNED NOT NULL, ttb_sortkey VARBINARY(255) NOT NULL, UNIQUE INDEX ttb_type_sortkey_status ( ttb_type, ttb_sortkey, ttb_status ), PRIMARY KEY(ttb_page_id) ) /*$wgDBTableOptions*/;
This table will contain the status for the latest revision of a translatable bundle.
Here’s a short description of each column:
- ttb_sortkey : A sort key created using Collation::getSortKey() with the page namespace and the page id.
- ttb_page_id : The page Id
- ttb_status : Different possible status for translatable bundles:
- tp:proposed - 1
- tp:active - 2
- tp:outdated - 3
- tp:broken - 4
- ttb_type: Type of translatable bundle. 1 for Translatable pages, 2 for Message bundles.
More information
Should this table be replicated to wiki replicas (does it not contain private data)?
Yes, we need to create the table on all the wikis where the Translate extension is enabled. It contains data about translatable pages and message bundles. It does not contain private data.
Will you be doing cross-joins with the wiki metadata?
We will be doing SQL JOIN with the page table and potentially other page related tables.
Size of the table (number of rows expected).
This table would store 1 row per translatable pages / message bundles. On Meta-Wiki, we currently have 10,000 translatable pages. I'd say we would be looking at around 50,000 rows at max.
Expected growth per year (number of rows).
I expect around 1,500-2,000 records at max.
Expected amount of queries, both writes and reads (per minute, per hour...per day, any of those are ok)
Approximate numbers:
- reads - 7000 per day
- writes - 350 per day
See: https://phabricator.wikimedia.org/T321929#9199246 to understand how we came up with these numbers
Examples of queries that will be using the table
# All translatable pages SELECT * FROM translate_translatable_bundles WHERE type = 'tp'; # All translatable pages with pagination SELECT * FROM translate_translatable_bundles WHERE type = 'tp' AND sortkey > 'x' LIMIT 200; # All outdated translated pages SELECT * FROM translate_translatable_bundles WHERE type = 'tp' AND status = 'outdated'; # All outdated, active, proposed etc translatable pages sorted, and with limit SELECT * FROM translate_translatable_bundles WHERE type = 'tp' AND status IN (...) AND sortkey > 'x' LIMIT 200; # Translatable pages with certain statuses, and with paging in a given namespace SELECT * FROM translate_translatable_bundles WHERE type = 'tp' AND status IN (...) AND sortkey > 'x' LEFT JOIN page ON (page_id = page_id AND page_namespace = N) LIMIT 200; # Translatable pages with a status and paging in a given namespace and where title matches a value SELECT * FROM translate_translatable_bundles WHERE type = 'tp' AND status IN (...) AND sortkey > 'x' LEFT JOIN page ON (page_id = page_id AND page_namespace = N AND page_title LIKE '%name%') LIMIT 200; # Translatable pages with paging in a given namespace and where title matches a value SELECT * FROM translate_translatable_bundles WHERE type = 'tp' AND sortkey > 'x' LEFT JOIN page ON (page_id = page_id AND page_namespace = N AND page_title LIKE '%name%') LIMIT 200; # Operating on a single page. INSERT/REPLACE (type, status, sortkey) = VALUES(...) where page_id = N;
The release plan for the feature (are there specific wikis you'd like to test first etc).
We'd like to do the release in 3 phases:
- Testwiki - Deploy and test on Testwiki, observe for few days.
- MediaWiki - Deploy and test on MediaWiki, observe for few days.
- Rollout to all other wikis using Translate extension. See list here: https://gerrit.wikimedia.org/g/operations/mediawiki-config/+/d7b7b75151b5980db8020c12963c64f68f76e3b8/wmf-config/InitialiseSettings.php#7089
Deployment notes
- Create the table on the respective wiki
- Run the extensions/Translate/scripts/syncTranslatableBundleStatus script to populate the table with data based on existing status of translatable pages.
Current status
We've started creating the tables on various wikis. You can track the progress here: https://phabricator.wikimedia.org/T321929#9333589