The api_feature_usage table for the ApiFeatureUsage extension was proposed in https://gerrit.wikimedia.org/r/1020385 as part of T313731. It is a single global table covering all wikis. Use of the table is gated behind a feature flag (changing $wgApiFeatureUsageQueryEngineConf); the default remains the Cirrus search engine.
Should this table be replicated to wiki replicas (does it not contain private data)?
No, in order to reduce the risk of needing to suppress entries due to requests with "User-Agent: <doxxing info>". The public interfaces (the API and the special page) require specifying an agent when enumerating entries. However, matching the current Elastic engine's behavior, it is only a *prefix* search; that should probably be changed to an exact match, to avoid a prefix becoming a "distribution channel". That might be slightly annoying for people maintaining multiple bots when checking whether any of them use deprecated features.
Will you be doing cross-joins with the wiki metadata?
No
Size of the table (number of rows expected).
~6 million, assuming 62,000 rows for each day of retention and a 90 day retention period
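The row estimate above is just the daily row count multiplied by the retention window:

```python
# Rough table-size arithmetic from the figures above.
rows_per_day = 62_000
retention_days = 90
total_rows = rows_per_day * retention_days
print(total_rows)  # 5580000, i.e. ~6 million
```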
Expected growth per year (number of rows).
Little continuous growth is expected. However, the size can increase or decrease depending on how many API features are deprecated at once and how popular they are.
Expected amount of queries, both writes and reads (per minute, per hour...per day, any of those are ok).
Reads will be negligible. Writes should be less than 27 queries/sec, assuming the increments over a day (e.g. 27 million) are spread out over the (agent,feature) counters (e.g. 64,000). This rounds up ~421 hits/day per counter to 500. In reality, many of those tuples will only get the initial increment/initialization (e.g. 32K out of 64K), meaning that adaptive sampling would reduce the write rate further.
Going into a bit more detail: for each (agent,feature) counter, it takes ~25 writes for the first 500 hits given the sampling approach. The first 5K hits take ~50 writes, and the first 50K take ~100 writes.
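The curve above (~25 writes for the first 500 hits, with writes roughly doubling per decade of hits) is consistent with a write probability that decays as a power law in the hit count. The following is a purely illustrative sketch of that kind of adaptive sampling, not the extension's actual code; the exponent and function names are assumptions:

```python
import random

# Illustrative adaptive-sampling counter (NOT the extension's actual code).
# On the n-th hit, issue a DB write with probability p(n) = n ** -ALPHA and,
# when writing, add round(1 / p(n)) so the stored total stays roughly unbiased.
ALPHA = 0.7  # assumed; chosen so expected writes roughly double per decade of hits

def simulate_counter(hits, seed=0):
    rng = random.Random(seed)
    stored = 0  # value accumulated in the api_feature_usage row
    writes = 0  # number of write queries actually issued
    for n in range(1, hits + 1):
        p = min(1.0, n ** -ALPHA)
        if rng.random() < p:
            stored += round(1 / p)
            writes += 1
    return writes, stored

writes, stored = simulate_counter(500)
print(writes, stored)  # writes is on the order of 20-25; stored approximates 500
```

The point of the sketch is only that sublinear write growth per counter is what keeps the aggregate write rate bounded even when one agent produces a spike of hits.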
Examples of queries that will be using the table.
SELECT afu_date FROM `api_feature_usage` WHERE (afu_date >= '1') ORDER BY afu_date ASC LIMIT 1
SELECT afu_date,afu_feature,SUM(afu_hits) AS `hits` FROM `api_feature_usage` WHERE (afu_agent LIKE 'testing-bot%' ESCAPE '`') AND (afu_date >= '1') GROUP BY afu_date,afu_feature ORDER BY afu_date ASC,afu_feature ASC
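As a concrete illustration of the exact-match alternative discussed above, the second query can use `afu_agent = ?` instead of `LIKE 'prefix%'`, so a shared prefix no longer exposes other agents' entries. The schema below is a minimal stand-in inferred from the example queries; the real table definition may differ:

```python
import sqlite3

# Minimal stand-in schema inferred from the example queries above;
# the real api_feature_usage definition may differ.
db = sqlite3.connect(":memory:")
db.execute("""
    CREATE TABLE api_feature_usage (
        afu_date    TEXT,
        afu_feature TEXT,
        afu_agent   TEXT,
        afu_hits    INTEGER
    )
""")
db.executemany(
    "INSERT INTO api_feature_usage VALUES (?, ?, ?, ?)",
    [
        ("20240501", "action=tokens", "testing-bot", 42),
        ("20240501", "action=tokens", "testing-bot-fork", 7),  # shares the prefix
    ],
)

# Exact match: only the named agent's rows are returned, so the prefix
# "testing-bot" cannot be used to enumerate the fork's entries.
rows = db.execute(
    """SELECT afu_date, afu_feature, SUM(afu_hits) AS hits
       FROM api_feature_usage
       WHERE afu_agent = ? AND afu_date >= ?
       GROUP BY afu_date, afu_feature
       ORDER BY afu_date ASC, afu_feature ASC""",
    ("testing-bot", "1"),
).fetchall()
print(rows)  # [('20240501', 'action=tokens', 42)]
```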
The release plan for the feature (are there specific wikis you'd like to test first etc).
Deploy to testwiki, test2wiki, and mediawikiwiki first. Observe table rows and logs. Use ab (ApacheBench) to verify that spikes of API queries, from one agent, using a deprecated feature, do not affect the MariaDB com_xxx graphs. Deploy to commonswiki after a few days, to wikidatawiki a week later, then everywhere a week after that.