Move the wbc_entity_usage table onto a dedicated DB shard
Open, Low, Public

Description

Per T173189 (Investigate separating wbc_entity_usage out to a separate mariadb shard) and the various discussions we had at Wikimania, we want to move the "wbc_entity_usage" table to a separate DB shard.

It will still be possible (and the default) to have this table together with the other wiki tables for smaller installations.
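
A minimal sketch of that default, assuming a hypothetical setting named `entityUsageDatabase` (not an existing Wikibase option) and written in Python purely for illustration: when no dedicated database is configured, the lookup falls back to the wiki's own database, so small installations need no changes.

```python
from typing import Mapping, Optional

# Hypothetical setting name, used only for illustration; it is an assumption,
# not an actual Wikibase configuration option.
ENTITY_USAGE_DB_SETTING = "entityUsageDatabase"


def resolve_entity_usage_db(settings: Mapping[str, str], local_wiki_db: str) -> str:
    """Return the name of the database that should hold wbc_entity_usage.

    Falls back to the wiki's own database when no dedicated one is configured,
    which keeps the table together with the other wiki tables by default.
    """
    dedicated: Optional[str] = settings.get(ENTITY_USAGE_DB_SETTING)
    return dedicated if dedicated else local_wiki_db


# Small installation: nothing configured, so the table stays in the wiki DB.
print(resolve_entity_usage_db({}, "smallwiki"))

# Large client wiki: the table is pointed at a dedicated (hypothetical) shard.
print(resolve_entity_usage_db({ENTITY_USAGE_DB_SETTING: "usage_shard"}, "enwiki"))
```

Treating an empty or missing value as "use the local database" means the code change can ship before any physical move of the table.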

Event Timeline

@hoo Regarding the Wikimedia setup, you should know that our priority right now is to move wikidata to a dedicated server group, which means that from the ops side no other structural change can happen at the same time.

This is still needed (and doing it is blocked on code being written to allow it), but I want to set clear expectations about the timeline for physically moving the tables away. I estimate at least 2 quarters before we have the time, 3 if it depends on purchases (which were not budgeted this year). Code changes, however, have no blockers on our side if they default to the same database, and can start ASAP.

@jcrespo please note that the proposed change is for all wikibase client wikis (that is, almost all wikis). Wikidata itself also has wbc_entity_usage since it is a client of itself, but the proposed change is much more relevant for large client wikis, such as enwiki and commonswiki.

I know. It is only related because the wikidata migration requires moving replication channels, which consumes DBA time, not because it contains wikidata.

As a note, even if the tables are not physically migrated away, if they are made independent they can replicate on a different domain id. That will not be as advantageous as physically moving them away, but in the short term it can allow better parallel replication within every master/s* group until we can do the topology changes.

Revisiting this after some time:
Since we implemented the usage deduplicator to squeeze too many modifiers into one general one, along with several other tunings to make fine-grained usage tracking more robust, it's hard for me to see what we would gain from moving to a dedicated shard at the moment. This might come in handy when data usage becomes too big in the movement, but that level of engagement hasn't happened yet. Moving to a dedicated shard would require a lot of resources.

What I propose here is to wait until all of the tunings have propagated through all cache layers for all pages, which will hopefully happen in the next week, and then take a look at the storage of the tables and decide whether it's going to become problematic. One thing to consider is that we are planning to shrink the logging table on all wikis. For wikidata and commons in particular, that frees up 99% of the table storage (for Wikidata, it's 200GB). That might give us some storage for more useful data (like wbc_entity_usage).

Addshore triaged this task as Low priority. (Edited Nov 6 2018, 12:53 PM)
Addshore subscribed.

This is low from our side.
Unless the DBAs tell us that we have to do this.