Why
- reduce database lock contention, which would reduce site outages and performance issues
- requested by @Ladsgroup in T365303#11119098
The next bottleneck here is actually updating user_editcount, which has the same problem (deferred update, on the same row). I want to eventually move that column out of the user table, which would also allow us to shard it, e.g. each user would have ten rows being picked at random so they wouldn't compete on the lock for the same row. If anyone is feeling like doing it, that'll help a lot in both commons and wikidata.
What
- split user.user_editcount into a new table called [TBD]
- the columns will be [TBD]
- users with zero edits should have zero rows in the table. no row = assume the user has 0 edits
- this table should use sharding. this reduces lock contention.
- there should be up to 10 rows per user
- a row should be picked randomly for UPDATEs
- the rows should be summed with SUM() to get the total count
- the algorithm for determining how many rows a user should have is [TBD]. the code should check if it needs to insert more rows when [TBD]
- example:
- the migration plan is [TBD]. for example, do we eventually want to delete user_editcount entirely? if so, we would want to write a maintenance script to seed the new table. if not, we can populate the new table as new edits come in, but would need to include code for that in the patch
- the sharding should be behind a feature flag called $wg[TBD] (the site_stats one is called $wgMultiShardSiteStats)
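
To make the sharding idea concrete, here is a rough sketch in Python with SQLite. It is not MediaWiki code. The table name `user_editcount_shard` and the columns `ues_user`, `ues_shard`, `ues_count` are made up for illustration, since the real names are still [TBD]. It also assumes shard rows are created lazily the first time a random shard is hit, which is only one possible answer to the [TBD] row-allocation question.

```python
import random
import sqlite3

# "up to 10 rows per user", per the task description
SHARD_COUNT = 10


def create_table(conn):
    # Hypothetical schema: one row per (user, shard) pair.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS user_editcount_shard ("
        " ues_user INTEGER NOT NULL,"
        " ues_shard INTEGER NOT NULL,"
        " ues_count INTEGER NOT NULL DEFAULT 0,"
        " PRIMARY KEY (ues_user, ues_shard))"
    )


def increment_edit_count(conn, user_id, rng=random):
    # Pick one shard at random so concurrent writers for the same user
    # usually touch different rows instead of queueing on a single row lock.
    shard = rng.randrange(SHARD_COUNT)
    cur = conn.execute(
        "UPDATE user_editcount_shard SET ues_count = ues_count + 1"
        " WHERE ues_user = ? AND ues_shard = ?",
        (user_id, shard),
    )
    if cur.rowcount == 0:
        # Assumption: create the shard row lazily on first use.
        # A real implementation would need an atomic upsert here to be
        # safe when two writers hit a missing shard at the same time.
        conn.execute(
            "INSERT INTO user_editcount_shard (ues_user, ues_shard, ues_count)"
            " VALUES (?, ?, 1)",
            (user_id, shard),
        )
    conn.commit()


def get_edit_count(conn, user_id):
    # No rows means zero edits, so COALESCE turns the NULL sum into 0.
    row = conn.execute(
        "SELECT COALESCE(SUM(ues_count), 0) FROM user_editcount_shard"
        " WHERE ues_user = ?",
        (user_id,),
    ).fetchone()
    return row[0]
```

With this shape, a user who has never edited has no rows at all, and a read is a single SUM() over at most ten rows. The trade-off is that reads get slightly more expensive in exchange for writes that rarely contend on the same row.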