- In the databases for English Wikipedia, Wikimedia Commons, Wikidata, and Test Wikipedia (enwiki, commonswiki, wikidatawiki and testwiki), the site_stats table now has more than one row, and the correct number for each column is the sum of that column across all rows.
- Quarry queries that use this table should be updated to use the SUM() of each column. For example, if you previously had:
- SELECT ss_total_edits FROM site_stats;
- change it to either:
- SELECT SUM(ss_total_edits) FROM site_stats;
- SELECT SUM(ss_total_edits) AS ss_total_edits FROM site_stats;
- (This is also safe to do on all other wikis, where the table still has only one row.)
Original Task description:
The site_stats table is under considerable write pressure (you can see the numbers in the performance_schema tables). This is mostly the case on commons/wikidata/enwiki, where the edit count is updated constantly. We used to have the SiteStatsAsyncFactor config, which stored the values in memcached and then saved them from time to time, but it was removed (for good reasons: it was never deployed and did a lot of complex magic for no good reason).
An alternative idea is to simply shard. Instead of one row, keep ten (a configurable number), update one row at random on each write, and when reading the table just sum the values up. The idea is inspired by DBSerialProvider, done by @tstarling in gerrit:767617.
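
To make the sharding idea concrete, here is a minimal sketch in Python against an in-memory SQLite database. It is not the MediaWiki implementation: the table is cut down to two columns, and `NUM_SHARDS`, `create_table`, `record_edit`, and `total_edits` are hypothetical names chosen for illustration. It only shows the pattern described above: each write increments one randomly chosen row, and each read sums across all rows.

```python
import random
import sqlite3

# Assumption: ten rows, matching the "ten (configurable number)" in the task.
NUM_SHARDS = 10


def create_table(conn, num_shards=NUM_SHARDS):
    """Create a simplified site_stats table with one row per shard."""
    conn.execute(
        "CREATE TABLE site_stats ("
        " ss_row_id INTEGER PRIMARY KEY,"
        " ss_total_edits INTEGER NOT NULL DEFAULT 0)"
    )
    conn.executemany(
        "INSERT INTO site_stats (ss_row_id, ss_total_edits) VALUES (?, 0)",
        [(i,) for i in range(1, num_shards + 1)],
    )


def record_edit(conn, num_shards=NUM_SHARDS):
    """Increment the counter on one randomly chosen row."""
    shard = random.randint(1, num_shards)  # inclusive on both ends
    conn.execute(
        "UPDATE site_stats SET ss_total_edits = ss_total_edits + 1"
        " WHERE ss_row_id = ?",
        (shard,),
    )


def total_edits(conn):
    """Read the true counter value: the sum across all rows."""
    (total,) = conn.execute(
        "SELECT SUM(ss_total_edits) FROM site_stats"
    ).fetchone()
    return total


conn = sqlite3.connect(":memory:")
create_table(conn)
for _ in range(1000):
    record_edit(conn)
print(total_edits(conn))
```

Because each write touches only one of the rows, concurrent writers contend on the same row less often than with a single-row table, while readers still recover the exact total through SUM().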