Monitor the growth of CheckUser tables at enwiki and a few other very large wikis
Closed, Resolved (Public)

Description

T253802 was deployed to all of our wikis earlier today (list of included wikis is at T253802#6536344). This will increase database writes to cu_changes a little bit.

I would like to ask the DBA team to monitor the database growth, to make sure the change won't cause an issue when deployed to the rest of the wikis.

The monitoring is to be done at the following wikis:

  • enwiki
  • wikidatawiki
  • commonswiki

Sizes are presented as "Compressed / Uncompressed".

| wiki | 2020-11-03 | 2020-11-10 | 2020-11-17 | 2020-11-24 | 2020-12-01 | 2020-12-08 | 2020-12-15 | 2020-12-22 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| enwiki | 1.3G / 5.4G | 1.3G / 5.6G | 1.3G / 5.7G | 1.3G / 5.7G | 1.4G / 6G | 1.3G / 5.7G | 1.5G / 6.5G | 1.6G / 6.7G |
| commonswiki | 3.6G / 26.7G | 3.7G / 26.8G | 3.6G / 26.5G | 3.5G / 26.1G | 3.5G / 25.5G | 3.3G / 23.9G | 1.8G / 21.4G | 2.7G / 19.2G |
| wikidatawiki | 1.93G / 16.6G | 1.7G / 15.1G | 1.7G / 14.4G | 1.7G / 14.1G | 1.7G / 14G | 1.6G / 13.7G | 1.6G / 13.5G | 1.7G / 13.7G |
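
To make the week-over-week changes easier to spot, the enwiki uncompressed sizes from the table above can be differenced. This is only an illustrative sketch: the dictionary and the `weekly_deltas` helper are hypothetical names, and the figures are copied by hand from the table rather than read from any database.

```python
# Uncompressed cu_changes sizes for enwiki in GB, copied from the table above.
# (Hypothetical constant name; values are the second figure of each cell.)
ENWIKI_UNCOMPRESSED_GB = {
    "2020-11-03": 5.4,
    "2020-11-10": 5.6,
    "2020-11-17": 5.7,
    "2020-11-24": 5.7,
    "2020-12-01": 6.0,
    "2020-12-08": 5.7,
    "2020-12-15": 6.5,
    "2020-12-22": 6.7,
}


def weekly_deltas(sizes):
    """Return (date, change in GB since the previous sample) pairs."""
    dates = sorted(sizes)  # ISO dates sort chronologically as strings
    return [
        (cur, round(sizes[cur] - sizes[prev], 1))
        for prev, cur in zip(dates, dates[1:])
    ]


for day, delta in weekly_deltas(ENWIKI_UNCOMPRESSED_GB):
    print(f"{day}: {delta:+.1f}G")
```

The largest single step in this series is the one ending 2020-12-15, which follows a dip in the 2020-12-08 sample.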

Event Timeline

Marostegui triaged this task as Medium priority.
Marostegui moved this task from Triage to In progress on the DBA board.
Marostegui updated the task description.

I want to monitor the enwiki size for two more weeks, as there was a big increase from one week to the next. Let's see if that's a trend.

Ping: 2020-12-08 is missing in the table above.

I was on holiday. I will add it retroactively.

Going to add another week of checking for enwiki, as it grew a bit too much over the past week.

So enwiki seems a bit more stable over the last week, but it has grown by around 1.3G in 49 days. At that rate we are looking at around 9.1G per year.
The current size of the table is 7GB, so that is considerable growth for just one year, but still not too bad given that enwiki is one of the busiest wikis.
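
The extrapolation above can be reproduced from the first and last enwiki samples in the table (5.4G on 2020-11-03, 6.7G on 2020-12-22, uncompressed). The sketch below is a plain linear projection; `yearly_growth_gb` is a hypothetical helper, and the projection assumes the observed rate simply continues, which ignores any purging of old rows.

```python
from datetime import date


def yearly_growth_gb(start_gb, end_gb, start_day, end_day):
    """Linearly scale the growth observed between two samples to 365 days."""
    days = (end_day - start_day).days
    return (end_gb - start_gb) / days * 365


# First and last enwiki uncompressed samples from the table (assumed inputs).
rate = yearly_growth_gb(5.4, 6.7, date(2020, 11, 3), date(2020, 12, 22))
print(f"{rate:.1f}G per year")
```

Scaling to a full 365 days gives a figure somewhat above the 9.1G quoted, which corresponds to seven 49-day periods (343 days).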

Waiting for the s4 backup to finish before completing the stats.

To confirm: T253802#6185761 means that we won't be storing things past 90 days, right? @Urbanecm do you happen to know if those entries will get purged once they reach those 90 days?

Yes, the rows will be purged via https://gerrit.wikimedia.org/g/operations/puppet/+/44b7de4fd6645d81f093624f287d76729267ded5/modules/profile/manifests/mediawiki/maintenance/purge_checkuser.pp (the MediaWiki script is at https://github.com/wikimedia/mediawiki-extensions-CheckUser/blob/master/maintenance/purgeOldData.php, if interested).

Let me know if you have more questions :).

Excellent, thank you for the fast response

Marostegui updated the task description.

Closing this as everything looks stable