
x1 increase in writes results in a large increase of binlog files (over 2000)
Closed, ResolvedPublic3 Estimated Story Points

Description

While cloning between x1 hosts, I noticed that x1 is 3.1TB in size.
Investigating a bit, I noticed we have more than 2000 binlog files (1GB per log) - we keep one month of binlogs.
This means we have over 2TB of logs alone.

Taking a look at the current codfw master, over the last 90 days I've not seen any particular increase in writes that could justify this.
We should probably take a look inside the binlogs and try to find whether something is being written more than it should be.

Event Timeline

Marostegui renamed this task from x1 increase in writes to x1 increase in writes results in a large increase of binlog files (over 2000).Jan 8 2026, 12:23 PM
Marostegui moved this task from Triage to Ready on the DBA board.
Ladsgroup added subscribers: Michael, Urbanecm_WMF.

It's GE writing gigantic blobs:

### UPDATE `cswiki`.`growthexperiments_user_impact`
### WHERE
###   @1=40668 /* INT meta=0 nullable=0 is_null=0 */
###   @2='20260107161804' /* STRING(14) meta=65038 nullable=0 is_null=0 */
###   @3='(somewhat large blob of binary)' /* MEDIUMBLOB/MEDIUMTEXT meta=3 nullable=0 is_null=0 */
### SET
###   @1=40668 /* INT meta=0 nullable=0 is_null=0 */
###   @2='20260107161854' /* STRING(14) meta=65038 nullable=0 is_null=0 */
###   @3='(enormous gigantic blobs)'

I've looked at some large blobs. Thankfully they are not as large as the binlogs (maybe some bug got fixed?). Anyway, one simple issue I found is that dailyArticleViews has a lot of zeros:

"dailyArticleViews":{"<redacted>":{"firstEditDate":"2025-12-31","newestEdit":"202512311xxxx","views":{"2025-12-29":0,"2025-12-30":0,"2025-12-31":8072,"2026-01-01":5485,"2026-01-02":4811,"2026-01-03":3253,"2026-01-04":3368,"2026-01-05":2770,"2026-01-06":2046,"2026-01-07":1758},"viewsCount":31563},"<redacted>":{"imageUrl":"<redacted>","firstEditDate":"2026-01-02","newestEdit":"20260102105xxxx","views":{"2025-11-10":0,"2025-11-11":0,"2025-11-12":0,"2025-11-13":0,"2025-11-14":0,"2025-11-15":0,"2025-11-16":0,"2025-11-17":0,"2025-11-18":0,"2025-11-19":0,"2025-11-20":0,"2025-11-21":0,"2025-11-22":0,"2025-11-23":0,"2025-11-24":0,"2025-11-25":0,"2025-11-26":0,"2025-11-27":0,"2025-11-28":0,"2025-11-29":0,"2025-11-30":0,"2025-12-01":0,"2025-12-02":0,"2025-12-03":0,"2025-12-04":0,"2025-12-05":0,"2025-12-06":0,"2025-12-07":0,"2025-12-08":0,"2025-12-09":0,"2025-12-10":0,"2025-12-11":0,"2025-12-12":0,"2025-12-13":0,"2025-12-14":0,"2025-12-15":0,"2025-12-16":0,"2025-12-17":0,"2025-12-18":0,"2025-12-19":0,"2025-12-20":0,"2025-12-21":0,"2025-12-22":0,"2025-12-23":0,"2025-12-24":0,"2025-12-25":0,"2025-12-26":0,"2025-12-27":0,"2025-12-28":0,"2025-12-29":0,"2025-12-30":0,"2025-12-31":0,"2026-01-01":0,"2026-01-02":938,"2026-01-03":711,"2026-01-04":896,"2026-01-05":916,"2026-01-06":802,"2026-01-07":1000},"viewsCount":5263}

Thanks for investigating! Based on these findings, I'd say a trivial optimization is to stop storing 0-pageview days in the DB and have the client fill in the 0s when rendering the pageview charts. What do you think @Michael @Urbanecm_WMF ?
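To illustrate the proposed optimization, here is a minimal Python sketch (hypothetical names, not the actual GrowthExperiments code, which is PHP): strip zero-view days before writing to the DB, and re-fill them on the client only when rendering the chart.

```python
from datetime import date, timedelta

def strip_zero_days(views):
    """Keep only days with non-zero view counts before writing to the DB."""
    return {day: count for day, count in views.items() if count != 0}

def refill_zero_days(views, start, end):
    """Re-insert zero-count days for rendering, given the chart's date range."""
    filled = {}
    day = start
    while day <= end:
        key = day.isoformat()
        filled[key] = views.get(key, 0)
        day += timedelta(days=1)
    return filled

# What gets stored: zeros dropped, so long runs of 0-days cost nothing.
stored = strip_zero_days({"2026-01-01": 0, "2026-01-02": 938, "2026-01-03": 711})
assert stored == {"2026-01-02": 938, "2026-01-03": 711}

# What the chart renders: zeros restored from the date range.
chart = refill_zero_days(stored, date(2026, 1, 1), date(2026, 1, 3))
assert chart == {"2026-01-01": 0, "2026-01-02": 938, "2026-01-03": 711}
```

The trade-off is that the date range must be known at render time, but for a fixed-window chart that is already the case.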

Thanks for investigating! Based on these findings, I'd say a trivial optimization is to stop storing 0-pageview days in the DB and have the client fill in the 0s when rendering the pageview charts. What do you think @Michael @Urbanecm_WMF ?

Sounds good to me.
But what did we do for this to start now? @Ladsgroup, do you have a way to figure out when these large blobs started being written? If not, that's not a problem, but it would be helpful.

(I'm moving this to Up Next because we skipped this in estimation, with the understanding that we would pick it up anyway since it is a database problem that we're causing.)

Sounds good to me.

Thank you!

But what did we do for this to start now? @Ladsgroup, do you have a way to figure out when these large blobs started being written? If not, that's not a problem, but it would be helpful.

I don't have a way to check it, but one thing I'm sure of is that it started at least a month ago, since the number of binlog files per day hasn't changed across the past 30 days. I can't say what happened before that, since we don't keep binlogs older than 30 days :(

Thanks for investigating! Based on these findings, I'd say a trivial optimization is to stop storing 0-pageview days in the DB and have the client fill in the 0s when rendering the pageview charts. What do you think @Michael @Urbanecm_WMF ?

Is there any ETA to get this shipped?
Thanks!

Mh, that would point to GrowthExperimentsUserImpactUpdater: Support temporary and non-special-homepage accounts as the likely culprit, merged on Monday the 16th of June. This would likely not have increased the size of an individual user-impact blob, but it probably massively increased the number of user-impact blob writes, since they are now written for all those temporary accounts. (cc @kostajh)

And while the table itself should remain relatively small (we only cache data there for 2 days or so), if the binlogs retain all the data written (and deleted?), then it would be unsurprising to see them spiral until they reach the time when they themselves are being pruned.

Fixing those 0-pageviews days to reduce the size of the individual user-impact blobs sounds like a good strategy, but we should keep in mind that it is applying a different optimization than what caused the current issue.

We can also switch x1 to SBR instead of RBR for the time being. But I'd like to see a permanent solution so we don't have to mitigate this from the server side.

We can also switch x1 to SBR instead of RBR for the time being. But I'd like to see a permanent solution so we don't have to mitigate this from the server side.

I've been chatting with @Ladsgroup about this and it indeed won't help much, so we really need the fix to be shipped on the code side.

Mh, that would point to GrowthExperimentsUserImpactUpdater: Support temporary and non-special-homepage accounts as the likely culprit, merged on Monday the 16th of June. This would likely not have increased the size of an individual user-impact blob, but it probably massively increased the number of user-impact blob writes, since they are now written for all those temporary accounts. (cc @kostajh)

Would the rows get cleaned up after the temp account expires? They should. It won't help much with the binlog issue, but it would at least avoid another issue.

Please prioritize mitigating this. Disk writes on x1 are now on average 20 times the disk writes of s1. This is quite high risk and can easily cause large-scale issues if combined with other problems (loss of redundancy, a spike of writes in other areas, etc.)

Please prioritize mitigating this. Disk writes on x1 are now on average 20 times the disk writes of s1. This is quite high risk and can easily cause large-scale issues if combined with other problems (loss of redundancy, a spike of writes in other areas, etc.)

And an example of this is the fact that I have to reclone an x1 host today and it is going to take 3x as long as it should, which means we are 2 hosts down in eqiad for a lot longer than we should be.

Mh, that would point to GrowthExperimentsUserImpactUpdater: Support temporary and non-special-homepage accounts as the likely culprit, merged on Monday the 16th of June. This would likely not have increased the size of an individual user-impact blob, but it probably massively increased the number of user-impact blob writes, since they are now written for all those temporary accounts. (cc @kostajh)

Would the rows get cleaned up after the temp account expires? They should. It won't help much with the binlog issue, but it would at least avoid another issue.

That is probably a separate task.

Fixing those 0-pageviews days to reduce the size of the individual user-impact blobs sounds like a good strategy, but we should keep in mind that it is applying a different optimization than what caused the current issue.

I assume that the Growth team is going to update dailyArticleViews as the solution, instead of changing the parameters around which accounts have their impact data stored?

I assume that the Growth team is going to update dailyArticleViews as the solution, instead of changing the parameters around which accounts have their impact data stored?

I was looking at updating dailyArticleViews and filtering the 0s out as a mitigation rather than a solution when I found we had already implemented 0-filtering for page views in T351898, as a mitigation for binlog size increases. I tried to identify where these 0s are coming from and have failed so far, but will keep investigating. Even if it would be interesting to identify that, I'm not sure this micro-optimization ensures a direct and impactful binlog size decrease. As @Michael mentioned, the problem started when the number of writes increased, rather than when these 0s started appearing (which we don't really know yet) or due to query payload sizes, at least not provably so far.

I was looking at a more aggressive optimization that would consist of reducing the 60 days of data per article (i.e. data points) to only 6, which is the least we need to print a meaningful chart. That's the only usage GE makes of this data that I'm aware of. However, there's a REST API endpoint that's been out for some time and mobile apps are consuming it, so I'm in the process of investigating how much of a breaking change the optimization would be. Any feedback on alternative mitigations is welcome. I have a local WIP patch with the 60-to-6 page-view data points change that I need to refine but could push by tomorrow midday UTC.

A quite plausible scenario is that a bug deployed during that time inadvertently brought back the 0s in the module. I also wouldn't really call it a micro-optimization: checking one example, it cut the size of the module in half.

Change #1235818 had a related patch set uploaded (by Sergio Gimeno; author: Sergio Gimeno):

[mediawiki/extensions/GrowthExperiments@master] DatabaseUserImpactStore: log attempts to save zero pageviews values

https://gerrit.wikimedia.org/r/1235818

Change #1236270 had a related patch set uploaded (by Sergio Gimeno; author: Sergio Gimeno):

[mediawiki/extensions/GrowthExperiments@master] test(DatabaseUserImpactStoreTest): use realistic dailyArticleViews structure

https://gerrit.wikimedia.org/r/1236270

Change #1236270 abandoned by Sergio Gimeno:

[mediawiki/extensions/GrowthExperiments@master] test(DatabaseUserImpactStoreTest): use realistic dailyArticleViews structure

Reason:

unnecessary

https://gerrit.wikimedia.org/r/1236270

@Sgs I found the bug: ExpensiveUserImpact::filterViewCounts() filters based on an array of ['key' => value] pairs, e.g. dates. That works for the dailyTotalViews key, which is fine, but it skips all zeros in dailyArticleViews because it tries to remove zeros from an array of ['key' => array] pairs, which won't work. I'll try to make a patch plus regression tests ASAP.
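A Python sketch of the bug described above (the actual code is PHP in ExpensiveUserImpact::filterViewCounts(); names here are illustrative): a filter written for a flat {date: count} mapping like dailyTotalViews silently keeps all zeros when applied to the nested dailyArticleViews structure, because each top-level value is a non-empty array, which never compares equal to zero.

```python
def filter_view_counts_flat(views):
    # Works for dailyTotalViews: a flat {date: count} mapping.
    return {k: v for k, v in views.items() if v != 0}

daily_total = {"2026-01-01": 0, "2026-01-02": 938}
assert filter_view_counts_flat(daily_total) == {"2026-01-02": 938}

# dailyArticleViews is {title: {..., "views": {date: count}}}. Each value is
# a dict, and a dict is never equal to 0, so nothing is filtered at all:
daily_article = {"Some_article": {"views": {"2026-01-01": 0, "2026-01-02": 938}}}
assert filter_view_counts_flat(daily_article) == daily_article  # zeros survive

# The fix is to descend into each article's 'views' mapping:
def filter_article_views(articles):
    return {
        title: {**data, "views": filter_view_counts_flat(data["views"])}
        for title, data in articles.items()
    }

assert filter_article_views(daily_article) == {
    "Some_article": {"views": {"2026-01-02": 938}}
}
```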

Change #1236352 had a related patch set uploaded (by Ladsgroup; author: Ladsgroup):

[mediawiki/extensions/GrowthExperiments@master] UserImpact: Remove zeros in per-article view stats

https://gerrit.wikimedia.org/r/1236352

@Sgs I found the bug: ExpensiveUserImpact::filterViewCounts() filters based on an array of ['key' => value] pairs, e.g. dates. That works for the dailyTotalViews key, which is fine, but it skips all zeros in dailyArticleViews because it tries to remove zeros from an array of ['key' => array] pairs, which won't work. I'll try to make a patch plus regression tests ASAP.

I've approved the change. Even if it fixes the issue, something is still wrong here. We apparently have two places where we try to do the same thing: removing zeros from the pageview data. We should keep this task open until we figure that out.

Change #1236352 merged by jenkins-bot:

[mediawiki/extensions/GrowthExperiments@master] UserImpact: Remove zeros in per-article view stats

https://gerrit.wikimedia.org/r/1236352

Change #1236387 had a related patch set uploaded (by Ladsgroup; author: Ladsgroup):

[mediawiki/extensions/GrowthExperiments@wmf/1.46.0-wmf.14] UserImpact: Remove zeros in per-article view stats

https://gerrit.wikimedia.org/r/1236387

Change #1236388 had a related patch set uploaded (by Ladsgroup; author: Ladsgroup):

[mediawiki/extensions/GrowthExperiments@wmf/1.46.0-wmf.13] UserImpact: Remove zeros in per-article view stats

https://gerrit.wikimedia.org/r/1236388

Change #1236387 merged by jenkins-bot:

[mediawiki/extensions/GrowthExperiments@wmf/1.46.0-wmf.14] UserImpact: Remove zeros in per-article view stats

https://gerrit.wikimedia.org/r/1236387

Change #1236388 merged by jenkins-bot:

[mediawiki/extensions/GrowthExperiments@wmf/1.46.0-wmf.13] UserImpact: Remove zeros in per-article view stats

https://gerrit.wikimedia.org/r/1236388

Mentioned in SAL (#wikimedia-operations) [2026-02-04T01:25:32Z] <ladsgroup@deploy2002> Started scap sync-world: Backport for [[gerrit:1236387|UserImpact: Remove zeros in per-article view stats (T414080)]], [[gerrit:1236388|UserImpact: Remove zeros in per-article view stats (T414080)]]

Mentioned in SAL (#wikimedia-operations) [2026-02-04T01:29:37Z] <ladsgroup@deploy2002> ladsgroup: Backport for [[gerrit:1236387|UserImpact: Remove zeros in per-article view stats (T414080)]], [[gerrit:1236388|UserImpact: Remove zeros in per-article view stats (T414080)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.

Mentioned in SAL (#wikimedia-operations) [2026-02-04T01:36:10Z] <ladsgroup@deploy2002> Finished scap sync-world: Backport for [[gerrit:1236387|UserImpact: Remove zeros in per-article view stats (T414080)]], [[gerrit:1236388|UserImpact: Remove zeros in per-article view stats (T414080)]] (duration: 10m 38s)

Thanks for the fix @Ladsgroup. Let's see how much impact it has. I think it would still be interesting to log 0s generated by ComputedUserImpactLookup, as filterViewCounts is already a safeguard for something that should not happen.

Change #1235818 merged by jenkins-bot:

[mediawiki/extensions/GrowthExperiments@master] DatabaseUserImpactStore: log attempts to save zero pageviews values

https://gerrit.wikimedia.org/r/1235818

Change #1236748 had a related patch set uploaded (by Urbanecm; author: Sergio Gimeno):

[mediawiki/extensions/GrowthExperiments@wmf/1.46.0-wmf.14] DatabaseUserImpactStore: log attempts to save zero pageviews values

https://gerrit.wikimedia.org/r/1236748

Change #1236749 had a related patch set uploaded (by Urbanecm; author: Sergio Gimeno):

[mediawiki/extensions/GrowthExperiments@wmf/1.46.0-wmf.13] DatabaseUserImpactStore: log attempts to save zero pageviews values

https://gerrit.wikimedia.org/r/1236749

Change #1236749 merged by jenkins-bot:

[mediawiki/extensions/GrowthExperiments@wmf/1.46.0-wmf.13] DatabaseUserImpactStore: log attempts to save zero pageviews values

https://gerrit.wikimedia.org/r/1236749

Change #1236748 merged by jenkins-bot:

[mediawiki/extensions/GrowthExperiments@wmf/1.46.0-wmf.14] DatabaseUserImpactStore: log attempts to save zero pageviews values

https://gerrit.wikimedia.org/r/1236748

Mentioned in SAL (#wikimedia-operations) [2026-02-04T14:16:13Z] <urbanecm@deploy2002> Started scap sync-world: Backport for [[gerrit:1236739|Fix audio transcodes]], [[gerrit:1236749|DatabaseUserImpactStore: log attempts to save zero pageviews values (T414080)]], [[gerrit:1236748|DatabaseUserImpactStore: log attempts to save zero pageviews values (T414080)]], [[gerrit:1236690|IPReputationIPoidDataLookup: Allow returning stale values for 72 hours (T416316)]], [[gerrit:1236689|IPReput

Mentioned in SAL (#wikimedia-operations) [2026-02-04T14:18:26Z] <urbanecm@deploy2002> hartman, kharlan, urbanecm: Backport for [[gerrit:1236739|Fix audio transcodes]], [[gerrit:1236749|DatabaseUserImpactStore: log attempts to save zero pageviews values (T414080)]], [[gerrit:1236748|DatabaseUserImpactStore: log attempts to save zero pageviews values (T414080)]], [[gerrit:1236690|IPReputationIPoidDataLookup: Allow returning stale values for 72 hours (T416316)]], [[gerrit:1236689|IPRe

Mentioned in SAL (#wikimedia-operations) [2026-02-04T14:24:31Z] <urbanecm@deploy2002> Finished scap sync-world: Backport for [[gerrit:1236739|Fix audio transcodes]], [[gerrit:1236749|DatabaseUserImpactStore: log attempts to save zero pageviews values (T414080)]], [[gerrit:1236748|DatabaseUserImpactStore: log attempts to save zero pageviews values (T414080)]], [[gerrit:1236690|IPReputationIPoidDataLookup: Allow returning stale values for 72 hours (T416316)]], [[gerrit:1236689|IPRepu

Change #1236761 had a related patch set uploaded (by Urbanecm; author: Urbanecm):

[mediawiki/extensions/GrowthExperiments@wmf/1.46.0-wmf.14] Revert "DatabaseUserImpactStore: log attempts to save zero pageviews values"

https://gerrit.wikimedia.org/r/1236761

Change #1236762 had a related patch set uploaded (by Urbanecm; author: Urbanecm):

[mediawiki/extensions/GrowthExperiments@wmf/1.46.0-wmf.13] Revert "DatabaseUserImpactStore: log attempts to save zero pageviews values"

https://gerrit.wikimedia.org/r/1236762

Change #1236761 merged by jenkins-bot:

[mediawiki/extensions/GrowthExperiments@wmf/1.46.0-wmf.14] Revert "DatabaseUserImpactStore: log attempts to save zero pageviews values"

https://gerrit.wikimedia.org/r/1236761

Change #1236762 merged by jenkins-bot:

[mediawiki/extensions/GrowthExperiments@wmf/1.46.0-wmf.13] Revert "DatabaseUserImpactStore: log attempts to save zero pageviews values"

https://gerrit.wikimedia.org/r/1236762

Mentioned in SAL (#wikimedia-operations) [2026-02-04T15:10:20Z] <urbanecm@deploy2002> Started scap sync-world: Backport for [[gerrit:1236761|Revert "DatabaseUserImpactStore: log attempts to save zero pageviews values" (T414080)]], [[gerrit:1236762|Revert "DatabaseUserImpactStore: log attempts to save zero pageviews values" (T414080)]]

Mentioned in SAL (#wikimedia-operations) [2026-02-04T15:12:25Z] <urbanecm@deploy2002> urbanecm: Backport for [[gerrit:1236761|Revert "DatabaseUserImpactStore: log attempts to save zero pageviews values" (T414080)]], [[gerrit:1236762|Revert "DatabaseUserImpactStore: log attempts to save zero pageviews values" (T414080)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.

Mentioned in SAL (#wikimedia-operations) [2026-02-04T15:26:40Z] <urbanecm@deploy2002> Finished scap sync-world: Backport for [[gerrit:1236761|Revert "DatabaseUserImpactStore: log attempts to save zero pageviews values" (T414080)]], [[gerrit:1236762|Revert "DatabaseUserImpactStore: log attempts to save zero pageviews values" (T414080)]] (duration: 16m 20s)

I confirm that the values written in the binlogs are much smaller, and even when I decoded a couple of large ones, nothing had zeros in it. I'm not seeing a major drop in binlog sizes yet, which I think is mostly because we use RBR, so the previous value is still written heavily to the binlogs; waiting for it to actually kick in. It's going to take a while.

I confirm that the values written in the binlogs are much smaller, and even when I decoded a couple of large ones, nothing had zeros in it. I'm not seeing a major drop in binlog sizes yet, which I think is mostly because we use RBR, so the previous value is still written heavily to the binlogs; waiting for it to actually kick in. It's going to take a while.

Wasn't the code reverted? https://gerrit.wikimedia.org/r/c/mediawiki/extensions/GrowthExperiments/+/1236762

That bit of logging code helped us figure out what is actually going on:

In RefreshUserImpactJob, we're formatting the user impact before storing it in the database. The change that added this code, Process more articles when fetching page view data, explains that this was done to reduce the number of articles stored in the db:

Reuse a method from UserImpactFormatter to cache only the top 5 articles with page views

However, that formatting code is what is adding back all those zeros:

UserImpactFormatter.php
if ( $date < $data['firstEditDate'] ) {
	// Note this is unreliable for established users, as we look at the user's
	// last 1000 edits to determine firstEditDate. We ignore that issue here.
	$dailyArticleViews[$title]['views'][$date] = 0;
}

See https://gerrit.wikimedia.org/r/plugins/gitiles/mediawiki/extensions/GrowthExperiments/+/refs/heads/master/includes/UserImpact/UserImpactFormatter.php#120
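A minimal Python sketch (hypothetical names, not the PHP patch itself) of the direction of the fix: keep the stored user-impact data free of zero-padding, and apply the date-padding shown in the snippet above only as a presentation step when serving the data, never before writing to the DB.

```python
def format_for_display(raw_views, first_edit_date, chart_dates):
    """Presentation-only step: pad with zeros for dates before the first edit,
    mirroring the UserImpactFormatter logic quoted above. Dates are ISO
    strings, so string comparison matches chronological order."""
    out = dict(raw_views)  # copy; never mutate what would be stored
    for d in chart_dates:
        if d < first_edit_date and d not in out:
            out[d] = 0
    return out

# What gets stored stays compact (no zero days)...
raw = {"2026-01-02": 938, "2026-01-03": 711}
display = format_for_display(
    raw, "2026-01-02", ["2026-01-01", "2026-01-02", "2026-01-03"]
)
assert raw == {"2026-01-02": 938, "2026-01-03": 711}  # storage unchanged
# ...while the chart still shows leading zeros:
assert display["2026-01-01"] == 0
assert display["2026-01-02"] == 938
```

The key design point is the separation: the job that refreshes and stores the impact data should persist the raw form, and formatting runs only on the read path.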

Change #1237873 had a related patch set uploaded (by Sergio Gimeno; author: Sergio Gimeno):

[mediawiki/extensions/GrowthExperiments@master] fix(RefreshUserImpactJob): avoid storing formatted data in db

https://gerrit.wikimedia.org/r/1237873

Change #1238292 had a related patch set uploaded (by Marostegui; author: Marostegui):

[operations/puppet@production] mariadb: Make expire-logs-days configurable in production

https://gerrit.wikimedia.org/r/1238292

Change #1238292 merged by Marostegui:

[operations/puppet@production] mariadb: Make expire-logs-days configurable in production

https://gerrit.wikimedia.org/r/1238292