[SessionLength] Change sampling rate to 10%
Closed, Resolved, Public

Description

The current implementation of SessionLength has a sampling rate of 1/100.
Now that EventGate has been scaled up, it can support a sampling rate of 1/10 across all wikis. We want to increase the sampling rate so that the data is more accurate for smaller wikis.

Event Timeline

Change 668553 had a related patch set uploaded (by Mholloway; owner: Michael Holloway):
[operations/mediawiki-config@master] WikimediaEvents: Bump session_tick sampling rate to 5%

https://gerrit.wikimedia.org/r/668553

Oops, assigned to @mforns. I teed up a patch for you. ;)

Wow @Mholloway, you're ninja-fast!
Thanks for the patch :-)

Change 668750 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/deployment-charts@master] eventgate-analytics-external - Bump replicas to 6 for increase in mediawiki.client.session_tick

https://gerrit.wikimedia.org/r/668750

Change 668750 merged by jenkins-bot:
[operations/deployment-charts@master] eventgate-analytics-external - Bump replicas to 6 for increase in mediawiki.client.session_tick

https://gerrit.wikimedia.org/r/668750

Mentioned in SAL (#wikimedia-analytics) [2021-03-08T14:52:01Z] <ottomata> altered topics (eqiad|codfw).mediawiki.client.session_tick to have 2 partitions - T276502

mforns renamed this task from [SessionLength] Change sampling rate to 5% to [SessionLength] Change sampling rate to 10%. (Mar 8 2021, 3:02 PM)
mforns triaged this task as High priority.
mforns updated the task description.

@kzimmerman
I talked with @Ottomata last Friday about scaling up EventGate, and we decided to do it now.
Based on our discussion in the previous meeting, I think that is OK with you.
So, @Ottomata has already doubled the number of EventGate instances, and we can move on to increasing the session_tick sampling rate to 10%, if we want.
I went ahead and changed the title of this task to 10% (and modified the corresponding code change), but LMK if there's any problem with that!
If we start collecting events at 10%, we'll already be able to report with more accuracy on smaller wikis.

However, I believe we should continue to work on sampling rate by wiki, to make our data collection more efficient in terms of event bandwidth.
Sampling rate by wiki will allow us to be more accurate on smaller wikis (we could go to 100% there), collect far fewer events overall (e.g. go to 1% on huge wikis), and consume far fewer resources.
@Ottomata mentioned that it is already possible to configure sampling rate by wiki!
So the remaining part would be to pass the sampling rate information as a field of the schema, and use it in the back-end computation to compensate for sampling bias.
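As a rough illustration of that compensation step, here is a minimal Python sketch, not the actual pipeline code. It assumes each event carries a hypothetical `sample_rate` field (the rate at which its session was sampled) alongside a `wiki` field, and it weights every sampled tick by the inverse of that rate to estimate the unsampled total. The function name, field names, and wiki names are all made up for the example.

```python
from collections import defaultdict


def estimate_tick_counts(events):
    """Estimate unsampled tick totals per wiki from sampled events.

    Each event is a dict with a "wiki" key and a "sample_rate" key.
    Both field names are assumptions for this sketch, not the real schema.
    """
    totals = defaultdict(float)
    for event in events:
        rate = event["sample_rate"]
        if not 0 < rate <= 1:
            raise ValueError(f"invalid sample_rate: {rate!r}")
        # A tick sampled at rate r stands in for roughly 1/r real ticks.
        totals[event["wiki"]] += 1.0 / rate
    return dict(totals)


# Illustrative data: a huge wiki sampled at 1%, a small wiki at 100%.
sampled_events = (
    [{"wiki": "hugewiki", "sample_rate": 0.01}] * 3
    + [{"wiki": "smallwiki", "sample_rate": 1.0}] * 5
)
estimates = estimate_tick_counts(sampled_events)
```

With this weighting, wikis sampled at different rates end up on the same scale, so the per-wiki rate can be tuned for bandwidth without skewing the totals.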

Anyway, please LMK if it's OK to move forward and switch to 10%.

@mforns This is great news! Yes, please move forward with changing the sampling rate to 10%.

I agree that in the long run we still want to be able to specify the sampling rate by wiki.

Change 668553 merged by jenkins-bot:
[operations/mediawiki-config@master] WikimediaEvents: Bump session_tick sampling rate to 10%

https://gerrit.wikimedia.org/r/668553