Background
Prior to deploying the Page Previews (PP) feature on en- and dewiki, we would like to run an A/B test on those wikis to gauge both user behavior and the effect of PP on fundraising.
Currently, given the user's session ID, the client:
- Divides anonymous users into two buckets, "control" and "on", to determine whether PP should be enabled; we refer to this process as bucketing.
- Divides all users into two buckets, "control" and "on", to determine whether the EventLogging instrumentation should be enabled; we refer to this process as sampling.
By bucketing and sampling users separately, we silently drop much of the data from both buckets, since only sampled users are logged. This shouldn't happen.
To solve this, we'll stop sampling users, i.e. we'll collect all data from users in the on and control buckets. So as not to overwhelm the EventLogging pipeline, we'll introduce a third bucket, "off", for which we collect no data. The control and on buckets should remain equal in size and will be considerably smaller than the off bucket: 0.98:0.01:0.01 (off:control:on).
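The three-way split can be sketched as follows. This is a minimal TypeScript illustration under stated assumptions, not the Popups client code: `hash32` (an FNV-1a-style hash), `getBucket`, and the `groupSize` parameter are hypothetical names chosen for this example, and the real client would take the session ID from mw.user.sessionId().

```typescript
type Bucket = "off" | "control" | "on";

// FNV-1a-style 32-bit hash over UTF-16 code units.
// A stand-in for illustration; the hash the real client uses may differ.
function hash32(input: string): number {
  let h = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    h ^= input.charCodeAt(i);
    // Multiply by the FNV prime 16777619 using shifts, modulo 2^32.
    h = (h + (h << 1) + (h << 4) + (h << 7) + (h << 8) + (h << 24)) >>> 0;
  }
  return h >>> 0;
}

// Maps a session ID to a bucket. groupSize is the size of the on bucket and,
// equally, of the control bucket (e.g. 0.01 each, leaving 0.98 for off).
function getBucket(sessionId: string, groupSize: number): Bucket {
  if (!(groupSize >= 0 && groupSize <= 0.5)) {
    throw new RangeError("groupSize must be between 0 and 0.5");
  }
  const position = hash32(sessionId) / 0x100000000; // in [0, 1)
  if (position < groupSize) {
    return "on";
  }
  if (position < 2 * groupSize) {
    return "control";
  }
  return "off";
}
```

Because the bucket is derived only from the session ID, a user stays in the same bucket for the lifetime of the session. With a groupSize of 0.01 this gives approximately the 0.98:0.01:0.01 split, provided the hash spreads session IDs evenly.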
Acceptance criteria
- On en- and dewiki, anonymous users will be split into three buckets using their session ID (mw.user.sessionId()):
  - on (previews on, gathering data)
  - control (previews off, gathering data)
  - off (previews off, not gathering data)
- If the user falls into a bucket that should be gathering data, then all data is sent to the server.
- The instrumentation still respects DNT.
- For all other wikis that the feature is deployed to, 100% of anonymous users should still receive the PP code, and the EventLogging instrumentation should be disabled.
- The $wgPopupsAnonsEnabledSamplingRate and $wgPopupsSchemaSamplingRate config variables are removed.
- The $wgPopupsAnonsExperimentalGroupSize config variable defines the on/control bucket size.
- Events are not logged for logged-in users.
- There should be a kill switch for EventLogging, so that we can disable EventLogging if we want to enable the feature for a larger bucket size or if too many events are being logged.
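Taken together, the criteria above amount to two small decisions per page view. The sketch below is one possible reading of them, not the actual implementation: the `PageContext` shape and all of its field names, including `eventLoggingDisabled` for the kill switch, are hypothetical.

```typescript
// Hypothetical inputs; none of these names come from the Popups codebase.
interface PageContext {
  isAnon: boolean;               // false for logged-in users
  isExperimentWiki: boolean;     // true on en- and dewiki
  bucket: "off" | "control" | "on";
  doNotTrack: boolean;           // the user has DNT enabled
  eventLoggingDisabled: boolean; // the kill switch
}

// Whether the EventLogging instrumentation should send events.
function shouldLogEvents(c: PageContext): boolean {
  if (c.eventLoggingDisabled || c.doNotTrack) {
    return false;
  }
  if (!c.isAnon || !c.isExperimentWiki) {
    return false;
  }
  // No sampling: every user in a data-gathering bucket is logged.
  return c.bucket === "on" || c.bucket === "control";
}

// Whether previews are on by default for an anonymous user.
function previewsOnByDefaultForAnon(c: PageContext): boolean {
  if (!c.isExperimentWiki) {
    return true; // other wikis: 100% of anonymous users get PP
  }
  return c.bucket === "on";
}
```

Checking the kill switch and DNT first means neither can be overridden by bucket membership.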
Closed Questions
- Should the existing behavior (described in the Background section above) be unaffected on those wikis that the client is currently deployed to?
@phuedx: According to note #2 below, we're going to stop collecting data from other wikis (presumably so that we can collect as much data as possible from en- and dewiki).
Sign off steps
- Clean up config relating to $wgPopupsAnonsEnabledSamplingRate and $wgPopupsSchemaSamplingRate (or create a task to do so)
Notes
- Users in all groups will be able to enable the feature using the footer link (for the on group, the feature will be enabled by default).
- Sampling for other wikis should be zero prior to deployment. Tests on other wikis will be turned off.
  - @phuedx: This should be moved into the deploy task.