Make it possible to conditionally oversample Edit events
Closed, Resolved (Public)

Description

In mediawiki/extensions/VisualEditor/modules/ve-mw/init/ve.init.mw.trackSubscriber.js, mediawiki/extensions/WikiEditor/includes/WikiEditorHooks.php, and mediawiki/extensions/WikiEditor/modules/ext.wikiEditor.js there is code that samples 6.25% of interactions for logging data to the Edit schema.

For our Understanding First Day work, we want to be able to sample at 100%. Probably the easiest way is to reuse a configuration variable we'll be using elsewhere ($wgWMEUnderstandingFirstDay): if it is set to true, log all interactions instead of just 6.25%.

Event Timeline

When $wgWMEUnderstandingFirstDay is true, do we want to capture 100% of all traffic, or only traffic from accounts that are less than 24h old?

If both $wgWMEUnderstandingFirstDay and PageViews::userIsInCohort( $user ) are true, then we should log 100% of events. Otherwise we should stick with the 6.25% sampling.
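
The rule above can be sketched as a small function. This is only an illustration: `editSamplingRate` and `DEFAULT_SAMPLING_RATE` are hypothetical names, and the two booleans stand in for the value of $wgWMEUnderstandingFirstDay and the result of PageViews::userIsInCohort( $user ).

```javascript
// Hypothetical helper; the names are illustrative and not taken from any patch.
const DEFAULT_SAMPLING_RATE = 0.0625; // 6.25%, i.e. 1 in 16

/**
 * @param {boolean} understandingFirstDay Stand-in for $wgWMEUnderstandingFirstDay
 * @param {boolean} userIsInCohort Stand-in for PageViews::userIsInCohort( $user )
 * @return {number} Fraction of events to log, between 0 and 1
 */
function editSamplingRate( understandingFirstDay, userIsInCohort ) {
	// Both conditions must hold to log everything; otherwise keep the default.
	return understandingFirstDay && userIsInCohort ? 1 : DEFAULT_SAMPLING_RATE;
}

console.log( editSamplingRate( true, true ) ); // 1
console.log( editSamplingRate( true, false ) ); // 0.0625
```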

Change 467542 had a related patch set uploaded (by Catrope; owner: Catrope):
[mediawiki/extensions/WikimediaEvents@master] Make sampling rate for Schema:Edit configurable

https://gerrit.wikimedia.org/r/467542

Change 467543 had a related patch set uploaded (by Catrope; owner: Catrope):
[mediawiki/extensions/VisualEditor@master] Use Schema:Edit sampling rate config from WikimediaEvents

https://gerrit.wikimedia.org/r/467543

Change 467544 had a related patch set uploaded (by Catrope; owner: Catrope):
[mediawiki/extensions/WikiEditor@master] Use Schema:Edit sampling rate config from WikimediaEvents

https://gerrit.wikimedia.org/r/467544

nshahquinn-wmf renamed this task from Add configuration variable for sampling rate on Edit schema to Make it possible to conditionally oversample Edit events. Oct 16 2018, 1:24 AM
nshahquinn-wmf subscribed.

@Catrope's current plan:

  • Migrate VE and WikiEditor to the mw.eventLog.Schema class, which was created after the VE instrumentation code was written
  • Make the sampling rate configurable (like in this WIP change, but as an actual number rather than a weird hacky hex thing)
  • Add oversampling support to mw.eventLog.Schema, which adds isOversampled=true to the event data for oversampled events (i.e. events that bypassed sampling and would not have been logged otherwise)
  • Add code to WikimediaEvents that triggers oversampling in the cases where we need it
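
One way to read the third bullet, as a rough sketch: an event that is in the normal sample is logged unchanged, and only an event that gets through solely because of oversampling is tagged with isOversampled. The function name `prepareEditEvent` and its parameters are hypothetical and do not come from mw.eventLog.Schema or the patches.

```javascript
/**
 * Hypothetical helper illustrating the isOversampled flag.
 *
 * @param {Object} event Event data to log
 * @param {boolean} inSample Whether normal sampling selected this event
 * @param {boolean} oversample Whether oversampling is active for this user
 * @return {Object|null} Event to log, or null if it should be dropped
 */
function prepareEditEvent( event, inSample, oversample ) {
	if ( inSample ) {
		// Would have been logged anyway, so it is not marked as oversampled.
		return Object.assign( {}, event );
	}
	if ( oversample ) {
		// Bypassed sampling: mark it so analysts can filter it out later.
		return Object.assign( {}, event, { isOversampled: true } );
	}
	return null;
}

console.log( prepareEditEvent( { action: 'init' }, false, true ) );
```

Keeping the flag off for events that were in the sample anyway means that filtering out isOversampled events should recover the ordinary 6.25% sample.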

Deskana added subscribers: DLynch, Deskana.

This is great. Thank you!

I've moved this task to "external" to reflect that another team is actively working on it, but I do consider this a shared responsibility, so let me know if you need help or Editing engineering time.

In reply to T206543#4668988 (@Neil_P._Quinn_WMF's summary of the plan above):

@Krinkle tells me that he plans on removing mw.eventLog.Schema, so instead of using that, I will standardize on using mw.eventLog.inSample() for sampling, centralize the sampling rate config variable (based on what MobileFrontend does currently), and add a config var to trigger oversampling.
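
To make the idea concrete, here is a minimal sketch of token-based sampling with a "one in N" population size, where 6.25% corresponds to N = 16. `tokenInSample` is a hypothetical stand-in written for this illustration. It is not the EventLogging implementation of mw.eventLog.inSample(), and the use of the first 8 hex characters of a session token is an assumption.

```javascript
/**
 * Hypothetical stand-in for a session-based sampling check.
 *
 * @param {string} token Hex session token (assumed format)
 * @param {number} populationSize Log one session in this many, e.g. 16 for 6.25%
 * @return {boolean} Whether the session is in the sample
 */
function tokenInSample( token, populationSize ) {
	if ( typeof populationSize !== 'number' || populationSize < 1 ) {
		return false;
	}
	// Interpret the first 8 hex characters as an integer.
	const value = parseInt( token.slice( 0, 8 ), 16 );
	if ( Number.isNaN( value ) ) {
		return false;
	}
	// The same token always gives the same answer, so a session is
	// either fully sampled or not sampled at all.
	return value % populationSize === 0;
}

console.log( tokenInSample( '00000010abcdef', 16 ) ); // true (0x10 = 16)
console.log( tokenInSample( '00000011abcdef', 16 ) ); // false (0x11 = 17)
```

Expressing the rate as a plain number like this keeps it easy to centralize in one config variable that all three extensions read.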

Change 467862 had a related patch set uploaded (by Catrope; owner: Catrope):
[mediawiki/extensions/EventLogging@master] Add a server-side randomTokenMatch() function

https://gerrit.wikimedia.org/r/467862

Change 467863 had a related patch set uploaded (by Catrope; owner: Catrope):
[mediawiki/extensions/MobileFrontend@master] Use Schema:Edit sample rate from WikimediaEvents

https://gerrit.wikimedia.org/r/467863

Change 467884 had a related patch set uploaded (by Catrope; owner: Catrope):
[mediawiki/extensions/WikimediaEvents@master] Add oversampling support for Schema:Edit

https://gerrit.wikimedia.org/r/467884

Change 467885 had a related patch set uploaded (by Catrope; owner: Catrope):
[mediawiki/extensions/WikiEditor@master] Oversample Schema:Edit events when configured to do so

https://gerrit.wikimedia.org/r/467885

Change 467886 had a related patch set uploaded (by Catrope; owner: Catrope):
[mediawiki/extensions/MobileFrontend@master] Oversample Schema:Edit events when configured to do so

https://gerrit.wikimedia.org/r/467886

Change 467887 had a related patch set uploaded (by Catrope; owner: Catrope):
[mediawiki/extensions/VisualEditor@master] Oversample Schema:Edit events when configured to do so

https://gerrit.wikimedia.org/r/467887

In addition to the 9 (sorry!) patches tagged with this task, I've also amended https://gerrit.wikimedia.org/r/c/mediawiki/extensions/WikimediaEvents/+/464432 (for T205759) to oversample Edit events in the relevant cases. I'm now done with the implementation and ready for initial review, but I haven't yet applied all 10 patches together to test that oversampling actually works in all 3 of the extensions that emit Edit events. I'll do that tomorrow, then move this task to the code review column.

Change 467542 merged by jenkins-bot:
[mediawiki/extensions/WikimediaEvents@master] Make sampling rate for Schema:Edit configurable

https://gerrit.wikimedia.org/r/467542

Change 467863 merged by jenkins-bot:
[mediawiki/extensions/MobileFrontend@master] Use Schema:Edit sampling rate from WikimediaEvents

https://gerrit.wikimedia.org/r/467863

I have now tested these patches and confirmed that they correctly oversample new users and mark oversampled events as such. They're ready for code review.

Change 467543 merged by jenkins-bot:
[mediawiki/extensions/VisualEditor@master] Use Schema:Edit sampling rate config from WikimediaEvents

https://gerrit.wikimedia.org/r/467543

Change 467884 merged by jenkins-bot:
[mediawiki/extensions/WikimediaEvents@master] Add oversampling support for Schema:Edit

https://gerrit.wikimedia.org/r/467884

Change 467887 merged by jenkins-bot:
[mediawiki/extensions/VisualEditor@master] Oversample Schema:Edit events when configured to do so

https://gerrit.wikimedia.org/r/467887

Change 467886 merged by jenkins-bot:
[mediawiki/extensions/MobileFrontend@master] Oversample Schema:Edit events when configured to do so

https://gerrit.wikimedia.org/r/467886

Change 467862 merged by jenkins-bot:
[mediawiki/extensions/EventLogging@master] Add a server-side sessionInSample() function

https://gerrit.wikimedia.org/r/467862

Change 467544 merged by jenkins-bot:
[mediawiki/extensions/WikiEditor@master] Use Schema:Edit sampling rate config from WikimediaEvents

https://gerrit.wikimedia.org/r/467544

Change 467885 merged by jenkins-bot:
[mediawiki/extensions/WikiEditor@master] Oversample Schema:Edit events when configured to do so

https://gerrit.wikimedia.org/r/467885