In several extensions (RelatedArticles, Page-Previews) we have moved to sampling logging by buckets determined from a stable session id, in order to consistently log all events in a user session. For context, see for example T167236: userSessionToken in RelatedArticles schema does not seem to survive beyond one pageview.
Things we duplicate across projects right now for sampling by user session:
- Checking if window.navigator.sendBeacon is available
- Calling mw.experiments with mw.user.sessionId and checking the bucket to turn logging on or off
- Setting the logger to mw.track or a noop, or the schema sampling rate to 1 or 0, depending on the previous check
This is a bit of boilerplate. Nothing too serious, but if this type of logging is going to be common, it would be worth abstracting it.
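The three duplicated steps above can be sketched roughly as follows. This is a minimal illustration, not the code from any of the extensions: createSessionSampledLogger, the '-sampling' experiment name and the 'on'/'off' bucket names are made up for this example, and mw and the navigator object are passed in as parameters so the snippet also runs outside a browser. It assumes mw.experiments.getBucket takes an experiment definition ({ name, enabled, buckets }) and a token.

```javascript
// Hypothetical helper showing the boilerplate each extension repeats.
// `mw` and `nav` would be the globals mw and window.navigator in a real extension.
function createSessionSampledLogger( mw, nav, schemaName, samplingRate ) {
	var noop = function () {};

	// Step 1: check that sendBeacon is available.
	if ( !nav || typeof nav.sendBeacon !== 'function' ) {
		return noop;
	}

	// Step 2: bucket the session. The same session id always lands in the
	// same bucket, so a session is either fully logged or not logged at all.
	var bucket = mw.experiments.getBucket( {
		name: schemaName + '-sampling', // hypothetical experiment name
		enabled: true,
		buckets: { off: 1 - samplingRate, on: samplingRate }
	}, mw.user.sessionId() );

	// Step 3: pick the real logger or a noop depending on the bucket.
	if ( bucket !== 'on' ) {
		return noop;
	}
	return function ( eventData ) {
		mw.track( 'event.' + schemaName, eventData );
	};
}
```

An extension would call this once, for instance createSessionSampledLogger( mw, window.navigator, 'Popups', 0.01 ), and then use the returned function for every event in the session.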
What
Explore different options for a single API shared across MediaWiki extensions to log events sampled by user session.
Some must-haves
- Configurable sampling rate
- Initial events will be delivered (to log events like page/feature loaded)
- No PHP conditional registration of ResourceLoader modules and async loading of such modules in the client
AC
- List proposals of APIs w/ pros and cons
- Discuss
- Create follow up tasks to implement the API, and update existing code to use it
Proposals
1. Use Extension:WikimediaEvents with new event buses
- Unconditionally log with mw.track
- Implement two more buses in WikimediaEvents that take a configuration param:
  - sampled_event.session.<Schema> (config: Config, eventData: any)
    - buckets/samples based on mw.user.sessionId
    - do we need mw.experiments.getBucket({ name }) in the config or would the schema name be enough to derive a unique name?
  - sampled_event.page.<Schema> (config: Config, eventData: any)
    - samples based on mw.eventLog.inSample like mw.eventLog.Schema
  - Where Config: { samplingRate: BoundedNumber<0, 1> }
Examples
// Log event unconditionally
mw.track( 'event.Popups', { action: 'page_loaded' } )

// Log event sampled by page view
mw.track( 'sampled_event.page.Popups', { samplingRate: 0.01 }, { action: 'page_loaded' } )

// Log event sampled by user session
mw.track( 'sampled_event.session.Popups', { samplingRate: 0.01 }, { action: 'page_loaded' } )
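To make proposal 1 more concrete, here is a rough sketch of how the two new buses could be wired up on the WikimediaEvents side. Everything in it is an assumption for discussion: createBus is a tiny stand-in for mw.track / mw.trackSubscribe so the snippet runs anywhere, the FNV-1a-style hash is just a simple deterministic hash (not what mw.experiments uses), and deps.sessionId, deps.pageInSample and deps.logEvent are injected placeholders for mw.user.sessionId, an mw.eventLog.inSample-based check and the actual logger. It derives the bucket from the schema name plus the session id, which is one possible answer to the open question about needing a separate name.

```javascript
// Hypothetical stand-in for mw.track / mw.trackSubscribe.
function createBus() {
	var subscribers = [];
	return {
		subscribe: function ( prefix, handler ) {
			subscribers.push( { prefix: prefix, handler: handler } );
		},
		track: function ( topic ) {
			var args = Array.prototype.slice.call( arguments, 1 );
			subscribers.forEach( function ( s ) {
				if ( topic.indexOf( s.prefix ) === 0 ) {
					s.handler.apply( null, [ topic ].concat( args ) );
				}
			} );
		}
	};
}

// FNV-1a-style hash over UTF-16 code units, mapped to [0, 1).
// Assumption: any stable hash would do; this one is only for illustration.
function hashToUnitInterval( str ) {
	var hash = 0x811c9dc5;
	for ( var i = 0; i < str.length; i++ ) {
		hash ^= str.charCodeAt( i );
		hash = Math.imul( hash, 0x01000193 );
	}
	return ( hash >>> 0 ) / 0x100000000;
}

// Enforces Config: { samplingRate: BoundedNumber<0, 1> } at runtime.
function checkRate( config ) {
	var rate = config && config.samplingRate;
	if ( typeof rate !== 'number' || !( rate >= 0 && rate <= 1 ) ) {
		throw new RangeError( 'samplingRate must be a number between 0 and 1' );
	}
	return rate;
}

function registerSampledBuses( bus, deps ) {
	bus.subscribe( 'sampled_event.session.', function ( topic, config, eventData ) {
		var schema = topic.slice( 'sampled_event.session.'.length );
		// Same schema + session id always hashes to the same value, so the
		// decision is stable for the whole session.
		if ( hashToUnitInterval( schema + ':' + deps.sessionId() ) < checkRate( config ) ) {
			deps.logEvent( schema, eventData );
		}
	} );
	bus.subscribe( 'sampled_event.page.', function ( topic, config, eventData ) {
		var schema = topic.slice( 'sampled_event.page.'.length );
		if ( deps.pageInSample( checkRate( config ) ) ) {
			deps.logEvent( schema, eventData );
		}
	} );
}
```

With this shape, callers never touch the sampling logic; they only pick the topic prefix and pass a samplingRate.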
2. Use Extension:WikimediaEvents with the existing bus
- Unconditionally log with mw.track( 'event.<Schema>' )
- Overload the event signature to be able to receive a second parameter that would be logging options
  - Now: event.<Schema> (eventData: any)
  - Proposed: event.<Schema> (eventData: any, [config: Config])
  - Optional config: Config: { samplingRate: BoundedNumber<0, 1>, sampleBy: SamplingStrategy }
    - SamplingStrategy: SESSION | PAGE
      - SESSION samples based on mw.user.sessionId
      - PAGE samples based on mw.eventLog.inSample like mw.eventLog.Schema
    - Do we need mw.experiments.getBucket({ name }) in the config or would the schema name be enough to derive a unique name?
Examples
// Log event unconditionally
mw.track( 'event.Popups', { action: 'page_loaded' } )

// Log event sampled by page view
mw.track( 'event.Popups', { action: 'page_loaded' }, { sampleBy: "PAGE", samplingRate: 0.01 } )

// Log event sampled by user session
mw.track( 'event.Popups', { action: 'page_loaded' }, { sampleBy: "SESSION", samplingRate: 0.01 } )
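For comparison, proposal 2 could look roughly like the handler below on the subscriber side. Again this is only a sketch under assumptions: createEventHandler and the deps object (sessionInSample, pageInSample, logEvent) are hypothetical names, injected so the snippet is self-contained; in WikimediaEvents they would be backed by mw.user.sessionId, mw.eventLog.inSample and the existing logger.

```javascript
var SamplingStrategy = { SESSION: 'SESSION', PAGE: 'PAGE' };

// deps are injected placeholders (hypothetical names):
// - sessionInSample( schema, rate ): decision derived from the session id
// - pageInSample( rate ): decision made per page view
// - logEvent( schema, eventData ): the existing unconditional logger
function createEventHandler( deps ) {
	// Returns true if the event was logged, false if it was sampled out.
	return function ( topic, eventData, config ) {
		var schema = topic.slice( 'event.'.length );

		// No config: behave like the current event.<Schema> bus.
		if ( config === undefined ) {
			deps.logEvent( schema, eventData );
			return true;
		}

		var rate = config.samplingRate;
		if ( typeof rate !== 'number' || !( rate >= 0 && rate <= 1 ) ) {
			throw new RangeError( 'samplingRate must be a number between 0 and 1' );
		}

		var inSample;
		switch ( config.sampleBy ) {
			case SamplingStrategy.SESSION:
				inSample = deps.sessionInSample( schema, rate );
				break;
			case SamplingStrategy.PAGE:
				inSample = deps.pageInSample( rate );
				break;
			default:
				throw new TypeError( 'Unknown sampleBy: ' + config.sampleBy );
		}

		if ( inSample ) {
			deps.logEvent( schema, eventData );
		}
		return inSample;
	};
}
```

The trade-off this makes visible: existing callers of event.<Schema> keep working unchanged, but the handler has to validate an options object at runtime instead of encoding the strategy in the topic name.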