In several extensions (RelatedArticles, Page-Previews) we have moved to sampling logging by buckets derived from a stable session ID, so that all events in a user session are logged consistently. For context, see for example T167236: userSessionToken in RelatedArticles schema does not seem to survive beyond one pageview.
Things we duplicate across projects right now for sampling by user session:
- Checking if window.navigator.sendBeacon is available
- Calling mw.experiments with mw.user.sessionId and checking the bucket to turn logging on or off
- Setting the logger to mw.track or noop or the schema sampling to 0 or 1 depending on the previous check.
This is a bit of boilerplate. Nothing too serious, but if this type of logging is going to be common, it would be worth abstracting.
Explore different options for a single API, shared across MediaWiki extensions, to log events sampled by user session.
Some must-haves:
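For reference, the three duplicated steps look roughly like the sketch below. It is illustrative only, not code copied from any extension: `createSessionSampledLogger`, the `deps` parameter (standing in for `window.navigator` and `mw` so the snippet runs outside MediaWiki) and the bucket names `on`/`off` are hypothetical.

```javascript
// Sketch of the per-extension boilerplate. `deps` stands in for the browser
// and MediaWiki globals: { navigator, mw: { experiments, user, track } }.
function createSessionSampledLogger( deps, schemaName, samplingRate ) {
	var noop = function () {};

	// 1. Only log when navigator.sendBeacon is available.
	if ( !deps.navigator || typeof deps.navigator.sendBeacon !== 'function' ) {
		return noop;
	}

	// 2. Bucket the session: the same session id and experiment name
	//    are expected to land in the same bucket on every page view.
	//    Bucket names 'on' and 'off' are made up for this sketch.
	var bucket = deps.mw.experiments.getBucket( {
		name: schemaName,
		enabled: true,
		buckets: { off: 1 - samplingRate, on: samplingRate }
	}, deps.mw.user.sessionId() );

	// 3. Use mw.track when the session is sampled, a noop otherwise.
	if ( bucket !== 'on' ) {
		return noop;
	}
	return function ( eventData ) {
		deps.mw.track( 'event.' + schemaName, eventData );
	};
}
```

Every extension that wants session sampling currently carries some variant of this function, which is the part a shared API would absorb.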
- Configurable sampling rate
- Initial events will be delivered (to log events like page/feature loaded)
- No PHP conditional registration of ResourceLoader modules and async loading of such modules in the client
- List proposals of APIs w/ pros and cons
- Discuss
- Create follow up tasks to implement the API, and update existing code to use it
1. Use Extension:WikimediaEvents with new event buses
- Unconditionally log with mw.track
- Implement two more buses in WikimediaEvents that take a configuration param:
  - sampled_event.session.<Schema> (config: Config, eventData: any)
    - buckets/samples based on mw.user.sessionId
    - do we need mw.experiments.getBucket({ name }) in the config or would the schema name be enough to derive a unique name?
  - <Schema> (config: Config, eventData: any)
    - samples based on mw.eventLog.inSample like mw.eventLog.Schema
  - Where Config: { samplingRate: BoundedNumber<0, 1> }
// Log event unconditionally
mw.track( 'event.Popups', { action: 'page_loaded' } )

// Log event sampled by page view
mw.track( '', { samplingRate: 0.01 }, { action: 'page_loaded' } )

// Log event sampled by user session
mw.track( 'sampled_event.session.Popups', { samplingRate: 0.01 }, { action: 'page_loaded' } )
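On the WikimediaEvents side, the session bus could be handled along the lines of the sketch below. Everything here is an assumption for discussion: `sessionFraction` and `makeSessionBusHandler` are hypothetical names, the FNV-1a-style hash is a stand-in and not the algorithm mw.experiments uses, and the handler assumes it receives the topic followed by the arguments passed to mw.track. The schema name doubles as the bucket name, which is one possible answer to the open question above, not a decision.

```javascript
// Map a (name, sessionId) pair to a stable fraction in [0, 1).
// Stand-in hash (FNV-1a style over UTF-16 code units), chosen only so the
// sketch is self-contained.
function sessionFraction( sessionId, name ) {
	var str = name + ':' + sessionId,
		hash = 0x811c9dc5,
		i;
	for ( i = 0; i < str.length; i++ ) {
		hash ^= str.charCodeAt( i );
		// Multiply by the FNV prime and keep the result as an unsigned 32-bit int.
		hash = Math.imul( hash, 0x01000193 ) >>> 0;
	}
	return hash / 0x100000000;
}

// Build a handler for topics of the form 'sampled_event.session.<Schema>'.
// getSessionId and logEvent are injected so the sketch runs outside MediaWiki.
function makeSessionBusHandler( getSessionId, logEvent ) {
	var prefix = 'sampled_event.session.';
	return function ( topic, config, eventData ) {
		var schema = topic.slice( prefix.length );
		// The same session and schema always give the same fraction, so a
		// session is either fully in or fully out of the sample.
		if ( sessionFraction( getSessionId(), schema ) < config.samplingRate ) {
			logEvent( schema, eventData );
		}
	};
}
```

In WikimediaEvents this handler would presumably be registered with mw.trackSubscribe( 'sampled_event.session.', handler ), with mw.user.sessionId as `getSessionId` and mw.eventLog.logEvent as `logEvent`.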
2. Use Extension:WikimediaEvents with the existing bus
- Unconditionally log with mw.track( 'event.<Schema>' )
- Overload the event signature to be able to receive a second parameter that would be logging options
  - Now: event.<Schema> (eventData: any)
  - Proposed: event.<Schema> (eventData: any, [config: Config])
  - Optional config: Config: { samplingRate: BoundedNumber<0, 1>, sampleBy: SamplingStrategy }
    - SamplingStrategy: SESSION | PAGE
      - SESSION samples based on mw.user.sessionId
      - PAGE samples based on mw.eventLog.inSample like mw.eventLog.Schema
    - Do we need mw.experiments.getBucket({ name }) in the config or would the schema name be enough to derive a unique name?
// Log event unconditionally
mw.track( 'event.Popups', { action: 'page_loaded' } )

// Log event sampled by page view
mw.track( 'event.Popups', { action: 'page_loaded' }, { sampleBy: "PAGE", samplingRate: 0.01 } )

// Log event sampled by user session
mw.track( 'event.Popups', { action: 'page_loaded' }, { sampleBy: "SESSION", samplingRate: 0.01 } )
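A rough sketch of how the overloaded handler could dispatch on the optional config follows. `makeEventBusHandler` and the `samplers` map are hypothetical, and the sketch assumes the handler is called with the topic, the event data and then the optional config. The sampling decisions themselves are injected so the snippet stays independent of MediaWiki.

```javascript
// Build a handler for the overloaded 'event.<Schema>' bus.
// `samplers` maps a SamplingStrategy name ('SESSION', 'PAGE') to a function
// ( schema, samplingRate ) => boolean that decides whether to log.
function makeEventBusHandler( samplers, logEvent ) {
	var prefix = 'event.';
	return function ( topic, eventData, config ) {
		var schema = topic.slice( prefix.length ),
			rate;

		// No config: keep the current behaviour and log unconditionally.
		if ( !config ) {
			logEvent( schema, eventData );
			return;
		}

		if ( !Object.prototype.hasOwnProperty.call( samplers, config.sampleBy ) ) {
			throw new Error( 'Unknown sampleBy: ' + config.sampleBy );
		}

		// Enforce the BoundedNumber<0, 1> constraint at runtime.
		rate = config.samplingRate;
		if ( typeof rate !== 'number' || !( rate >= 0 && rate <= 1 ) ) {
			throw new RangeError( 'samplingRate must be a number between 0 and 1' );
		}

		if ( samplers[ config.sampleBy ]( schema, rate ) ) {
			logEvent( schema, eventData );
		}
	};
}
```

Existing callers that pass only event data are unaffected, which is the main attraction of this option. The PAGE sampler would presumably wrap mw.eventLog.inSample, which to my understanding takes a 1-in-N population size rather than a rate, so it would need a conversion such as 1 / samplingRate.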