Background
Yesterday, the Web and Data Products teams met to discuss requirements for their upcoming A/B tests. Part of the discussion focused on creating a modularized SessionLength instrument that could be "mixed in" to whichever instrument the Web team is building.
The modularized instrument would submit all of the events that the SessionLength instrument does, but with additional data about the user's session, so that session length can be used as a guardrail metric during experiment analysis.
The modularized instrument would not supersede the general SessionLength instrument, which remains useful despite a couple of shortcomings.
The SessionLength instrument collects and reports aggregate information about active browsing sessions across the Wikipedias in a way that preserves user privacy.
The instrument is made up of the following components:
- Regulator
  - Persists instrument state to and retrieves it from storage (a cookie)
  - Ticks every X milliseconds
  - Resets the active browsing session every Y milliseconds
  - Resets the active browsing session if the user is inactive for Z milliseconds
- Sender
  - Submits heartbeat analytics events to the mediawiki.client.session_tick event stream
  - Emits sessionReset events for other JavaScript modules to react to
- Gauge
  - Processes raw heartbeat analytics events into an intermediate representation in the wmf.session_length_daily table
    - The Session Length Dashboard Superset dashboard uses the data in that table
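The Regulator's tick and reset rules can be sketched as a pure function of the last recorded tick time and the current time. This is only an illustration: `decideRegulatorAction` is a hypothetical name, and the defaults for X (tick interval) and Y (reset interval) are assumed to match the `TICK_MS` and `RESET_MS` constants in the Implementation section. The inactivity rule (Z) is driven by timers and is not modelled here.

```javascript
// Hypothetical helper, not part of the instrument: decides what the
// Regulator should do when it wakes up.
function decideRegulatorAction( lastTickTime, now, tickMs = 60000, resetMs = 1800000 ) {
	const gap = now - lastTickTime;

	if ( gap > resetMs ) {
		// Too long since the last tick: start a new session with one tick.
		return { action: 'reset', ticks: 1, lastTickTime: now };
	}

	if ( gap > tickMs ) {
		// Catch up on every whole tick interval that has elapsed and keep
		// the remainder so that the next tick stays aligned.
		return {
			action: 'tick',
			ticks: Math.floor( gap / tickMs ),
			lastTickTime: now - ( gap % tickMs )
		};
	}

	// Less than one tick interval has elapsed: nothing to submit yet.
	return { action: 'wait', ticks: 0, lastTickTime: lastTickTime };
}
```

For example, with the assumed defaults, a gap of 150000 ms yields two ticks and moves the stored tick time forward by 120000 ms.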
Heartbeat analytics events do not include identifying information about the user, the page, or the MediaWiki instance that the user is interacting with. You can read more about the privacy considerations of the instrument and the implications of those considerations here.
A consequence of this design is that feature teams cannot understand how changes that they are making impact session length. That is, feature teams cannot currently use the session length as a guardrail metric in their experiments.
Proposal
We propose the creation of a reusable SessionLength instrument mixin that any feature team can mix into their existing instruments. The mixin will submit events to event streams owned by the feature team and therefore could include information about the user, the page, or the MediaWiki instance that the user is interacting with. This will allow the feature team to use session length as a guardrail metric in their experiments.
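From a feature team's point of view, using the mixin might look roughly like the sketch below. It uses a simplified stand-in for the mixin so that it runs on its own: `createSessionLengthMixin`, the `tick` method, the stream name, and the schema ID are all hypothetical, and the injected `submitInteraction` callback is assumed to take `( streamName, schemaID, action, interactionData )`.

```javascript
// Simplified stand-in for the proposed mixin. The submit function is
// injected so that the sketch runs without MediaWiki.
function createSessionLengthMixin( submitInteraction ) {
	// Maps stream name to schema ID for every instrument using the mixin.
	const streams = new Map();

	return {
		start( streamName, schemaID ) {
			streams.set( streamName, schemaID );
		},
		stop( streamName ) {
			streams.delete( streamName );
		},
		// Hypothetical hook: fans one tick out to every registered stream.
		tick( tickNumber ) {
			streams.forEach( ( schemaID, streamName ) => {
				submitInteraction( streamName, schemaID, 'tick', {
					action_source: 'SessionLengthInstrumentMixin',
					action_context: String( tickNumber )
				} );
			} );
		}
	};
}

// A feature team's instrument registers its own stream (names are made up).
const submitted = [];
const mixin = createSessionLengthMixin( ( streamName, schemaID, action, data ) => {
	submitted.push( { streamName, schemaID, action, data } );
} );

mixin.start( 'example.feature_stream', '/analytics/example/1.0.0' );
mixin.tick( 1 ); // recorded on example.feature_stream
mixin.stop( 'example.feature_stream' );
mixin.tick( 2 ); // no streams registered, so nothing is recorded
```

Because the events land in the feature team's own stream, they can carry whatever contextual fields that stream's schema allows.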
Advantages
- Performance
  - Multiple teams can reuse the mixin, thereby minimizing code duplication across codebases and the amount of code fetched and executed by the browser. Indeed, the original SessionLength instrument can be rewritten using the mixin.
- Consistency
  - All instruments submit the same standardized set of events
  - Product Analytics can develop one or more reusable queries to analyze data captured by the mixin
- Standalone
  - In general, we want one instrument to send all events required for analysis rather than having to rely on multiple collaborating instruments to do so (see also https://meta.wikimedia.org/wiki/Research_and_Decision_Science/Data_glossary/Clickthrough_Rate#Standalone_instrumentation)
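As a rough illustration of the kind of reusable analysis the standardized events would allow, the sketch below computes the mean session length per experiment group. The event shape is an assumption: the field names `group`, `sessionId`, and `tick` are hypothetical, and session length is approximated as the highest tick number seen in a session multiplied by an assumed 60000 ms tick interval.

```javascript
// Assumed tick interval, matching TICK_MS in the Implementation section.
const ASSUMED_TICK_MS = 60000;

// events: array of { group, sessionId, tick } objects (hypothetical shape).
function meanSessionLengthByGroup( events ) {
	// Highest tick number seen for each (group, session) pair.
	const maxTick = new Map();
	for ( const { group, sessionId, tick } of events ) {
		const key = JSON.stringify( [ group, sessionId ] );
		maxTick.set( key, Math.max( maxTick.get( key ) ?? 0, tick ) );
	}

	// Sum the approximate session lengths and count sessions per group.
	const totals = new Map();
	for ( const [ key, tick ] of maxTick ) {
		const [ group ] = JSON.parse( key );
		const total = totals.get( group ) ?? { sum: 0, sessions: 0 };
		total.sum += tick * ASSUMED_TICK_MS;
		total.sessions += 1;
		totals.set( group, total );
	}

	// Mean session length in milliseconds, keyed by group.
	const result = {};
	for ( const [ group, { sum, sessions } ] of totals ) {
		result[ group ] = sum / sessions;
	}
	return result;
}
```

In practice this logic would more likely live in a shared query over the feature team's event table; the function only shows the shape of the calculation.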
Disadvantages
- Easier User Tracking
  - This design is antithetical to the design of the original SessionLength instrument outlined in the [Background](#Background) section above
- Increased Data Collected
  - By making it easier (indeed, trivial) to collect information about session length and encouraging feature teams to create instruments that send all events required for analysis, we will naturally collect more data
Implementation
// Based on https://gerrit.wikimedia.org/r/plugins/gitiles/mediawiki/extensions/WikimediaEvents/+/refs/heads/master/modules/ext.wikimediaEvents/sessionTick.js

// Constants
// =========

const NOOP = function () {};

const TICK_MS = 60000;
const IDLE_MS = 100000;
const RESET_MS = 1800000;
const DEBOUNCE_MS = 5000;

const TICK_LIMIT = Math.ceil( RESET_MS / TICK_MS );

const KEY_LAST_TIME = 'mp-sessionTickLastTickTime';
const KEY_COUNT = 'mp-sessionTickTickCount';

// State
// =====

// Maps stream name to schema ID for every instrument that started the mixin.
const state = new Map();

// Functions
// =========

function supportsPassiveEventListeners() {
    let supportsPassive = false;

    try {
        // The getter only runs if the browser reads the 'passive' option.
        const options = Object.defineProperty( {}, 'passive', {
            get: function () {
                supportsPassive = true;
                return false;
            }
        } );

        window.addEventListener( 'testPassiveOption', NOOP, options );
        window.removeEventListener( 'testPassiveOption', NOOP, options );
    } catch ( e ) {
        // Silently fail.
    }

    return supportsPassive;
}

// Optimization:
//
// If the browser doesn't support the Page Visibility API or passive event
// listeners, then stop processing and export the null implementation of
// the API so that dependent scripts don't break.
if ( document.hidden === undefined || !supportsPassiveEventListeners() ) {
    module.exports = {
        start: NOOP,
        stop: NOOP
    };

    return;
}

// Functions (continued)
// =====================

function sessionReset() {
    mw.storage.set( KEY_COUNT, 0 );
}

function sessionTick( incr ) {
    if ( incr > TICK_LIMIT ) {
        throw new Error( 'Session ticks exceed limit' );
    }

    const count = ( Number( mw.storage.get( KEY_COUNT ) ) || 0 );
    mw.storage.set( KEY_COUNT, count + incr );

    // Submit one event per elapsed tick to every registered stream.
    while ( incr-- > 0 ) {
        const tick = count + incr;

        state.forEach( ( schemaID, streamName ) => {
            mw.eventLog.submitInteraction( streamName, schemaID, 'tick', {
                action_source: 'SessionLengthInstrumentMixin',
                action_context: String( tick )
            } );
        } );
    }
}

function regulator() {
    let tickTimeout = null;
    let idleTimeout = null;
    let debounceTimeout = null;

    function run() {
        const now = Date.now();
        const gap = now - ( Number( mw.storage.get( KEY_LAST_TIME ) ) || 0 );

        if ( gap > RESET_MS ) {
            mw.storage.set( KEY_LAST_TIME, now );
            sessionReset();

            // Tick once to start
            sessionTick( 1 );
        } else if ( gap > TICK_MS ) {
            mw.storage.set( KEY_LAST_TIME, now - ( gap % TICK_MS ) );
            sessionTick( Math.floor( gap / TICK_MS ) );
        }

        tickTimeout = setTimeout( run, TICK_MS );
    }

    function setInactive() {
        clearTimeout( idleTimeout );
        clearTimeout( tickTimeout );
        clearTimeout( debounceTimeout );
        tickTimeout = null;
        debounceTimeout = null;
    }

    function setActive() {
        if ( tickTimeout === null ) {
            run();
        }

        clearTimeout( idleTimeout );
        idleTimeout = setTimeout( setInactive, IDLE_MS );
    }

    function setActiveDebounce() {
        if ( !debounceTimeout ) {
            debounceTimeout = setTimeout( () => {
                clearTimeout( debounceTimeout );
                debounceTimeout = null;
            }, DEBOUNCE_MS );

            mw.requestIdleCallback( setActive );
        }
    }

    function onVisibilitychange() {
        if ( document.hidden ) {
            setInactive();
        } else {
            setActive();
        }
    }

    document.addEventListener( 'visibilitychange', onVisibilitychange, false );
    window.addEventListener( 'click', setActiveDebounce, false );
    window.addEventListener( 'keyup', setActiveDebounce, false );

    // Use the 'passive: true' option when binding the scroll handler.
    // Browsers without EventListenerOptions support will expect a
    // boolean 'useCapture' argument in that position, and will cast
    // the object to a value of 'true'. This is harmless here.
    window.addEventListener( 'scroll', setActiveDebounce, { passive: true, capture: false } );

    onVisibilitychange();
}

// API
// ===

// Start algorithm
regulator();

const SessionLengthInstrumentMixin = {
    // Registers a stream so that it receives tick events.
    start( streamName, schemaID ) {
        state.set( streamName, schemaID );
    },

    // Unregisters a stream so that it no longer receives tick events.
    stop( streamName ) {
        state.delete( streamName );
    }
};

module.exports = SessionLengthInstrumentMixin;