Page MenuHomePhabricator

Performance review of session length instrument (in WikimediaEvents extension)
Closed, ResolvedPublic

Description

Description

Session length, i.e. the duration of a user session on a site, is a key movement metric that has long been sought by leadership. Previous work on this issue resulted in the 'reading depth' instrument, which was enabled in production for a while but was seen by leadership as failing to answer the questions that they had. It is now disabled.

Definition (session)

  • A user session is defined relative to a domain and a browser instance (i.e. the actual browser process running on the user's device)
  • Different tabs and windows are part of the same browser instance.
  • A browser instance will end when the browser process is terminated.
  • A session for a domain begins when the browser instance navigates to a page under the domain, and the browser instance detects no session currently in progress for the domain.
  • A session for the domain ends when the browser instance terminates, or when the browser instance has not interacted with any page under the domain for 30 minutes.

Definition (interaction)
User interaction is identified with browser events. We have selected a subset which should be informative and able to be optimized.

  • visibilitystatechange
  • mousedown
  • touchstart
  • click
  • keypress
  • scroll (see notes below)

Definition (session length)
The length of a session is measured in ticks. A tick is a unit of time representing a certain number of milliseconds. Every user session's length begins at 0. When a tick has passed, the length becomes 1. After another tick passes, its length will be 2, and so on. Roughly every tick, a session will send its current length as an event. The difference between the number of 'n' ticks received and the number of 'n+1' ticks received, represents the number of sessions of length n. For more details, see: https://wikitech.wikimedia.org/wiki/Analytics/Data_Lake/Traffic/SessionLength

Preview environment

Will update with beta link as soon as merge happens (next day or two).

Which code to review

https://gerrit.wikimedia.org/r/c/mediawiki/extensions/WikimediaEvents/+/619479

Performance assessment

Please initiate the performance assessment by answering the below:

  • What work has been done to ensure the best possible performance of the feature?
    • Cross browser testing. No direct tab-to-tab communication. The existing asynchronous queue is used to pass information within a tab. The code will not execute when the page is not visible. The timeout intervals are long and can be adjusted via a config change.
  • What are likely to be the weak areas (e.g. bottlenecks) of the code in terms of performance?
    • The key area will be to prevent event handlers from slowing the UI.
  • Are there potential optimisations that haven't been performed yet?
    • Use of vendor-specific features such as passive event handlers for scroll, Chrome's idle event, or debouncing. We wanted to arrive at the best approach through consultation.
  • Please list which performance measurements are in place for the feature and/or what you've measured ad-hoc so far. If you are unsure what to measure, ask the Performance Team for advice: performance-team@wikimedia.org.
    • None

Related Objects

Event Timeline

Gilles triaged this task as Medium priority.Sep 11 2020, 7:46 AM

Passive event listeners for scroll and touch are very well supported across browsers with a standard syntax: https://developer.mozilla.org/en-US/docs/Web/API/EventTarget/addEventListener#Browser_compatibility ("options: passive option"),

You should be using it here, and remember to have a fallback for IE and old versions of other browsers: https://github.com/WICG/EventListenerOptions/blob/gh-pages/explainer.md#feature-detection

I see that this is already being discussed on the patch but not implemented yet. It's definitely a must have for this feature.

Comparing the minimum browser versions for passive event listeners with our user agent statistics, I see that only a maximum of 14.7% of our traffic comes from user agents that may not support passive event listeners. The bulk of that comes from the long tail of user agents that aren't mainstream and actually might support it (since most of them are forks of Chromium or Gecko anyway), it would just be too time-consuming for me to verify.

I think it would be worth tracking clients that don't support the passive flag using feature detection to get the real figure with field data. If it's a very low percentage, then I would argue that the tradeoff of introducing jank for these users on scroll and touch so that we can measure their session length isn't worth it, as we are collecting enough data from other users that browse the sites using reasonably recent web browsers. You can set this up trivially right now with a feature check and a mw.track call to report a count in two different categories.

There would be an even stronger case to be made for that exception if you do collect data for a short period of time and check that session length percentiles are not noticeably different between users that have a user agent capable of passive listeners and those that don't. This would be that you are just measuring more of the same data, at the expense of a worsened user experience for that portion of users.

Passive event listeners for scroll and touch are very well supported across browsers with a standard syntax...you should be using it here, and remember to have a fallback for IE and old versions of other browsers: https://github.com/WICG/EventListenerOptions/blob/gh-pages/explainer.md#feature-detection

Yes, absolutely. The latest patchset removes the touch handler and adds the passive option for the scroll handler:
https://gerrit.wikimedia.org/r/c/mediawiki/extensions/WikimediaEvents/+/619479/29/modules/ext.wikimediaEvents/sessionTick.js#129

A previous patchset included the feature detection cited above:
https://gerrit.wikimedia.org/r/c/mediawiki/extensions/WikimediaEvents/+/619479/27/modules/ext.wikimediaEvents/sessionTick.js#126

However I removed it based on @Krinkle's reflection in this comment:
https://gerrit.wikimedia.org/r/c/mediawiki/extensions/WikimediaEvents/+/619479/20/modules/ext.wikimediaEvents/sessionTick.js#121

The parameter is bool|options, where in older browsers any form of object would be ignored and casted to a bool to use as useCapture value. Using capture=true here seems harmless.

Which I took to mean that rather than perform feature detection for the options object, it was acceptable to simply allow the casting behavior to occur in older browsers. Is there agreement that this is acceptable?

I think it would be worth tracking clients that don't support the passive flag using feature detection to get the real figure with field data. If it's a very low percentage, then I would argue that the tradeoff of introducing jank for these users on scroll and touch so that we can measure their session length isn't worth it, as we are collecting enough data from other users that browse the sites using reasonably recent web browsers.
There would be an even stronger case to be made for that exception if you do collect data for a short period of time and check that session length percentiles are not noticeably different between users that have a user agent capable of passive listeners and those that don't. This would be that you are just measuring more of the same data, at the expense of a worsened user experience for that portion of users.

Agree on both points. To do these, I will need to re-introduce the feature detection.

To make this concrete, I will to do this:

window.addEventListener( 'scroll', setActiveDebounce, { passive: true } );

if ( supportsEventListenerOptions === false ) {
   mw.track( 'counter.sessionTickEventListenerOptionsUnsupported', 1 );
}

and not this:

if ( supportsEventListenerOptions === true ) {
   window.addEventListener( 'scroll', setActiveDebounce, { passive: true } );
} else {
   mw.track( 'counter.sessionTickEventListenerOptionsUnsupported', 1 );
}

and after measuring, if things look fine, we want to do this:

if ( supportsEventListenerOptions === false ) {
   /* don't run the instrument at all */
}

One last question I have, should we be doing

window.addEventListener( 'scroll', setActiveDebounce, { 
   passive: true,
   capture: true
} );

for consistency with the behavior on older browsers where the options object is cast to useCapture = true?

https://gerrit.wikimedia.org/r/c/mediawiki/extensions/WikimediaEvents/+/619479

Updated patchset with additions from discussions above. Left two comments for clarification. @Gilles I know this is not supposed to be CR, I apologize. Our team wants to do a better job of staging here, and it's something I'll have to talk with you about to make some changes to our process and not end up in this position again. Thanks for your help so far.