
Decommission Session Length v1
Open, Low, Public

Description

Background

In 2020, the Product Analytics, Data Engineering, and Product Infrastructure (since disbanded) teams collaborated on T267494, the outcome of which was a dataset – Session Length (v1) – that would enable us to start understanding how long users interact with our products on the web. The work was part of a program called Better Use Of Data. A side effect of this work was a session identifier that resets after a certain period of inactivity, and we use this identifier today in the performer_active_browsing_session_token contextual attribute.

Version 1 of the session length dataset was very minimal and was mostly a proof of concept for a privacy-preserving, identifier-less way to calculate what was at the time a very standard metric of Internet user behavior.

Furthermore, the metric as-is is not viable as a core or essential metric for Consumer Experience (WE3) work for the following reasons:

  • Limited history available; the data only goes back to 2021 and that makes it hard to understand changing trends
  • Low granularity: we currently can’t segment the data by region (this is the biggest blocker to using it now), type of visitor (e.g. new vs returning), pages visited, features used, etc.
  • The data needs more analysis and maintenance.

See also: thread in #product-tech-dept Slack channel.

Wiki Experiences 3 (Consumer Experiences) is more interested in reader retention, and their experiments are focused on increasing retention rate. It may be that session length is correlated with retention, but the high-level dataset will not help us establish that. Instead, we would need to measure session length within the same experiments in which we measure reader retention, and analyze that data. We do not need Session Length v1 data to do that.

We collect 95M session tick events per day. We should stop that data collection and we should reconsider:

  • Whether session length is an important means for us to understand user engagement
  • What session length data should look like (what dimensions it should have, how experiment-friendly it is) to give product managers what they need to understand user engagement
NOTE: We still want the session tick instrument's regulator to be active, because the performer.active_browsing_session_token contextual attribute is a session identifier that resets after a certain period of inactivity, unlike performer.session_id, which is based on the MW session ID. So we should disable data collection, but not the tick and session reset browser events that instruments can and do subscribe to. In fact, we want to bring the core logic of the instrument into our SDKs and tools. This work is captured in T284223: Create the TimedTick instrumentation component, as part of T406261: [EPIC] SDS 2.4.3 Create a suite of standard metrics to use in experiments.

Acceptance criteria

Event Timeline

  1. I'm not quite understanding all the relationships between session tick, session length, and active browsing session token. If we decommission mediawiki.client.session_tick, then how do we maintain performer.active_browsing_session_token? Do you have a visual that could map the dependencies?
  2. Session length would be a great metric to have defined in our catalog for experiments, to use as a guardrail in experiments
  3. If we turn this off, then do we lose everything we had before? You mention:

Limited history available; the data only goes back to 2021 and that makes it hard to understand changing trends

but we're eliminating a method and introducing a new one, which makes it _really_ hard to understand changing trends! Or is your point that we really shouldn't use this for trends because of low granularity and data analysis/maintenance, so the time period is kind of beside the point?

  4. Kudos for asking the question of "what can we stop doing" or "what pipelines can be turned off?"

I'm not quite understanding all the relationships between session tick, session length, and active browsing session token. If we decommission mediawiki.client.session_tick, then how do we maintain performer.active_browsing_session_token? Do you have a visual that could map the dependencies?

Something like this:

sessionTick diagram.png (1×4 px, 247 KB)

flowchart LR
    A[sessionTick.js] -.->|"mw.track( 'sessionReset' )"| B{MediaWiki event queue}
    C(EventLogging/core.js) -.->|"mw.trackSubscribe( 'sessionReset' )"| B
    B -.->|"notifies subscribed handlers"| C
    A -->|"mw.eventLog.submit( { tick: # } )"| D[mediawiki.client.session_tick stream]
    C -->|"resetSessionId()"| E[new active browsing session token ID]

We can turn off data collection (the mw.eventLog.submit() part of the instrument) without affecting any of the session management that the instrument is doing.

active_browsing_session_token uses core.id.sessionId (that gets reset by the session tick instrument), and it's different from mw.user.sessionId that is managed by MediaWiki core.
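To illustrate the point that data collection and session management can be separated, here is a minimal sketch. It is not the actual sessionTick.js code: `submit` and `track` are hypothetical stand-ins for `mw.eventLog.submit()` and `mw.track()`, and the `collectData` flag is an assumed switch, not an existing configuration option.

```javascript
// Hypothetical stand-ins for mw.eventLog.submit() and mw.track().
const submitted = [];
const tracked = [];
const submit = (event) => submitted.push(event);
const track = (topic) => tracked.push(topic);

// collectData is an assumed flag, not an existing configuration option.
function createSessionTick({ collectData }) {
  return {
    // Called on every tick; only sends data when collection is enabled.
    tick(count) {
      if (collectData) {
        submit({ tick: count });
      }
    },
    // Called when the session expires; always notifies subscribers.
    reset() {
      track('sessionReset');
    }
  };
}

const instrument = createSessionTick({ collectData: false });
instrument.tick(1);   // nothing is submitted
instrument.reset();   // 'sessionReset' is still published
```

With collection disabled, `submitted` stays empty while `tracked` still receives the `sessionReset` topic, so anything subscribed to that topic keeps working.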

Session length would be a great metric to have defined in our catalog for experiments, to use as a guardrail in experiments

I agree, but the instrumentation needs substantial rework and needs to be brought into our standard suite of instruments (T406261).

I started an instrumentation spec for a new version of session tick that would enable us to offer session length as a metric for experiments.

If we turn this off, then do we lose everything we had before? Or is your point that we really shouldn't use this for trends because of low granularity and data analysis/maintenance, so the time period is kind of beside the point?

We won't lose the session length dataset that we have now, but it would stop growing. An analyst would still be able to analyze the several years of daily session lengths and counts that it has, but given the lack of dimensions, analysis is limited to looking at differences by wiki. There isn't even a way to compare session lengths for logged-in vs logged-out users, or by country.

  1. I'm not quite understanding all the relationships between session tick, session length, and active browsing session token. If we decommission mediawiki.client.session_tick, then how do we maintain performer.active_browsing_session_token? Do you have a visual that could map the dependencies?

@mpopov explained the relationship. I've gone into some detail about how we would solve it here https://phabricator.wikimedia.org/T322094#9866911 but I'll do so again below.

  1. Session length would be a great metric to have defined in our catalog for experiments, to use as a guardrail in experiments

Yes.

I agree, but the instrumentation needs substantial rework and needs to be brought into our standard suite of instruments (T406261).

Yes.

Readers Web laid the groundwork for the rework in T383422: Implement experimental sessionLength instrument for use in Search Recommendations A/B test, which I helped them design (see T378072). Readers Web's reusable SessionTick instrument is still available in the WikimediaEvents extension and could easily be moved over to the MetricsPlatform extension as part of T406261: [EPIC] SDS 2.4.3 Create a suite of standard metrics to use in experiments. However, this wouldn't address the accidental complexity caused by the relationship between SessionTick proper and performer.active_browsing_session_token:

SessionTick proper has three parts: the regulator, the state store, and a sampling event sender. SessionTick is implemented in the WikimediaEvents extension. When the regulator determines that the session needs to be reset, it emits a sessionReset event (not to be confused with an analytics event) on an in-memory event bus.

performer.active_browsing_session_token is implemented in the EventLogging extension. The implementation reacts to sessionReset events by regenerating the token.
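The relationship between the regulator and the token can be sketched roughly as follows. This is an illustration under assumptions, not the extensions' code: the bus, `createRegulator`, and `createTokenStore` are hypothetical names, the 30-minute timeout is an assumed value, and the counter-based tokens stand in for whatever random ID the real implementation generates.

```javascript
// Assumed inactivity timeout; the real value is not stated in this task.
const SESSION_TIMEOUT_MS = 30 * 60 * 1000;

// Minimal in-memory event bus (hypothetical stand-in for mw.track/mw.trackSubscribe).
function createBus() {
  const handlers = {};
  return {
    subscribe(topic, fn) {
      (handlers[topic] = handlers[topic] || []).push(fn);
    },
    publish(topic) {
      (handlers[topic] || []).forEach((fn) => fn());
    }
  };
}

// Regulator: publishes 'sessionReset' when activity resumes after the timeout.
function createRegulator(bus, timeoutMs) {
  let lastActivity = null;
  return {
    recordActivity(now) {
      const expired = lastActivity !== null && now - lastActivity >= timeoutMs;
      lastActivity = now;
      if (expired) {
        bus.publish('sessionReset');
      }
      return expired;
    }
  };
}

// Token store: regenerates the token whenever 'sessionReset' is published.
function createTokenStore(bus) {
  let counter = 0;
  let token = 'token-' + counter; // illustrative; not a real token format
  bus.subscribe('sessionReset', () => {
    counter += 1;
    token = 'token-' + counter;
  });
  return { get: () => token };
}

const bus = createBus();
const regulator = createRegulator(bus, SESSION_TIMEOUT_MS);
const tokens = createTokenStore(bus);

regulator.recordActivity(0);
const before = tokens.get();
regulator.recordActivity(SESSION_TIMEOUT_MS + 1);
const after = tokens.get();
```

Here `before` and `after` differ because the second call arrives after the assumed timeout. The token store never calls the regulator directly; the two only share the bus, which mirrors the split across the two extensions.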

For performance reasons, the SessionTick proper regulator starts running when the browser is idle (not to be confused with SessionTick proper determining that the user is idle). This could be immediately, or it could be after other JS has finished loading and executing. The latter is what we're seeing in T322094, which I'll try to clarify here:

Instrument A:

  1. Is defined in the WikimediaEvents extension
  2. Sends a page_visit analytics event when it loads and executes
  3. Includes performer.active_browsing_session_token

Consider what happens on page load, given that Instrument A sends a page_visit event as soon as it starts executing:

  1. JS in the EventLogging extension is loaded and executed
    • mw.eventLog.submitInteraction() is available
    • performer.active_browsing_session_token is available
  2. JS in the WikimediaEvents extension is loaded and executed
    • SessionTick proper queues the regulator to start when the browser is idle
    • Instrument A sends a page_visit event, which includes performer.active_browsing_session_token
  3. The browser determines that it's idle and the SessionTick proper regulator is started
    • The regulator determines that the session needs to be reset and emits a sessionReset event
    • performer.active_browsing_session_token is reset
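The ordering problem in the steps above can be reproduced with a small deterministic simulation. Everything here is a stand-in: `idleQueue` models the browser's idle-callback queue, `submitInteraction` is a simplified substitute for `mw.eventLog.submitInteraction()`, and the token values are made up.

```javascript
// Token carried over from a session that has already expired (made-up value).
let token = 'stale-token';
const sent = [];
const idleQueue = []; // stands in for the browser's idle-callback queue

// Step 1: EventLogging JS has loaded; events can be submitted and the token read.
function submitInteraction(name) {
  sent.push({ name, token });
}

// Step 2: WikimediaEvents JS loads.
// The regulator is deferred until the browser is idle...
idleQueue.push(() => {
  token = 'fresh-token'; // the regulator resets the session
});
// ...but Instrument A sends its event immediately.
submitInteraction('page_visit');

// Step 3: the browser becomes idle and runs the deferred regulator.
idleQueue.forEach((task) => task());
```

After this runs, the `page_visit` event in `sent` carries the stale token while `token` itself has moved on, which is the mismatch described in T322094.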

In my opinion, there are two major issues that need to be addressed:

  1. A key context attribute offered by the Metrics Platform JS client depends on an analytics instrument; and
  2. The Metrics Platform clients have no concept of dependency on context attributes.

In order to address (1), we should move the SessionTick proper regulator into the Metrics Platform JS client and use it to drive performer.active_browsing_session_token. We should also provide an API that allows experiment implementors to use it in their experiment-related analytics instruments. We should be clever about it though.

As I've said above, the SessionTick proper regulator is started when the browser is idle, and this part of the design should be kept. Instead of removing it, we should update the Metrics Platform JS client to model a context attribute that becomes available after a delay, by holding analytics events that use such attributes until the attributes are available. This addresses (2).
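One possible shape for that delay, as a rough sketch only: the client holds events that need the token until the token has been set, then flushes them in order. `createClient`, `submit`, and `setToken` are hypothetical names, not the Metrics Platform JS client's API.

```javascript
// Hypothetical client that holds events until the token is available.
function createClient(send) {
  let token = null;
  const pending = [];
  return {
    // Queue the event if the token is not ready yet; otherwise send immediately.
    submit(name) {
      if (token === null) {
        pending.push(name);
        return;
      }
      send({ name, token });
    },
    // Called once the regulator has run; flushes held events in order.
    setToken(value) {
      token = value;
      while (pending.length > 0) {
        send({ name: pending.shift(), token });
      }
    }
  };
}

const delivered = [];
const client = createClient((event) => delivered.push(event));

client.submit('page_visit');     // held: token not available yet
client.setToken('fresh-token');  // regulator has run; held events are flushed
client.submit('click');          // sent immediately
```

Both delivered events carry the token set after the regulator ran, so an early `page_visit` no longer goes out with a stale value. A real implementation would also need to decide what to do with held events if the page unloads before the regulator runs; that is outside this sketch.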

I've been trying to think about how the Event Platform, the Metrics Platform, and xLab all layer up. As you can see from the above, the layering isn't clear. This is where we are right now:

image.png (292×857 px, 37 KB)

And this is what we're aiming at:

image.png (365×685 px, 39 KB)

Thanks for all of the clarification @phuedx @mpopov. Mikhail also confirmed that no one will be immediately impacted if we turn off data collection: no one uses the data, because they know it's not good enough.

Met with Kate, and there is renewed interest in this session length data – particularly now that traffic patterns are drastically changing.

She acknowledged the limitations of the current dataset, but is not comfortable with us shutting it down without a replacement. As approval depends on the availability of a viable alternative, I'm going to drive T406643: Session Length v2 dataset (as time and capacity allow, since it's not high-priority work).

Next steps: I will discuss with @phuedx how we can safely decouple the session management logic from the current instrument without affecting the data collection. We will likely need to stand up a separate tick-based session length instrument and use that both for v2 of the dataset and as part of the suite of standard instruments.

mpopov triaged this task as Low priority.
mpopov updated the task description.
mpopov added a project: Product-Analytics.
mpopov updated the task description.