
[Spike] Investigate Stable User Identification for Logged-Out A/B Testing
Closed, Resolved · Public · Spike

Description

Background

We aim to implement an A/B test for mobile search recommendations (T378115). While logged-in users can be assigned to consistent test buckets using user IDs, logged-out users lack a stable identifier. Current options like mw.user.sessionId are session-based, meaning a user may be exposed to both the control and the treatment experience if their session resets, which compromises test validity.
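
To illustrate why the stability of the identifier matters, here is a minimal sketch of deterministic bucketing: the identifier is hashed and the hash is mapped onto a bucket, so the same identifier always lands in the same bucket and a new identifier (for example after a session reset) may land in the other one. The names `fnv1a32` and `assignBucket`, the experiment name, and the 50/50 split are hypothetical and for illustration only; this is not MediaWiki's bucketing code.

```typescript
type Bucket = "control" | "treatment";

// FNV-1a-style 32-bit hash over UTF-16 code units (hypothetical helper).
function fnv1a32(input: string): number {
  let hash = 0x811c9dc5; // FNV offset basis
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193); // multiply by the FNV prime, 32-bit wraparound
  }
  return hash >>> 0; // reinterpret as an unsigned 32-bit integer
}

// Map an identifier (user ID, session ID, ...) to a bucket.
// The experiment name is mixed in so that different experiments
// do not split users the same way (an assumption of this sketch).
function assignBucket(
  experimentName: string,
  id: string,
  treatmentShare: number = 0.5
): Bucket {
  const fraction = fnv1a32(`${experimentName}:${id}`) / 0x100000000; // in [0, 1)
  return fraction < treatmentShare ? "treatment" : "control";
}
```

With a user ID the input never changes, so the bucket is stable. With a session ID the bucket is stable only for as long as the session lasts.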

This task focuses on identifying reliable and privacy-compliant methods for consistent test group assignment for logged-out users.

User Story

As the Web team, we want a stable and anonymous way to assign logged-out users to consistent A/B test buckets to ensure reliable and valid experiment results.

Requirements

Investigate the feasibility of using mw.user.sessionId for logged-out users and identify its limitations.

Explore alternative methods, such as:

  • Cookies or local storage for temporary identifiers.
  • Hashing device/browser attributes.

Assess privacy and compliance concerns, particularly regarding IP usage or persistent identifiers.

Provide recommendations on the most suitable approach for logged-out A/B testing.

BDD

N/A

Test Steps

N/A

Design

N/A

Acceptance Criteria

  • Documented evaluation of available options, including mw.user.sessionId, cookies, and other mechanisms.
  • A clear recommendation for the most suitable method to assign logged-out users to A/B test buckets.
  • Identification of privacy, compliance, and technical trade-offs for each method.
  • Results are shared with relevant stakeholders.

Communication Criteria

  • Notify stakeholders, including the Web Team and Data Engineering, of findings and recommendations.
  • Schedule follow-ups if the selected method requires additional review or technical changes.

Rollback Plan

N/A

Blocks:
T378115: Setup an A/B test for relevant users for mobile recommendations
T378117: Add MoreLike-based article suggestions when activating search bar in mobile

Event Timeline

Restricted Application changed the subtype of this task from "Task" to "Spike". · Fri, Nov 22, 4:39 PM
Restricted Application added a subscriber: Aklapper.

Current Options:

1. Session IDs (mw.user.sessionId)
These are short-lived for most users (under 1 day for 98% of English Wikipedia web users), but a small percentage (~2%) have sessions lasting much longer (20–90 days). This variability can affect the consistency of bucket assignment: most users may be re-bucketed whenever their session resets, while a few would keep the same bucket for weeks.

2. Local Storage
Could be used to store and retrieve group membership, but:

  • It requires a mechanism to expire or clear the data after a certain period.
  • Local storage has been discouraged (and possibly disallowed) in the past for performance reasons unless a garbage collection system is included.
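
For reference, the expiry mechanism mentioned for the local storage option could look roughly like the sketch below. It stores the bucket together with an expiry timestamp and discards the entry once it is stale or unreadable. Everything here is an assumption made for illustration: the `KeyValueStore` interface, `MemoryStore`, `getOrAssignBucket`, the storage key, and the TTL are hypothetical names and values, not an existing MediaWiki API. In a browser, `window.localStorage` has the three methods the interface asks for; the in-memory store only exists so the sketch can run outside a browser.

```typescript
// Minimal storage abstraction (hypothetical).
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
  removeItem(key: string): void;
}

// In-memory stand-in for local storage, for running outside a browser.
class MemoryStore implements KeyValueStore {
  private data = new Map<string, string>();
  getItem(key: string): string | null {
    const value = this.data.get(key);
    return value === undefined ? null : value;
  }
  setItem(key: string, value: string): void {
    this.data.set(key, value);
  }
  removeItem(key: string): void {
    this.data.delete(key);
  }
}

// Return the stored bucket if it has not expired; otherwise clear the
// stale entry, pick a new bucket, and store it with a fresh expiry.
function getOrAssignBucket(
  store: KeyValueStore,
  key: string,
  ttlMs: number,
  pickBucket: () => string,
  now: number = Date.now()
): string {
  const raw = store.getItem(key);
  if (raw !== null) {
    try {
      const parsed = JSON.parse(raw);
      if (
        typeof parsed.bucket === "string" &&
        typeof parsed.expiresAt === "number" &&
        parsed.expiresAt > now
      ) {
        return parsed.bucket;
      }
    } catch (e) {
      // Corrupted or unexpected entry: fall through and replace it.
    }
    store.removeItem(key); // expired or unreadable entries are cleared
  }
  const bucket = pickBucket();
  store.setItem(key, JSON.stringify({ bucket, expiresAt: now + ttlMs }));
  return bucket;
}
```

Note that this only clears an entry when the same key is read again; it is not the kind of general garbage collection system referred to above, which would also need to clean up keys left behind by experiments that have ended.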
NBaca-WMF claimed this task.
NBaca-WMF subscribed.

For this experiment we chose to use the session ID (mw.user.sessionId); closing this task with that in mind.