FY25-26 SDS2.1.2 Data Reliability - Debugging event loss
Closed, Resolved (Public)

Description

Product Requirements

STATUS: DONE

Reviewer | Date approved
Karen Hernandez | Aug 18 2025
Julie van der Hoop | Aug 7 2025
Sam Smith |

Objective/Hypothesis

If we can implement better debugging for event logging, then product teams will know that their experiment is collecting event data as expected, increasing experiment owners’ confidence.

How does this objective/hypothesis relate to organizational goals?

This hypothesis supports the 2025-2026 fiscal year annual plan for the Product and Technology department, specifically SDS2 Objective KR2.1, which states:

SDS Objective:

Product managers can quickly, easily, and confidently evaluate the impacts of product features on Wikipedia.

SDS2.1: By the end of Q2, experiment and evaluate 3 interventions that help contributors improve the state of vital content on their Wikipedias.

Key Result:

Key Result 1: Experiment owners have confidence that the data collected using Experimentation Platform is accurate.

Why do this?

This work matters because it lets us know that:

  • the data we are collecting is the right data
  • we have instrumented properly
  • all of the events are coming through
  • we can debug efficiently

Timeline

By the end of Q1 FY25-26, we would like to have the code updated and dashboard(s) created to monitor possible bugs, event loss, and integrity of data.

Risks

Risk | Description | Status | Notes
Timing could be tricky: we need to do the GrowthBook integration and onboard product teams to the Experimentation Platform this fall. With competing priorities, this work could be superseded by the bigger projects the Experiment Platform team is planning. | Time, prioritization. | Mitigating |
If we arrive at a conclusion that the platform is difficult to trust | i.e. we find that the event loss rate is a significant percentage (TBD - ~25%? More? Less?) - would this be perceived poorly by product teams? | Emerging |

Who is involved

Overview - DACI
Driver | Approver | Contributors | Informed
Clare Ming | Julie van der Hoop | Steering committee: Sam Smith, Adam Baso, Santiago Faci. Additional contributors: other Experiment Platform team members | Partner product teams we are embedding with while we are onboarding them onto the Experiment Platform.
Details about roles and activities
Team/Role | Type | Individuals | Sample Activities
Experiment Platform | Development Team | Sam Smith (Tech Lead), Adam Baso, Santiago Faci, Clare Ming | Research and implementation
WMF Product Management | Product Manager | Julie van der Hoop | Product Requirements and planning; prioritization, stakeholder management, etc.

Requirements

Hypothesis Requirements

  • The end product/result for the Experiment Platform team should be more confident in the
    • The Experiment Platform team will use these metrics to establish internal and external confidence in the platform. We will also use them to assess any changes that we make to the platform in the future.
    • Product teams will use them to assess whether they should use the platform at all and to establish a baseline for any data collection activities moving forward.

Success Criteria

  • When we see expected activity in the monitoring of the PHP and JavaScript SDKs:
    • Dashboards report on event loss for the JS and PHP SDKs
  • Experiment owners have confidence that the data collected using Experimentation Platform is accurate.

Target Outcomes

Provide additional debugging and logging capabilities that raise our confidence and that of our adopting product teams.

Ideal Outcomes:

  • Dashboards, documentation, and a clearer picture of how many bugs remain to be fixed (more or fewer).
  • Average score of 4 (on 1-5 scale) from post-experiment survey question: How confident do you feel that your configuration was correct and complete?

What is out of scope?

We are currently tackling some technical debt in the xLab UI that will help our users include contextual attributes (and exclude others) for their experimentation needs. This work should not be included in the scope of this hypothesis even though we will need to apply additional logging/debugging once this work is done.

Background & existing research or documentation

Open questions

  • How do we assess what is an acceptable level of event loss?

Product Roadmap

Potential roadmap for this work (rough guess).

  • Mid-August 2025: high level planning and known work outlined in tickets
  • End of August 2025: code implemented/deployed
  • Early to mid-September 2025: dashboards ready and available for monitoring

Milestones

  • Both SDKs have counters for assessing total number of expected events
  • Dashboard(s) created to track numbers of expected events and actual events
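
As a rough illustration of how the two milestones above fit together, a dashboard could derive a loss rate from the SDK-side counter of expected events and the count of events that actually arrived. The sketch below is a minimal example only: the `EventCounts` shape, the function names, and the 25% threshold are hypothetical placeholders (the threshold echoes the "TBD" figure in Risks and is not an agreed value), and the sketch says nothing about how the SDKs or dashboards are actually implemented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EventCounts:
    """Counts for one SDK over one reporting window (hypothetical shape)."""
    sdk: str       # e.g. "js" or "php"
    expected: int  # events the SDK counter reports it attempted to send
    received: int  # events that actually arrived downstream


def loss_rate(counts: EventCounts) -> float:
    """Fraction of expected events that did not arrive, in the range [0, 1]."""
    if counts.expected <= 0:
        return 0.0  # nothing expected, so no measurable loss
    # Clamp at zero: duplicates could make received exceed expected.
    missing = max(counts.expected - counts.received, 0)
    return missing / counts.expected


def exceeds_threshold(counts: EventCounts, threshold: float = 0.25) -> bool:
    """True if the loss rate is above the placeholder threshold (assumed 25%)."""
    return loss_rate(counts) > threshold


if __name__ == "__main__":
    window = [
        EventCounts(sdk="js", expected=1000, received=950),
        EventCounts(sdk="php", expected=400, received=250),
    ]
    for counts in window:
        flag = "ALERT" if exceeds_threshold(counts) else "ok"
        print(f"{counts.sdk}: loss rate {loss_rate(counts):.1%} ({flag})")
```

This only measures loss relative to what the counters themselves record; events lost before a counter increments would remain invisible to it.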

Event Timeline

phuedx triaged this task as High priority. Aug 21 2025, 3:31 PM
phuedx added a project: OKR-Work.
phuedx moved this task from Incoming to Backlog on the Test Kitchen board.

Data Reliability - Debugging event loss

Suggestion: be very clear what you mean by 'event loss' here. It is not possible to know if you have not lost any events. But you can improve monitoring and debug-ability for the ones you can know about (e.g. reported HTTP POST errors, etc.).