
IR2 - Marginal gain in non-search-referred pageviews from interventions
Open, In Progress, High, Public

Description

  • Definition:

Marginal gain in non-search-referred pageviews on segments affected by interventions

  • Technical Implementation/Data Source

Data source: webrequest. Use the referrer field when the partner sends this information, with a treatment/control approach similar to IR1. When a partnership allows it, calculate pageview lift based on the presence of the wprov parameter in the X-Analytics field.
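
As a rough illustration of reading the wprov value, the Python sketch below parses an X-Analytics string. It assumes the field is a semicolon-separated list of key=value pairs; the helper names and the sample string are hypothetical and not taken from this task.

```python
from typing import Dict, Optional


def parse_x_analytics(header: str) -> Dict[str, str]:
    """Split a 'key=value;key=value' string into a dict (assumed format)."""
    pairs: Dict[str, str] = {}
    for item in header.split(";"):
        key, sep, value = item.strip().partition("=")
        if sep and key:  # skip empty or malformed items
            pairs[key] = value
    return pairs


def wprov_of(header: str) -> Optional[str]:
    """Return the wprov value if present and non-empty, else None."""
    return parse_x_analytics(header).get("wprov") or None


# Hypothetical sample value, for illustration only.
example = "ns=0;page_id=12345;wprov=sfla1;https=1"
print(wprov_of(example))
```

Requests without a parseable wprov return None, so they drop out of any provenance-based count rather than raising an error.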

  • Baseline

0 pageviews - this is an ad-hoc aggregative metric for individual interventions

  • Target

100M pageviews

  • Dashboarding

here

  • Approval

TODO @MMiller_WMF

Details

Due Date
May 5 2026, 11:00 PM

Event Timeline

Milimetric renamed this task from IR1 Marginal pageviews gained from interventions to IR1 - Marginal pageviews gained from interventions. Apr 28 2026, 6:16 PM
Ahoelzl triaged this task as High priority.
Ahoelzl moved this task from Start to In Progress on the Metrics-Sprint-2026-2027 board.
Miriam set Due Date to May 5 2026, 11:00 PM.
Miriam added subscribers: Khantstop, MMiller_WMF, RHo.
Miriam renamed this task from IR1 - Marginal pageviews gained from interventions to IR2 - Marginal gain in non-search-referred pageviews from interventions. May 1 2026, 8:43 AM
Miriam changed the task status from Open to Stalled. May 1 2026, 2:28 PM
Miriam added subscribers: DTotten-WMF, YLiou_WMF, dr0ptp4kt.

@Khantstop and I had a conversation today about how to implement the metric. We confirmed that the definition is "Marginal gain in non-search-referred pageviews on segments affected by interventions", but there are a few issues with measuring the effects of interventions:

  1. We often cannot distinguish traffic coming from smaller non-search sites (so we can't isolate pageviews coming from specific partners unless they are large media sites)
  2. Even if we look at all non-search traffic globally, the data is very noisy and varies a lot month to month (so we can't measure any effect on this data)
  3. Only some partners might activate the provenance parameter wprov

So we came up with three proposals that we need to check further with our collaborators:

  1. Get a visibility metric through regular surveys, to be discussed with @YLiou_WMF
  2. Using the wprov parameter, tag as "treatment" the pages that have been included in the experiment, define a "control" set of similar pages, then compare pageview gain
  3. Using the wprov parameter, simply count the lift in pageviews with provenance from partnered websites.
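
Proposals 2 and 3 could be computed along the following lines. This is only a sketch under stated assumptions: it uses a simple difference-in-differences style adjustment, which this task does not prescribe, and the function names and numbers are hypothetical.

```python
def relative_change(before: float, after: float) -> float:
    """Fractional change between a pre-period and a post-period count."""
    if before <= 0:
        raise ValueError("need a positive pre-period count")
    return (after - before) / before


def marginal_gain(treat_before: float, treat_after: float,
                  ctrl_before: float, ctrl_after: float) -> float:
    """Proposal 2: pageviews on treatment pages beyond what the control trend predicts."""
    expected_after = treat_before * (1 + relative_change(ctrl_before, ctrl_after))
    return treat_after - expected_after


def provenance_lift(tagged_before: float, tagged_after: float) -> float:
    """Proposal 3: raw change in pageviews carrying the partner's wprov value."""
    return tagged_after - tagged_before


# Hypothetical counts: control grows 25%, treatment grows 40%.
print(marginal_gain(1000, 1400, 2000, 2500))  # 150.0
print(provenance_lift(0, 320))                # 320
```

Proposal 2 needs a control set that tracks the treatment pages closely before the intervention; proposal 3 is simpler but attributes every tagged pageview to the intervention.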

@dr0ptp4kt as you know a lot about the wprov parameter, could you help us here? Is there a dashboard where we can see the current split of pageviews by the provenance parameter? If not, do you think it would be easy to build?
CC @DTotten-WMF

Setting this as "Stalled" until we get unblocked on these implementation issues.

After discussing with @YLiou_WMF, let's exclude the survey-based impact measurement for this one. The data would be too sparse, and measuring the effect through surveys would be impossible. Surveys can be used if we want to set realistic targets for interventions (answering questions like: how likely are people to click on a Wikipedia link after seeing attribution?).

@Miriam happy to do a working meeting next week to go over any questions, if you can find a mutually available time. I believe @JMoore-WMF may also be coordinating another working meeting, so it would be best if you and Justin can combine efforts here with @Maryana so that we can put our minds together synchronously (I have limited availability, but am happy to do 45-60 minutes next week across a session or two). Also happy to talk one-to-one on Meet (just schedule the time) if a time that works for everyone isn't possible.

Sandra's ideas in the Doc are good. Ideally, the Sec-Fetch-* headers should be logged into X-Analytics so that they're trivially obtainable by any data-inclined person.

For people wanting to test whether links and their clicks and their embeds and Ajax calls and native HTTP calls ... and on-and-on ... are conveying the Sec-Fetch-* headers in the way they think they should, it can be a little tricky unless one sets up one's own intercepting proxy (and some apps and platforms defend robustly against that sort of thing). However, if the allowlisted set of Sec-Fetch-* headers were permitted to flow into X-Analytics, then it should become possible to check in near real time with kafkacat; those without that access could check wmf.webrequest later for their navigations from Jupyter (or do both).

The behavior for what values get sent when the user navigates by clicking a link that opens from an app into a full blown separate web browser versus an app into an embedded webview (and what happens when they click on another link from within the webview) anecdotally varies; I haven't done a full spelunking on that or the OS-to-browser and OS SDK-to-in-app-webview mechanics, but it gets a little tricky...tricky in a fashion that bots of course can emulate as well. For the typical end user of a web browser using normal web browser settings just visiting websites and clicking things and getting the behavior the web browsers do, the headers are predictable; when they have extra plugins and other services layered on top of their browser that expressly change this behavior, it can get complicated, of course.

And, it is also the case that some app developers (mobile and desktop apps) can do clever things with interception and re-writing of headers and URLs, proxies, intermediate caches, and so forth.

But, anyway, it's good to have the extra signals in X-Analytics, which can be bundled with indicators like whether it's deemed user traffic or other signals about the likelihood of the traffic being legitimately sourced traffic (user or bot, or agentic browser acting on user's behalf or otherwise, all the usual and emerging stuff), as it will help with hazarding better guesses on fluctuations of the counts and fluctuations of ratios of counts by dimension.
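
To make the Sec-Fetch-* discussion concrete, here is a minimal Python sketch of how those headers could be turned into a coarse navigation label. The header names and values (Sec-Fetch-Site, Sec-Fetch-Mode, Sec-Fetch-Dest) follow the Fetch Metadata request headers; the label names and the function are hypothetical, and as noted above the headers can be emulated or rewritten, so the result is a signal rather than proof.

```python
from typing import Mapping


def classify_navigation(headers: Mapping[str, str]) -> str:
    """Return a coarse, hypothetical label based on Sec-Fetch-* headers."""
    h = {k.lower(): v.strip().lower() for k, v in headers.items()}
    site = h.get("sec-fetch-site")
    mode = h.get("sec-fetch-mode")
    dest = h.get("sec-fetch-dest")

    if site is None and mode is None and dest is None:
        # Older clients, some apps and many bots send no fetch metadata.
        return "no-fetch-metadata"
    if mode == "navigate" and dest == "document":
        if site == "cross-site":
            return "cross-site-navigation"  # e.g. a link clicked on another site
        if site == "none":
            return "direct-navigation"      # e.g. typed URL or bookmark
        return "internal-navigation"        # same-origin or same-site
    return "subresource-or-api"


print(classify_navigation({
    "Sec-Fetch-Site": "cross-site",
    "Sec-Fetch-Mode": "navigate",
    "Sec-Fetch-Dest": "document",
}))
```

A label like this would be one dimension to break counts down by, alongside the user/bot indicators mentioned above.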

Thank you @dr0ptp4kt @Khantstop @DTotten-WMF and @Snwachukwu for the very productive meeting yesterday.

The summary of it is:

  • wprov parameter is a viable solution to measure impact of interventions with non-search partners, assuming we are able to negotiate its inclusion in the intervention
  • It must be used in conjunction with the referrer when possible, to exclude from our counts scrapers that copy-paste URLs containing the wprov parameter.
  • How do we ensure that the wprov data imports reliably into our data lake for impact tracking?
    • We need to map which wprov code identifies which partner so we can distinguish different interventions
    • Partners need to use the correct parameter formatting so that we are able to parse the X-Analytics field on our end
    • We’d likely need other signals (e.g. sec-fetch) to operationalize this parameter for our measurement.
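
The points above (a wprov-to-partner mapping, plus a referrer check where the partner sends one) can be sketched in Python as follows. Every wprov code, partner name and host here is hypothetical, and treating an empty host set as "partner sends no referrer" is an assumption made for the example.

```python
from collections import Counter
from typing import Iterable, Optional, Tuple
from urllib.parse import urlsplit

# Hypothetical mapping: wprov code -> (partner name, expected referrer hosts).
# An empty host set means the partner is assumed not to send a referrer.
PARTNER_BY_WPROV = {
    "expa1": ("Example Partner A", {"partner-a.example"}),
    "expb1": ("Example Partner B", set()),
}


def referrer_host(referrer: str) -> str:
    """Lower-cased hostname of a referrer URL, or '' if it cannot be parsed."""
    return (urlsplit(referrer).hostname or "").lower()


def attribute(wprov: str, referrer: str) -> Optional[str]:
    """Return the partner name if wprov (and referrer, where expected) match."""
    entry = PARTNER_BY_WPROV.get(wprov)
    if entry is None:
        return None
    partner, hosts = entry
    if not hosts:
        return partner  # no referrer expected, wprov alone decides
    host = referrer_host(referrer)
    if any(host == h or host.endswith("." + h) for h in hosts):
        return partner
    return None  # wprov present but referrer does not match: likely copied URL


def count_pageviews(rows: Iterable[Tuple[str, str]]) -> Counter:
    """Count attributed pageviews per partner from (wprov, referrer) pairs."""
    return Counter(p for p in (attribute(w, r) for w, r in rows) if p)


rows = [
    ("expa1", "https://www.partner-a.example/article"),
    ("expa1", "https://scraper.example/"),
    ("expb1", ""),
    ("zzzz", "https://partner-a.example/"),
]
print(count_pageviews(rows))
```

Only the first and third rows are counted: the second carries a partner code with a mismatched referrer, and the fourth uses an unmapped code.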

Thanks @Miriam et al for your review and detailed notes! It's helpful to know that wprov is a viable solution for measuring impact. Confirming the intention is that being able to measure impact would always be a pre-requisite for our IR2 interventions.

Miriam changed the task status from Stalled to In Progress. May 8 2026, 9:06 PM
Miriam updated the task description.
Miriam moved this task from In Progress to Defined on the Metrics-Sprint-2026-2027 board.

Some updates on the Target discussion.
TL;DR: we are landing around a target of 100M pageviews as marginal gain obtained via non-search interventions.

Rationale:

  • We looked at an estimate of candidate non-search interventions for FY27, ranked them by likelihood that we will partner with each platform, and attached an estimate of pageview lift based on existing data. (yields a ~160M pageview target)
  • We also looked at the ratio of search to non-search traffic (70:30) and worked out an estimate for IR2 targets by applying the ratio to IR1 targets. (yields a ~210M pageview target, an upper bound given that we don't know how much of the non-search traffic is actually third parties that don't send us a referrer)
  • Given the number of unknowns in this scenario, we are using a lower-end estimate of a 100M PV/year gain (this is 0.29%, or one day, of all non-search traffic for the year) as the target.

Next steps:

  • @RHo reviews the target estimates with @SCherukuwada
  • Work out the KR targets for Q2 to be around 10-20% of the total (to account for interventions that are in flight or not yet implemented by partners).
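
The target arithmetic above can be sanity-checked with a few lines of Python. This sketch uses only the figures quoted in this comment (100M target, 0.29% share, 10-20% for Q2); the annual and daily non-search volumes are implied by those figures, not measured, and the variable names are hypothetical.

```python
TARGET_PV = 100_000_000          # annual marginal-gain target from this task
SHARE_OF_NON_SEARCH = 0.0029     # "0.29% of all non-search traffic for the year"

# Annual non-search volume implied by the two figures above (not measured).
implied_non_search_per_year = TARGET_PV / SHARE_OF_NON_SEARCH
implied_non_search_per_day = implied_non_search_per_year / 365

# Q2 KR range at 10-20% of the annual target.
q2_low = TARGET_PV * 10 // 100
q2_high = TARGET_PV * 20 // 100

print(f"implied non-search PV/year: {implied_non_search_per_year:,.0f}")
print(f"implied non-search PV/day:  {implied_non_search_per_day:,.0f}")
print(f"Q2 KR range: {q2_low:,} to {q2_high:,}")
```

Under these figures the implied daily volume comes out a little under 100M, so "one day" is a rounded description, and the Q2 KR range works out to 10M-20M pageviews.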