
PoC track Getting started notification CTR with xLab
Closed, Declined | Public | 3 Estimated Story Points

Description

See parent task for context. The scope of this task is to make use of xLab tooling to track the CTR for one of the notifications GrowthExperiments sends to newcomers, specifically the Getting started one.

Acceptance criteria

  • Events are logged through the existing mediawiki.product_metrics.growth_product_interaction and /analytics/product_metrics/web/base/latest
  • Getting started notification CTR is computed and shown in some conventional chart

Open questions:

  • What's the conventional tool for keeping a chart of a long-lived product health metric captured through xLab?

Event Timeline

KStoller-WMF set the point value for this task to 2. Jul 21 2025, 4:04 PM

What's the conventional tool to keep a chart for a long-lived product health metric captured through xLab?

Grafana, with a time-series chart.
cc @Sgs

Cyndymediawiksim changed the point value for this task from 2 to 3. Jul 24 2025, 3:33 PM

Updating this to 3 due to increased scope

Change #1173375 had a related patch set uploaded (by Cyndywikime; author: Cyndywikime):

[mediawiki/extensions/WikimediaEvents@master] [WIP]:Add GE notification CTR tracking instrumentation

https://gerrit.wikimedia.org/r/1173375

Change #1173396 had a related patch set uploaded (by Cyndywikime; author: Cyndywikime):

[operations/mediawiki-config@master] Add GetStartedNotification experiment

https://gerrit.wikimedia.org/r/1173396

Change #1173982 had a related patch set uploaded (by Cyndywikime; author: Cyndywikime):

[mediawiki/extensions/GrowthExperiments@master] Add notification tracking for Growth Experiments Echo notifications

https://gerrit.wikimedia.org/r/1173982

I'm looking at different ways to instrument Echo notifications and here's what I could collect:

  • User notifications (alerts & messages) are initially collected in Echo from onSkinTemplateNavigation__Universal so that the two sections can be added with the right counters. There's no per-category data, but it would be relatively easy to compute that too. On the other hand, this all happens on the server, and I don't think we should count seeing the counter bubble as an impression, even if the count includes a GE notification.
  • The most standard impression would come from a user clicking on the messages icon and the notifications popup/overlay showing. At that point we could record an impression for each GE notification shown including the category (get started, re-engage, keep going). The only available pinch point I could find without modifying Echo is to use the ext.echo.popup.onInitialize hook fired when the popup initializes.
  • For counting clicks there are three entry points: primary link, secondary link, mark unseen button. I believe we should be able to calculate a separate CTR for each: CTRp, CTRs, CTRu. For the scope of this task it would be fine to get one, CTRp. There aren't any category dedicated CSS classes or HTML attributes that we can use to instrument only the relevant GE notifications so we can add them or rely on whatever the Echo API returns when it queries for notifications.
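As a rough illustration of the three rates mentioned above, here is a minimal sketch that derives CTRp, CTRs and CTRu from a flat list of logged events. The event shape (`action`, `entryPoint`) and the entry point names are assumptions made up for this example; they are not the actual xLab or Echo schema.

```javascript
// Hypothetical event shape, for illustration only:
//   { action: 'impression' }
//   { action: 'click', entryPoint: 'primary' | 'secondary' | 'mark_unseen' }
function computeNotificationCtr( events ) {
	let impressions = 0;
	const clicks = { primary: 0, secondary: 0, mark_unseen: 0 };

	for ( const event of events ) {
		if ( event.action === 'impression' ) {
			impressions += 1;
		} else if (
			event.action === 'click' &&
			Object.prototype.hasOwnProperty.call( clicks, event.entryPoint )
		) {
			clicks[ event.entryPoint ] += 1;
		}
	}

	// With no impressions the rate is undefined, so report null instead of NaN.
	const rate = ( count ) => ( impressions === 0 ? null : count / impressions );

	return {
		CTRp: rate( clicks.primary ),
		CTRs: rate( clicks.secondary ),
		CTRu: rate( clicks.mark_unseen )
	};
}
```

All three rates share the same denominator here (popup impressions of the notification), which is one possible definition; a per-user or per-session CTR would need the events grouped first.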

In conclusion, I think that all notification instrumentation fits better on the client. In a second iteration of this work it would be interesting to provide a more standard way to instrument notifications from Echo rather than from GE, for instance using some declarative API in the extension.json config.

Sgs updated the task description.

Some general thoughts:

  • Calculating a global or aggregated CTR across all the entry points notifications have in MW seems like a big project. I didn't think of this before we chose notifications as our product use case for this hypothesis. Users could see them through Echo's menu (which is what we're instrumenting right now), but also through Special:Notifications directly or via email, and probably through others I'm missing. Since the global computation is more complex, I think it is wise to leave it out of the scope of this task.
  • It's not strictly necessary that we use mediawiki.product_metrics.growth_product_interaction, as using product_metrics.web_base seems to be just fine, and it would match the config created in https://mpic.wikimedia.org/read/growth-experiments-getting-started-ctr. The config is very similar in that it drops user-agent collection, but it has some pre-selected provided values, like performer_active_browsing_session_token, that may come in handy if we use the ClickThroughRateInstrument. However, that won't get us any automated dashboard, so we still need to figure out how to query the data and plot a chart. There's a snippet in the guide Clickthrough_Rate#Clickthrough_per_user that could be a starting point. There's also a metric catalog in xLab that may contain useful query templates, but I'm not sure how much Growth engineers should invest in this.

We're learning that calculating a proper CTR is complex enough to benefit from xLab's tooling, but we need to invest resources in learning and setting up Superset dashboards or similar to be able to monitor these product health signals. The suggestion from management is that we wait until @mpopov is back from his sabbatical, since he's the person best equipped to help us think about something like automated long-lived metric analysis. Based on this, my suggestion is that we move as quickly as we can to the second use case of the hypothesis, that is, creating an experiment with xLab (T401308), so we don't get stuck on the long-lived metric analysis. cc @Michael @KStoller-WMF

Echo notifications menu and ClickThroughRateInstrument issues

Echo menu item clicks stop propagation
Echo notifications are built as menu item widgets, which have several features such as default actions and the bundled notification variant that collapses/expands several notifications. For whatever reason, Echo's markup for notifications nests the secondary link inside the primary link, and a fix from some years ago introduced e.stopPropagation() to prevent some undesired actions when clicking on the nested link. As a result, a click on the secondary link does not trigger the ClickThroughRateInstrument click listener, because the event does not bubble. We've somewhat worked around that by using useCapture = true and listening in the capture phase, but that has revealed a second issue: clicking on the innermost link also triggers a click event on the outermost.

ClickThroughRateInstrument can submit several interactions for a single click
The ClickThroughRateInstrument click listener iterates through the state entries and uses the check stateEntry.element.contains( event.target ) to decide whether to submit the interaction, which is true in both cases. This makes it trigger twice for click events on the innermost link, recording an additional undesired event. We are exploring how to fix this, and whether it would make sense for ClickThroughRateInstrument to handle the nested elements use case and record only a single click interaction per document click rather than potentially recording several, or to use the order of the state entries as an indicator of which interface should trigger the interaction first, something like:

document.addEventListener( 'click', ( event ) => {
	const iterator = state.values();
	for ( const stateEntry of iterator ) {
		// Note well that e.contains( e ) returns true. This handles the simple case where the event
		// target is an element that is being tracked by the instrument.
		if ( stateEntry.element.contains( event.target ) ) {
			submitInteraction( stateEntry, 'click' );
			break;
		}
	}
}, true );

This seems common enough to add by default but, on the other hand, it could record clicks that do nothing afterwards. Another alternative would be to set a listener per element and allow passing a capture flag where necessary. Any thoughts? cc @phuedx @Michael

ClickThroughRateInstrument can submit several interactions for a single click
The ClickThroughRateInstrument click listener iterates through the state entries and uses the check stateEntry.element.contains( event.target ) to decide whether to submit the interaction, which is true in both cases. This makes it trigger twice for click events on the innermost link, recording an additional undesired event. We are exploring how to fix this, and whether it would make sense for ClickThroughRateInstrument to handle the nested elements use case and record only a single click interaction per document click rather than potentially recording several, or to use the order of the state entries as an indicator of which interface should trigger the interaction first, something like:
<snip />
This seems common enough to add by default but, on the other hand, it could record clicks that do nothing afterwards. Another alternative would be to set a listener per element and allow passing a capture flag where necessary. Any thoughts?

I see the problem, but I'm not convinced the fix is to stop processing once we've sent one event, because there's no guarantee of the order in which the developer will call start() to start tracking clicks on elements. In your case, you'd want to track the secondary link first because the primary link contains the secondary link. I can also imagine more general cases where a third-party WMF developer wants to track clicks on a parent element that you're also tracking.
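The order dependence described above can be reproduced without a browser. The sketch below uses a tiny stand-in for DOM elements (`makeNode` is a hypothetical helper, not part of the instrument) and a simplified version of the "break after the first match" loop; it only illustrates the argument and is not the ClickThroughRateInstrument code.

```javascript
// Minimal stand-in for a DOM element: a parent link plus contains(),
// which, like Node.contains(), is true for the node itself.
function makeNode( name, parent = null ) {
	return {
		name,
		parent,
		contains( other ) {
			for ( let node = other; node; node = node.parent ) {
				if ( node === this ) {
					return true;
				}
			}
			return false;
		}
	};
}

// Simplified "stop at the first matching entry" strategy.
// entries are iterated in the order they were registered.
function firstMatchingEntry( entries, target ) {
	for ( const entry of entries ) {
		if ( entry.element.contains( target ) ) {
			return entry.name;
		}
	}
	return null;
}

const primaryLink = makeNode( 'primary' );
const secondaryLink = makeNode( 'secondary', primaryLink );

const primaryFirst = [
	{ name: 'primary', element: primaryLink },
	{ name: 'secondary', element: secondaryLink }
];
const secondaryFirst = [ primaryFirst[ 1 ], primaryFirst[ 0 ] ];

// A click on the nested secondary link is attributed differently
// depending only on registration order.
console.log( firstMatchingEntry( primaryFirst, secondaryLink ) ); // primary
console.log( firstMatchingEntry( secondaryFirst, secondaryLink ) ); // secondary
```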

I wonder if there's a need for a flag that the developer can set to change the stateEntry.element.contains( event.target ) check? For example:

WikimediaEvents/modules/ext.wikimediaEvents.xLab/ClickThroughRateInstrument.js
	for ( const stateEntry of state.values() ) {
		// Exact match: the tracked element itself was clicked.
		const isTarget = stateEntry.element === target;

		if ( isTarget ) {
			submitInteraction( stateEntry, target );

			continue;
		}

		// Opt-in match: the click landed on a descendant of the tracked element.
		const hasTargetAsDescendant = stateEntry.element.contains( target );

		if ( stateEntry.shouldTrackDescendants && hasTargetAsDescendant ) {
			submitInteraction( stateEntry, target );
		}
	}

We could introduce a new, well-named method that also sets the shouldTrackDescendants flag on the state entry.

Sgs moved this task from Code Review to Doing on the Growth-Team (Current Sprint) board.
Sgs added a subscriber: Cyndymediawiksim.

Thanks for looking into this. The problem with the check const isTarget = stateEntry.element == target; is that it won't work for clicks on inner elements of the element being instrumented. I agree that not tracking descendants makes sense as a default, but so far the only way I can think of to make the listener not duplicate events with Echo's markup is to strengthen the "contains" check to "contains, and no other state entry contains it". I'm still thinking about leaner ways to get this working. For now we're only instrumenting the primary link.
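One way to read the strengthened check is "attribute the click to the innermost tracked element that contains the target". A minimal sketch of that idea follows; `findInnermostEntry` and `makeNode` are hypothetical names used only for illustration, and the entries are plain objects rather than the instrument's real state entries.

```javascript
// Minimal stand-in for a DOM element, mirroring Node.contains() semantics
// (a node contains itself and all of its descendants).
function makeNode( name, parent = null ) {
	return {
		name,
		parent,
		contains( other ) {
			for ( let node = other; node; node = node.parent ) {
				if ( node === this ) {
					return true;
				}
			}
			return false;
		}
	};
}

// Returns the entry whose element is the closest tracked ancestor of target
// (or target itself), or null if no tracked element contains target.
// The result does not depend on the order of entries.
function findInnermostEntry( entries, target ) {
	let best = null;
	for ( const entry of entries ) {
		if ( !entry.element.contains( target ) ) {
			continue;
		}
		// Every matching element is an ancestor of target, so the matches are
		// nested in one another; keep the one that sits inside the current best.
		if ( best === null || best.element.contains( entry.element ) ) {
			best = entry;
		}
	}
	return best;
}
```

With Echo's nested markup, a click inside the secondary link would resolve to the secondary entry only, and a click elsewhere in the primary link would resolve to the primary entry, whichever was registered first. The cost is that a parent element never receives clicks made on a tracked descendant, which may not suit every use case.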

Change #1173982 merged by jenkins-bot:

[mediawiki/extensions/GrowthExperiments@master] Add notification tracking for Growth Experiments Echo notifications

https://gerrit.wikimedia.org/r/1173982

Change #1182870 had a related patch set uploaded (by Sergio Gimeno; author: Sergio Gimeno):

[operations/mediawiki-config@master] [Growth] beta: enable growth notifications tracking

https://gerrit.wikimedia.org/r/1182870

Change #1182870 merged by jenkins-bot:

[operations/mediawiki-config@master] [Growth] beta: enable growth notifications tracking

https://gerrit.wikimedia.org/r/1182870

Change #1173375 abandoned by Sergio Gimeno:

[mediawiki/extensions/WikimediaEvents@master] [WIP]:Add GE notification CTR tracking instrumentation

Reason:

For now the instrument code will live in GrowthExperiments, done in I6ce5f1787aa7bbb903014a7143487fcd7ff1be3d

https://gerrit.wikimedia.org/r/1173375

The instrument code is already available and enabled in the Beta cluster, and a matching configuration exists in mpic.wikimedia.org/read/growth-experiments-getting-started-ctr. That means we're moving from the Launch step to the Monitor step in the Experimentation_Lab/Measure_product_health guide. As I understand the flow, we're one click away from starting to collect the data in Hive.

The next step is to discuss with @mpopov and @Iflorez what a fair query plan for this data would be, and to plot it in a Superset dashboard. xLab has Automated_analysis_of_experiments, and we'd be looking for a similar approach in which we can plot the CTR for each notification GrowthExperiments sends to newcomers.

Note about the secondary link: we're still discussing the best approach for it (see T400048#11079207), hence the existing instrument covers primary links only. We expect that what we learn from plotting the primary link CTR will make adding the secondary CTR easy.

Moving this to blocked until the discussion with Mikhail and Irene happens and Sam and I figure out the secondary link thing.

Change #1183712 had a related patch set uploaded (by Sergio Gimeno; author: Sergio Gimeno):

[mediawiki/extensions/GrowthExperiments@master] NotificationsTracking: track secondary link

https://gerrit.wikimedia.org/r/1183712

Change #1183712 merged by jenkins-bot:

[mediawiki/extensions/GrowthExperiments@master] NotificationsTracking: track secondary link

https://gerrit.wikimedia.org/r/1183712

Change #1184824 had a related patch set uploaded (by Sergio Gimeno; author: Sergio Gimeno):

[mediawiki/extensions/WikimediaEvents@master] [Poc] ClickThroughRateInstrument: add .create() factory method

https://gerrit.wikimedia.org/r/1184824

After discussing with the Experimentation Platform team the options for building an automated dashboard for the CTR metric in Superset, the conclusion was that building something custom in MW/xLab to fit this use case would take 3-5 weeks of work, as estimated by @mpopov, and that the work would be redundant with EP's upcoming plans to adopt growthbook. That's because Superset lacks the concept of a metric catalog, so each monitored metric would require a new tab, manual setup, etc.

Since Growth already has a KPI dashboard in Grafana, the effort to build something similar at this point does not seem worthwhile; the team will wait for EP to work on the growthbook integration.

Given the above, I'm reverting/removing the related changes and mpic configs.

Change #1187769 had a related patch set uploaded (by Sergio Gimeno; author: Sergio Gimeno):

[mediawiki/extensions/GrowthExperiments@master] analytics: remove long-lived instrument

https://gerrit.wikimedia.org/r/1187769

Change #1187769 merged by jenkins-bot:

[mediawiki/extensions/GrowthExperiments@master] analytics: remove long-lived instrument

https://gerrit.wikimedia.org/r/1187769

The relevant code has been removed. Declining this until Growthbook integration happens in xLab.

Change #1173396 merged by jenkins-bot:

[operations/mediawiki-config@master] beta(Growth,MetricsPlatform): add notification experiment config and enable

https://gerrit.wikimedia.org/r/1173396