Verify instrument configuration with xLab
Closed, Resolved · Public · 2 Estimated Story Points

Description

Before we promise product teams that they can configure their instruments with xLab UI, we need to make sure that an instrument deployed to production and configured with xLab UI works as it should.

We can continue to build on pageVisit.js and instrument pageviews outside of the synthetic A/A tests we're running.

Instrumentation specification

Metric: Page visits
Event to be tracked: Page visited
Interaction data: action: page-visited

Instrument configuration

Instrument name (machine-readable name): Page Visits Demo (page-visits-demo)
Start date: 2025-06-23
End date: 2025-06-25
Schema: Web base
Stream: Web base (default)

Sampling

Sampling unit: Pageview

Sampling rates:

  • 1.0 (100%) on Test (test.wikipedia.org)
  • 0.001 (0.1%) on English Wikipedia
  • 0.01 (1%) everywhere else
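
As a rough illustration of how these rates resolve per wiki, here is a minimal sketch. The lookup table, the function names and the database names (`testwiki`, `enwiki`) are assumptions made for the example; they are not taken from xLab or the client library.

```javascript
// Hypothetical lookup table mirroring the sampling rates listed above.
// The database names used as keys are assumptions for illustration only.
const SAMPLE_RATES = {
	testwiki: 1.0, // Test (test.wikipedia.org)
	enwiki: 0.001 // English Wikipedia
};

// Rate applied to every wiki that has no explicit entry.
const DEFAULT_SAMPLE_RATE = 0.01;

// Returns the sampling rate for a wiki, falling back to the default.
function sampleRateFor( dbName ) {
	return Object.prototype.hasOwnProperty.call( SAMPLE_RATES, dbName ) ?
		SAMPLE_RATES[ dbName ] :
		DEFAULT_SAMPLE_RATE;
}

// Expected number of sampled pageviews out of `pageviews` total,
// given that the sampling unit is the pageview.
function expectedSampled( dbName, pageviews ) {
	return pageviews * sampleRateFor( dbName );
}
```

With these rates, roughly one in a thousand English Wikipedia pageviews would be in sample, which is worth keeping in mind when checking that events arrive from low-traffic wikis during a three-day window.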

Contextual attributes

NOTE: We want the list of contextual attributes to be different from the currently configured web base stream. We are excluding performer_active_browsing_session_token and including mediawiki_version. The other contextual attributes are the same. This will allow us to verify that the client library respects the xLab configuration and that for this instrument it overrides the configuration of the stream to which the events will be submitted.
  • agent_client_platform
  • agent_client_platform_family
  • mediawiki_database
  • mediawiki_skin
  • performer_is_logged_in
  • performer_is_temp
  • performer_pageview_id
  • mediawiki_version

Data collection risk assessment

Per data collection guidelines: Low risk

Open questions

Question 1: If

const pageVisitInstrument = mw.eventLog.newInstrument( "page-visits-demo" );

then would pageVisitInstrument.submitInteraction( "page-visited" ) produce an event where instrument_name is set to "Page Visits Demo" (the name of the instrument as configured in xLab)?

If that's the expectation, we should add it to AC.

Acceptance criteria

  • An instrument is deployed to production, but is not collecting data until it has been configured and activated in xLab
  • Events flowing into the product_metrics.web_base stream from this instrument have all the listed contextual attributes and none of the unlisted contextual attributes
    • especially mediawiki_version which is not included in the base stream configuration, but is included here
    • not performer_active_browsing_session_token which is included in the base stream configuration, but is not included here
  • Events are queryable from the event.product_metrics_web_base table
  • There are events from all production wikis
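
The contextual attribute criteria above could be checked per event with something like the following sketch. It assumes that an attribute such as `performer_is_temp` appears in the event as `event.performer.is_temp` (split at the first underscore); that mapping and the helper names are assumptions for illustration, not a confirmed description of the schema.

```javascript
// Contextual attributes listed for this instrument in the task description.
const LISTED = [
	'agent_client_platform',
	'agent_client_platform_family',
	'mediawiki_database',
	'mediawiki_skin',
	'performer_is_logged_in',
	'performer_is_temp',
	'performer_pageview_id',
	'mediawiki_version'
];

// Configured on the base stream but deliberately not listed for this instrument.
const UNLISTED = [ 'performer_active_browsing_session_token' ];

// Assumption: "performer_is_temp" is stored as event.performer.is_temp.
function hasAttribute( event, name ) {
	const i = name.indexOf( '_' );
	const group = event[ name.slice( 0, i ) ];
	return typeof group === 'object' && group !== null &&
		name.slice( i + 1 ) in group;
}

// Reports which listed attributes are missing and which unlisted ones leaked in.
function checkEvent( event ) {
	return {
		missing: LISTED.filter( ( name ) => !hasAttribute( event, name ) ),
		unexpected: UNLISTED.filter( ( name ) => hasAttribute( event, name ) )
	};
}
```

An event passes when both `missing` and `unexpected` come back empty.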

Details

Related Changes in Gerrit:
Related Changes in GitLab:
Title: Update instruments endpoint and contextual attributes
Reference: repos/data-engineering/test-kitchen!211
Author: cjming
Source Branch: T397363/update-instruments
Dest Branch: main

Event Timeline

mpopov triaged this task as High priority.
mpopov updated the task description.
mpopov updated the task description.

Hi @phuedx @Sfaci - as I dig into the JS client and root around in the EventLogging extension, I'm deducing that the MetricsClient used by Instrument gets its stream configs through this chain:

  • initMetricsClient() instantiates the MetricsClient by calling newMetricsClient() with stream configs from the EventLogging hook getModuleData
  • getModuleData calls getEventLoggingConfig()
  • getEventLoggingConfig() calls loadEventStreamConfigs()
  • loadEventStreamConfigs() calls the EventLogging.StreamConfigs service
  • the EventLogging.StreamConfigs service gets stream configs from the EventStreamConfig.StreamConfigs service
  • the EventStreamConfig.StreamConfigs service gets stream configs from StreamConfigsFactory

It looks like in StreamConfigsFactory::getInstance():

public function getInstance(): StreamConfigs {
	$streamConfigs = [];
	$this->hookRunner->onGetStreamConfigs( $streamConfigs );

	$streamConfigs = array_merge(
		$streamConfigs,
		$this->options->get( 'EventStreams' )
	);

	return new StreamConfigs(
		$streamConfigs,
		$this->options->get( 'EventStreamsDefaultSettings' ),
		$this->logger
	);
}

the array_merge() function overrides whatever is provided by the onGetStreamConfigs() hook (InstrumentConfigsFetcher) with what is set by static config.
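
To make the override concrete, here is a small JavaScript analogue. PHP's array_merge() is shallow and, for string keys, the later array's value wins; object spread behaves the same way. The config values below are trimmed-down illustrations, not the real hook output or production config.

```javascript
// Illustrative stand-in for what the onGetStreamConfigs() hook provides (xLab).
const fromHook = {
	'product_metrics.web_base': {
		producers: {
			metrics_platform_client: {
				provide_values: [ 'performer_pageview_id', 'mediawiki_version' ]
			}
		}
	}
};

// Illustrative stand-in for the static 'EventStreams' config.
const fromStaticConfig = {
	'product_metrics.web_base': {
		producers: {
			metrics_platform_client: {
				provide_values: [
					'performer_pageview_id',
					'performer_active_browsing_session_token'
				]
			}
		}
	}
};

// Like array_merge() with string keys: shallow, and the later operand wins.
// The whole 'product_metrics.web_base' entry from the hook is replaced,
// so its provide_values list never reaches the client.
const merged = { ...fromHook, ...fromStaticConfig };

const mergedAttributes =
	merged[ 'product_metrics.web_base' ].producers.metrics_platform_client.provide_values;

console.log( mergedAttributes.includes( 'mediawiki_version' ) );
```

Because the merge happens at the top-level stream key, nothing nested under the hook's entry survives, which matches the behaviour described above.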

Here's my stream config provided by my local xLab (where I selected contextual attributes) before array_merge:

Screenshot 2025-07-01 at 11.55.27 PM.png (1×3 px, 912 KB)

And here are stream configs after array_merge (note overridden numeric keys):

Screenshot 2025-07-01 at 11.55.39 PM.png (1×3 px, 897 KB)

For reference, here is my local producer config which closely matches prod config (minus the mediawiki attributes):

'product_metrics.web_base' => [
	'schema_title' => 'analytics/product_metrics/web/base',
	'producers' => [
		'metrics_platform_client' => [
			'provide_values' => [
				'agent_client_platform',
				'agent_client_platform_family',
				'performer_is_logged_in',
				'performer_is_temp',
				'performer_pageview_id',
				'performer_active_browsing_session_token',
			],
		],
	],
],

When I was testing this locally, I got events validated for a new PageVisit instrument:

const pageVisitInstrument = mw.eventLog.newInstrument(
	"page-visits-demo",
	"product_metrics.web_base",
	"/analytics/product_metrics/web/base/1.4.2"
);

pageVisitInstrument.submitInteraction(
	"page-visited",
	{ action_context: 'test instrument via xLab'}
);

But as confirmed by the debugging screenshots above, the event only carried the contextual attributes defined by my local $wgEventStreams config:

Screenshot 2025-07-02 at 12.41.07 AM.png (1×2 px, 254 KB)

Actually - I didn't get performer_is_temp - not sure why - need to investigate that more.

Anyway, my question is: what is the best approach for making xLab config supersede hard-coded config? IIRC there was some discussion about whether hard-coded config should always take precedence. Is this still the case? If an instrument owner wants to override the default contextual attributes set for product_metrics.web_base, do we want to allow for this?
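
Purely to illustrate one possible answer, the sketch below keeps static config as the base and lets an xLab-provided provide_values list replace the static one for the same stream. The helper name and config shapes are hypothetical, and the sketch ignores that several instruments may share one stream, so treat it as a talking point rather than a proposal for the actual merge logic.

```javascript
// Hypothetical helper: static config stays the base, but when xLab supplies
// provide_values for a stream, that list replaces the static one.
// Streams that only exist in xLab config are ignored in this sketch.
function applyXLabOverrides( staticConfigs, xLabConfigs ) {
	const result = {};
	for ( const [ stream, config ] of Object.entries( staticConfigs ) ) {
		const override =
			xLabConfigs[ stream ]?.producers?.metrics_platform_client?.provide_values;

		if ( !Array.isArray( override ) ) {
			// No xLab override for this stream: keep static config untouched.
			result[ stream ] = config;
			continue;
		}

		// Copy each level so the static config object is not mutated.
		result[ stream ] = {
			...config,
			producers: {
				...config.producers,
				metrics_platform_client: {
					...( config.producers && config.producers.metrics_platform_client ),
					provide_values: [ ...override ]
				}
			}
		};
	}
	return result;
}
```

Everything other than provide_values (for example schema_title) would still come from static config under this approach.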

Change #1167322 had a related patch set uploaded (by Clare Ming; author: Clare Ming):

[operations/deployment-charts@master] xLab: Deploy v0.7.8 release to staging

https://gerrit.wikimedia.org/r/1167322

Change #1167322 merged by jenkins-bot:

[operations/deployment-charts@master] xLab: Deploy v0.7.8 release to staging

https://gerrit.wikimedia.org/r/1167322

Change #1171319 had a related patch set uploaded (by Clare Ming; author: Clare Ming):

[operations/deployment-charts@master] xLab: Deploy v0.7.9 release to staging

https://gerrit.wikimedia.org/r/1171319

Change #1171319 merged by jenkins-bot:

[operations/deployment-charts@master] xLab: Deploy v0.7.9 release to staging

https://gerrit.wikimedia.org/r/1171319

Milimetric set the point value for this task to 2. (Sep 25 2025, 6:03 PM)