
Modify the JS Client to be used on non-MW-powered sites
Closed, Resolved (Public)

Description

As written, the JS Metrics Platform Client (MPC) doesn't work out of the box, which stops WMF teams from adopting the Metrics Platform on sites that aren't powered by MediaWiki.

TODO

Event Timeline

Implementation plan:

  1. Update the MetricsPlatform constructor function so that it no longer takes stream configs as its second parameter
  2. Update MetricsPlatform to fetch stream configs via Integration::fetchStreamConfigs() at a regular, configurable interval
  3. Create DefaultIntegration, which can:
    • Fetch stream configs from a wiki with EventStreamConfig installed (URL configurable)
    • Send events to an EventGate instance (URL configurable)
    • Accept values for contextual attributes
  4. Update the MPC and integration in MediaWiki-extensions-EventLogging to use the stream configs that are already available
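
A minimal sketch of how steps 2 and 3 of the plan might fit together, assuming an integration object that is handed to the client. The names `buildStreamConfigsUrl`, `createDefaultIntegration`, `createClient`, `fetchFn`, and `refreshIntervalMs`, the `/v1/events` path, and the `{ streams: … }` response shape are all assumptions made for illustration and are not the library's actual API.

```javascript
// Hypothetical sketch: none of these names are the library's actual API.

// Builds the stream configs API URL for a wiki with EventStreamConfig installed.
function buildStreamConfigsUrl( origin ) {
    const params = new URLSearchParams( {
        action: 'streamconfigs',
        format: 'json',
        formatversion: '2',
        all_settings: '1'
    } );
    return `${ origin }/w/api.php?${ params }`;
}

// fetchFn is injectable so that the sketch can be exercised without a network.
function createDefaultIntegration( { streamConfigsOrigin, eventGateOrigin, fetchFn } ) {
    const doFetch = fetchFn || fetch;
    let contextAttributes = {};
    return {
        fetchStreamConfigs() {
            return doFetch( buildStreamConfigsUrl( streamConfigsOrigin ) )
                .then( ( response ) => response.json() )
                // Assumption: the API responds with { streams: { name: config } }.
                .then( ( json ) => json.streams );
        },
        enqueueEvent( event ) {
            // Assumption: the EventGate instance accepts POSTs at /v1/events.
            return doFetch( `${ eventGateOrigin }/v1/events`, {
                method: 'POST',
                body: JSON.stringify( event )
            } );
        },
        setContextAttributes( attributes ) {
            contextAttributes = { ...contextAttributes, ...attributes };
        },
        getContextAttributes() {
            return contextAttributes;
        }
    };
}

function createClient( integration, refreshIntervalMs ) {
    let streamConfigs = null;
    const queue = [];

    function send( streamName, eventData ) {
        // Drop events for streams that are not configured.
        if ( !streamConfigs[ streamName ] ) {
            return;
        }
        integration.enqueueEvent( {
            ...integration.getContextAttributes(),
            ...eventData,
            meta: { stream: streamName }
        } );
    }

    function refresh() {
        return integration.fetchStreamConfigs()
            .then( ( configs ) => {
                streamConfigs = configs;
                // Flush everything that was submitted before the configs arrived.
                queue.splice( 0 ).forEach( ( [ name, data ] ) => send( name, data ) );
            } )
            // On failure, keep the previous configs and wait for the next refresh.
            .catch( () => {} );
    }

    const ready = refresh();
    const timer = refreshIntervalMs > 0 ? setInterval( refresh, refreshIntervalMs ) : null;

    return {
        ready,
        submit( streamName, eventData ) {
            if ( streamConfigs ) {
                send( streamName, eventData );
            } else {
                queue.push( [ streamName, eventData ] );
            }
        },
        stop() {
            clearInterval( timer );
        }
    };
}
```

Keeping the network calls behind the integration means the client itself only has to know about queueing and refreshing, which is roughly the split the plan describes.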

Change 845603 had a related patch set uploaded (by Phuedx; author: Phuedx):

[mediawiki/libs/metrics-platform@master] [JS] Periodically fetch and update stream configs

https://gerrit.wikimedia.org/r/845603

Change 848436 had a related patch set uploaded (by Phuedx; author: Phuedx):

[mediawiki/libs/metrics-platform@master] [JS] Add DefaultIntegration

https://gerrit.wikimedia.org/r/848436

Moving back to In Progress while I work on an example script.

I'm meeting with @Jdrewniak tomorrow to discuss the Metrics Platform JS API and functionality when using it on, say, the portals.

Notes and links from my meeting with @Jdrewniak in December:

jdrewniak-sam-2022-12-08-notes.txt
import createClient from '@wikimedia/metrics-platform';

// Fetches stream configs from
// https://meta.wikimedia.org/w/api.php?action=streamconfigs&format=json&formatversion=2&all_settings=1&constraints=destination_event_service%3Deventgate-analytics-external
//
// Events submitted with MetricsClient.submit() or dispatched with MetricsClient.dispatch() will be queued until the stream configs are fetched.
const metricsPlatformClient = createClient(
    'https://he.wikipedia.org', // streamConfigsOrigin
    'https://intake-analytics.wikimedia.org' // eventGateOrigin
);

// Alternatively, pass the stream configs in directly:
//
// const metricsPlatformClient = createClient( /* streamConfigs */ {
//     // …
// } );

metricsPlatformClient.submit(
    'eventlogging_WikipediaPortal',
    {
        // …
    }
);

// --- Alternative

metricsPlatformClient.setContextAttributes( {
    'agent_platform_family': isMobile ? 'mobile_browser' : 'desktop_browser',
    'agent_platform': 'js',
} );

metricsPlatformClient.setSessionId( wmTest.sessionId );
metricsPlatformClient.setPageviewId( wmTest.pageviewId );

metricsPlatformClient.dispatch(
    'portal.landing',
    {
        destination: '…',
        referer: '…',
        country: '…'
    }
);

metricsPlatformClient.dispatch(
    'portal.change_language',
    {
        from: '…',
        to: '…'
    }
);

metricsPlatformClient.dispatch( 'init' );

Stream 1: 'portal.', 0.1 events per session
Stream 2: 'portal.landing' unsampled


Stream 3: 'portal.', 1% of sessions, provide session_id and pageview_id
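
One way the "1% of sessions" case could be decided on the client is to derive a stable number from the session ID and compare it with the stream's sample rate, so that a session is either always in or always out. This is only a sketch: `isSessionInSample` is a hypothetical helper, and it assumes the session ID starts with at least eight hexadecimal characters.

```javascript
// Hypothetical helper: decides whether a session falls into a stream's sample.
// Assumption: sessionId starts with at least 8 hexadecimal characters.
function isSessionInSample( sessionId, rate ) {
    // Map the first 32 bits of the ID onto [0, 1).
    const bucket = parseInt( sessionId.slice( 0, 8 ), 16 ) / 0x100000000;
    // A non-hex ID yields NaN, which compares false and so is never sampled.
    return bucket < rate;
}

isSessionInSample( '02000000abcdef', 0.01 ); // true: 0x02000000 / 2^32 = 0.0078125
isSessionInSample( '03000000abcdef', 0.01 ); // false: 0x03000000 / 2^32 = 0.01171875
isSessionInSample( 'ffffffffabcdef', 1 ); // true: a rate of 1 means unsampled
```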

// ---

metricsPlatformClient.tap(
    'eventlogging_WikipediaPortal',
    ( event ) => [ eventName, customData ]
);

Instrument submits event 1 to stream
Event is submitted
For each tap for the stream
    Run the tap function
    dispatch the result

The function would look like

  ( event ) => [ `portal.${event.event_type}`, { country: event.country } ]
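
The tap flow in the notes could be sketched roughly as follows. `createTappableClient` and the `submitted` and `dispatched` arrays are hypothetical and exist only so that the flow can be observed; a real client would send the events rather than record them.

```javascript
// Hypothetical sketch of the tap flow; not the library's actual API.
function createTappableClient() {
    const taps = new Map(); // stream name -> array of tap functions
    const submitted = [];
    const dispatched = [];

    function dispatch( eventName, customData ) {
        dispatched.push( [ eventName, customData ] );
    }

    function submit( streamName, event ) {
        // The event is submitted to its stream as usual...
        submitted.push( [ streamName, event ] );
        // ...then each tap for that stream maps it to a dispatch call.
        ( taps.get( streamName ) || [] ).forEach( ( tapFn ) => {
            const [ eventName, customData ] = tapFn( event );
            dispatch( eventName, customData );
        } );
    }

    function tap( streamName, tapFn ) {
        taps.set( streamName, [ ...( taps.get( streamName ) || [] ), tapFn ] );
    }

    return { tap, submit, dispatch, submitted, dispatched };
}

const tappedClient = createTappableClient();
tappedClient.tap(
    'eventlogging_WikipediaPortal',
    ( event ) => [ `portal.${ event.event_type }`, { country: event.country } ]
);
tappedClient.submit( 'eventlogging_WikipediaPortal', { event_type: 'landing', country: 'NL' } );
// tappedClient.dispatched is now [ [ 'portal.landing', { country: 'NL' } ] ]
```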

Change 878137 had a related patch set uploaded (by Phuedx; author: Phuedx):

[mediawiki/libs/metrics-platform@master] [JS] Add integration test

https://gerrit.wikimedia.org/r/878137

Change 845603 merged by jenkins-bot:

[mediawiki/libs/metrics-platform@master] [JS] Fetch stream configs from remote source

https://gerrit.wikimedia.org/r/845603

Change 848436 merged by jenkins-bot:

[mediawiki/libs/metrics-platform@master] [JS] Add DefaultIntegration

https://gerrit.wikimedia.org/r/848436

Change 878137 merged by jenkins-bot:

[mediawiki/libs/metrics-platform@master] [JS] Add integration test

https://gerrit.wikimedia.org/r/878137

Change 891306 had a related patch set uploaded (by Phuedx; author: Phuedx):

[mediawiki/libs/metrics-platform@master] [JS] Add release script

https://gerrit.wikimedia.org/r/891306

Change 891810 had a related patch set uploaded (by Phuedx; author: Phuedx):

[mediawiki/extensions/EventLogging@master] DNM: lib: Update lib/metrics-platform to 4d08d6e48ea

https://gerrit.wikimedia.org/r/891810

Hi,

FWIW, I don't think you'll be able to easily fit the existing mediawiki/client/metrics_event data model into non-MediaWiki use cases. The schema doc says:

[The schema defines...] The standard contextual attributes that can be recorded when a user performs an instrumented interaction with a MediaWiki instance

Metrics Platform was intentionally designed to be used to instrument actions by users on MediaWiki.

You probably will be able to use it, as there are very few required fields. But all of the downstream analytics tables (Hive, etc.) are going to have a lot of null columns related to MediaWiki.

If you want to use some of the Metrics Platform features (fancy stream config stuff?) but not for instrumenting MediaWiki client interactions, perhaps it would be prudent to design a new (subset) schema of mediawiki/client/metrics_event?

Thanks for following up on this.

I think that you're right and I think that we could dovetail making that change (having an explicit "external" (?) Metrics Client) with keeping the MC embedded in EventLogging specialised and as small as possible. I'll take a look at the schema, see what can be extracted, and report back with next steps here.

That said, I think that the release script patch can still be landed /cc @cjming.

Change 891306 merged by jenkins-bot:

[mediawiki/libs/metrics-platform@master] [JS] Add release script

https://gerrit.wikimedia.org/r/891306

Change 902017 had a related patch set uploaded (by Phuedx; author: Phuedx):

[mediawiki/libs/metrics-platform@master] [JS] Extract ExternalMetricsClient from MetricsClient

https://gerrit.wikimedia.org/r/902017

Change 902017 merged by jenkins-bot:

[mediawiki/libs/metrics-platform@master] [JS] Extract ExternalMetricsClient from MetricsClient

https://gerrit.wikimedia.org/r/902017

Change 891810 merged by jenkins-bot:

[mediawiki/extensions/EventLogging@master] lib: Update lib/metrics-platform to 3acba3b802

https://gerrit.wikimedia.org/r/891810

Change 905597 had a related patch set uploaded (by Phuedx; author: Phuedx):

[mediawiki/extensions/EventLogging@master] lib: Update lib/metrics-platform to 3d79569a35

https://gerrit.wikimedia.org/r/905597

Change 905597 merged by jenkins-bot:

[mediawiki/extensions/EventLogging@master] lib: Update lib/metrics-platform to 3d79569a35

https://gerrit.wikimedia.org/r/905597

I've verified that the following instruments are still submitting events on the Beta Cluster:

  • sessionTick
  • editAttemptStep
  • desktopWebUIActions
phuedx subscribed.
phuedx claimed this task.
phuedx updated the task description.