MPIC: Create Metrics Platform base stream configuration
Open, High, Public, 2 Estimated Story Points

T370880

Description

For all client code submitting events to Metrics Platform's monotable, we need the base stream configuration in production for both the web base and app base schemas.

'product_metrics.web_base' => [
    'schema_title' => 'analytics/product_metrics/web/base',
    'destination_event_service' => 'eventgate-analytics-external',
    'producers' => [
        'metrics_platform_client' => [
            'provide_values' => [
                'mediawiki_database',
                'mediawiki_skin',
                'page_id',
                'performer_is_logged_in',
                'performer_session_id',
                'performer_is_bot',
            ],
        ],
    ],
]

Acceptance Criteria

  • A consensus for the names of the web and app base streams
  • A consensus for the producer config (contextual attributes) of the base streams
  • Both web base and app base stream names point to their respective base schemas
  • Update the MPIC web/app base stream names to match the ones decided here, so that we no longer need to add them manually to the Action API response (review)

Required

  • Unit/Integration tests?
  • Documentation?
  • Passed QA?

Event Timeline

It's that naming time again - what should we call this stream?

I propose product_metrics.web_base and product_metrics.app_base

any naysayers? other suggestions?
@VirginiaPoundstone @WDoranWMF @phuedx @Sfaci

On naming, I agree with the names @cjming has suggested. They differ a bit from the ones we define for instruments, but I think they fit the naming convention we currently use.

Based on the second AC, it seems we should reach a consensus on which contextual attributes need to be added to the base streams. My first guess was that we should add, at least, those that are mandatory for the specific base schema the stream is based on. I mean:

But, at the same time, I don't know if it makes sense to add any contextual attributes to these static streams, because any instrument registered in MPIC will be converted into a sort of dynamic configuration with its own contextual attributes (along with the rest of the configuration it needs to work properly). So, will these "by default" contextual attributes be useful here?

Any thoughts? Please correct me if I'm wrong
@cjming @phuedx

This was my thinking too. I think it's reasonable to collect as little information as possible by default, to protect against the case where someone makes a configuration error.

In fact, I think I have to correct my own words. I mentioned above that we should add, at least, the required attributes for the Java client library and some others for the web one, but the reality is that the Java and PHP clients are going to add some required ones automatically, so it wouldn't make sense to configure them here.

I guess the only attribute we could add, just in case, is the one that is mandatory for the JS client library and is not added automatically for now. The rest of the required attributes will be added even if they are not configured explicitly in the stream config and, as we have mentioned, we think it's a good idea to keep the configuration as minimal as possible.
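To make the "keep it minimal" idea above concrete, here is a rough sketch of what two base stream entries with no default contextual attributes might look like. It is not the merged change: the variable name `$baseStreams` is hypothetical, the app schema title `analytics/product_metrics/app/base` is assumed by analogy with the web one, and the JS-required attribute mentioned above is left out because it is not named in this thread.

```php
<?php
// Hypothetical sketch, not the merged operations/mediawiki-config change.
// $baseStreams is an illustrative variable name only.
$baseStreams = [
    'product_metrics.web_base' => [
        'schema_title' => 'analytics/product_metrics/web/base',
        'destination_event_service' => 'eventgate-analytics-external',
        'producers' => [
            'metrics_platform_client' => [
                // Deliberately empty: attributes the clients add on their own
                // are not repeated here. The single JS-required attribute
                // discussed above would be listed here if it is needed.
                'provide_values' => [],
            ],
        ],
    ],
    'product_metrics.app_base' => [
        // Assumed schema title, mirroring the web one.
        'schema_title' => 'analytics/product_metrics/app/base',
        'destination_event_service' => 'eventgate-analytics-external',
        'producers' => [
            'metrics_platform_client' => [
                'provide_values' => [],
            ],
        ],
    ],
];
```

Instruments registered in MPIC would then bring their own contextual attributes through their dynamic configuration, as described above.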

Change #1074396 had a related patch set uploaded (by Santiago Faci; author: Santiago Faci):

[operations/mediawiki-config@master] Metrics Platform monotable: Base stream configuration

https://gerrit.wikimedia.org/r/1074396

@cjming @phuedx I have just realized that MPIC uses product_metrics.test_app_base and product_metrics.test_web_base as stream names when the user chooses the App/Web base schema type while filling in the Baseline instrument form. I think we should change that. Those stream names should be the same as the ones we are going to merge here, right? (We decided that their names would be product_metrics.app_base and product_metrics.web_base, without the test_ part.)

Yes. These should be constants in MPIC with inline documentation pointing to the stream definitions in operations/mediawiki-config.

Ok! Thanks!
I'll add a new AC here to cover that requirement and I'll prepare the proper change for MPIC to be aligned with the work done here

A change is already prepared for MPIC to use the new names for the web/app base streams: https://gitlab.wikimedia.org/repos/data-engineering/mpic/-/merge_requests/104

We'll keep that MR as a draft until the streams are merged in production, so that we can test that feature properly with MPIC. So far the base streams were added manually (they didn't exist) and, once they are published in production, we should get them via the Action API as we do for the custom ones.

Thanks for the details, @phuedx!

I have already scheduled the change to be merged

I will be there

I have rescheduled the change to be merged

I will be there

Change #1074396 merged by jenkins-bot:

[operations/mediawiki-config@master] Metrics Platform monotable: Base stream configuration

https://gerrit.wikimedia.org/r/1074396

Mentioned in SAL (#wikimedia-operations) [2024-10-02T08:10:12Z] <hashar@deploy2002> Started scap sync-world: Backport for [[gerrit:1074396|Metrics Platform monotable: Base stream configuration (T373967)]]

Mentioned in SAL (#wikimedia-operations) [2024-10-02T08:12:53Z] <hashar@deploy2002> hashar, sfaci: Backport for [[gerrit:1074396|Metrics Platform monotable: Base stream configuration (T373967)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)

Mentioned in SAL (#wikimedia-operations) [2024-10-02T08:20:39Z] <hashar@deploy2002> Finished scap sync-world: Backport for [[gerrit:1074396|Metrics Platform monotable: Base stream configuration (T373967)]] (duration: 10m 27s)

The base streams have already been published, so we can start using them with MPIC. We can also stop adding them manually to the Action API response when we need to show the active streams in the MPIC UI.

Regarding the above, there is a related change ready for review: https://gitlab.wikimedia.org/repos/data-engineering/mpic/-/merge_requests/104

  • Documentation?

Does this need to be documented anywhere?

@apaskulin is there any good place to put this in the documentation? Should we add it to the MPIC section?

Change #1080816 had a related patch set uploaded (by Clare Ming; author: Clare Ming):

[operations/deployment-charts@master] Metrics Platform Instrument Configuration: Deploying to staging

https://gerrit.wikimedia.org/r/1080816

Change #1080817 had a related patch set uploaded (by Clare Ming; author: Clare Ming):

[operations/deployment-charts@master] Metrics Platform Instrument Configuration: Deploying to production

https://gerrit.wikimedia.org/r/1080817

Change #1080817 merged by jenkins-bot:

[operations/deployment-charts@master] Metrics Platform Instrument Configuration: Deploying to production

https://gerrit.wikimedia.org/r/1080817

Change #1080816 merged by jenkins-bot:

[operations/deployment-charts@master] Metrics Platform Instrument Configuration: Deploying to staging

https://gerrit.wikimedia.org/r/1080816