Page MenuHomePhabricator

Update Metrics Platform docs on required contextual attributes
Closed, ResolvedPublic

Description

Currently the Metrics Platform docs state that all contextual attributes are optional. However, as part of investigating T374116, we discovered that instrument creators are required to specify agent_client_platform_family in the stream config for the JS and PHP clients.

To do: Update https://wikitech.wikimedia.org/wiki/Metrics_Platform/How_to/Create_First_Metrics_Platform_Instrument and https://wikitech.wikimedia.org/wiki/Metrics_Platform/How_to/Creating_a_Stream_Configuration to reflect that agent_client_platform_family is a required contextual attribute

Event Timeline

@apaskulin I think your updates are great but, after reading your last comment, I have realized that there are some exceptions regarding that mandatory contextual attribute (and some others). The short story here is that not all the client library implementations (PHP, Java and Android) work same way:

  • Android client library is already adding this contextual attribute (agent_client_platform_family) automatically (apart from others, I think. Maybe we should review this). I think that would explain why android streams are not providing this attribute in the stream configuration
  • PHP client library, according to the code (https://gitlab.wikimedia.org/repos/data-engineering/metrics-platform/-/blob/main/php/src/ContextController.php?ref_type=heads#L30) it seems that it also includes this contextual attribute automatically with another one (agent_client_platform)
  • JS client library is the only library that is not adding any contextual attribute automatically

Some streams are quite old and are using specific schemas (not the common ones we suggest with MP) but some of them were migrated and they should use the common ones. But, even son, some of them are not including the mandatory contextual attributes but I don't know if someone is owning or working on that.

All that could explain that some streams don't include agent_client_platform_family. But I have found some others that seem to be related to the JS client library and they don't provide the mandatory contextual attribute. Won't those streams/instruments be used? Let me explore this a bit more because maybe we should explain something else about contextual attributes (which ones are mandatory and regarding which client library)

@apaskulin

I found some new useful information regarding what I said above, wrongly, about the Java client library. That client is not adding only agent_client_platform_family but all agent contextual attributes automatically according to the REQUIRED_PROPERTIES list and how its content is added automatically. So, not only the one I mentioned above is added automatically for this client.

I think we should leave the clients to set fields on the agent property. That is, I think we should update the Metrics Platform JavaScript client library to at least match the PHP client library. I think that the documentation should only mention that the clients are responsible for setting it, perhaps with links to the source for each client.

Thanks for these comments, @Sfaci and @phuedx. It seems like there's some variation in which contextual attributes are added automatically by each of the clients. I think my question is: If I'm using a client that adds some contextual attribute automatically, but I don't list that attribute in my stream config, will things break? Will that attribute still appear in my event data? If it's possible to leave the clients to set fields on the agent property (as phuedx said) and not worry about adding them to the config, that would help simplify the docs.

I think my question is: If I'm using a client that adds some contextual attribute automatically, but I don't list that attribute in my stream config, will things break?

No 🙂

I would say the only concern here should be that agent_client_platform_family is mandatory (event validation will fail if it's not included) and JS client library is not adding it automatically for now. But if we want to change that behaviour soon I guess we could already prepare the documentation considering it as done.

This is good to know. I propose updating the docs to create a space to describe which contextual attributes are added by default by each client. Once the JS client library is updated to add agent_client_platform_family automatically, we can update the docs to remove agent_client_platform_family as being required in the config step.

I'm being bold and moving this back into In Process per the above.

@phuedx @apaskulin I'm sorry! It seems I was wrong again. I decided double check this and I have found https://gitlab.wikimedia.org/repos/data-engineering/metrics-platform/-/blob/main/js/src/ContextController.js?ref_type=heads#L33 where I can see that JS client library does add some agent contextual attributes automatically. In fact, it adds same contextual one as the PHP one: agent_client_platform and agent_client_platform_family. So, the following is what actually happens:

  • Java client library adds all agent contextual attributes
  • PHP and JS client libraries add agent_client_platform and agent_client_platform_family

This was in front of me but I didn't realize. Every time you run submitInteraction those contextual attributes are added to the event regardless the stream configuration you provide.

@apaskulin I also remember we were confused because some stream configurations didn't declare agent_client_platform_family as a contextual attribute. My last comment should explain that. I guess sometimes was added and others not just because but the reality is that we don't need to add it

Thanks for these clarifications, @Sfaci! I've updated the guides that I previously edited, and I created a new page (https://wikitech.wikimedia.org/wiki/Metrics_Platform/Contextual_attributes) that lists the contextual attributes supported by each client, including which are added automatically and which can be enabled in the stream config, along with references to the code. Let me know what you think! Edits welcome

I've reviewed the page and subpages (and taken notes on how to you've organised this kind of information!) and they LGTM. I tweaked wikitech:Metrics_Platform/Contextual_attributes/PHP slightly.

Being bold

@phuedx
I was wondering whether it makes sense to include `agent_app_install_id as an optional contextual attribute for the JS client library. The attribute is defined as valid but, in reality, we are not filling its value right now (at least through the integration we have for EventLogging). Should it be there as an optional contextual attribute?

@phuedx
I was wondering whether it makes sense to include `agent_app_install_id as an optional contextual attribute for the JS client library. The attribute is defined as valid but, in reality, we are not filling its value right now (at least through the integration we have for EventLogging). Should it be there as an optional contextual attribute?

Currently, there's no sensible value for the agent.app_install_id field in the MediaWiki context. I don't think it should be included by default. In should probably be removed from the allowlist but there's no harm in it being there.

Being bold and moving to Done. Feel free to revert if signoff is needed. Any changes to the clients should be handled as separate tasks