Having experimented with several approaches, we've come to the view that a slightly tweaked convention for analytics schemas will help us grow and evolve our capabilities more smoothly as we transition to the new platform. These changes build on prior work that defined individual fragments and formulated a common schema, and tie that work together in a way that should make our abstractions clearer and our conventions cleaner.
Until now, our general convention has been that:
- Every analytics schema must include a schema fragment called 'analytics/common'.
- An analytics schema may include a number of individual schema fragments for particular sets of fields.
Our updated convention will be that every analytics stream must either:
- Use a schema called analytics/base, or
- Use a schema that extends analytics/base with additional top-level fields.
This new analytics base schema does the following:
- Includes a schema fragment called fragment/analytics/dimensions which contains all the standard 'dimension' fields that can be collected by the client library.
- Contains a field for the event topic
- Contains a field for the event action
- Contains a multipurpose string:string map field for additional data
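To make the shape concrete, here is a minimal sketch of what an event conforming to the base schema might look like, along with a loose shape check. The field names `topic`, `action`, and `data` are assumptions for illustration only; the actual names are whatever the analytics/base schema defines, and the dimension fields from fragment/analytics/dimensions are left out because they are not listed here.

```javascript
// Hypothetical field names: the text only says the base schema has a topic
// field, an action field, and a string:string map for additional data.
const exampleEvent = {
  topic: 'my_action_topic',
  action: 'my_action',
  data: { message: 'Hello!' }
};

// True if value is a plain object whose values are all strings.
function isStringMap(value) {
  return value !== null &&
    typeof value === 'object' &&
    !Array.isArray(value) &&
    Object.values(value).every((v) => typeof v === 'string');
}

// Loose shape check, not a substitute for real schema validation.
// The action is treated as optional here, which is an assumption.
function looksLikeBaseEvent(event) {
  return typeof event.topic === 'string' &&
    (event.action === undefined || typeof event.action === 'string') &&
    isStringMap(event.data);
}
```

Keeping the additional data as a string:string map means a non-string value such as `{ count: 1 }` would fail this check and would need to be serialized as a string first.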
The concept of an event topic is designed to replace the idea of instrumentation producing events to a particular stream. Our system already allows multiple streams to subscribe to the same events. We are making this explicit: events are not sent to streams; rather, streams subscribe to topics and receive the events for those topics.
The concept of an event action is the same as its historical usage. It is convenient to group related events together, especially when they are different steps in the same funnel or workflow, or components of the same product feature. This is what the action field is for. We considered eliminating it entirely and having all events be their own topic, but were convinced that it was better to leave this pattern in place.
Streams will be able to:
- Subscribe to topics (or certain actions in a topic), in order to pick out events of interest and receive them into a database table named after the stream.
- Specify which of the dimensions from the dimensions fragment will be filled out by the client library.
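The subscription model described above could be sketched as follows. This is an illustration under stated assumptions, not the actual stream configuration format: the `subscriptions`, `actions`, and `dimensions` keys, the stream names, and the dimension names are all hypothetical.

```javascript
// Hypothetical stream configuration. Each stream lists the topics it
// subscribes to (optionally narrowed to certain actions) and the dimensions
// the client library should fill in.
const exampleStreams = {
  stream_a: {
    subscriptions: [ { topic: 'my_simple_topic' } ],
    dimensions: [ 'dimension_one' ]
  },
  stream_b: {
    subscriptions: [
      { topic: 'my_simple_topic' },
      { topic: 'my_action_topic', actions: [ 'my_action' ] }
    ],
    dimensions: [ 'dimension_one', 'dimension_two' ]
  }
};

// Returns the names of all streams that should receive an event with the
// given topic and (optional) action. A subscription without an `actions`
// list matches every action in its topic.
function streamsForEvent(configs, topic, action) {
  return Object.keys(configs).filter((name) =>
    configs[name].subscriptions.some((sub) =>
      sub.topic === topic &&
      (sub.actions === undefined || sub.actions.includes(action))
    )
  );
}
```

With this configuration, an event on `my_simple_topic` is received by both streams, while an event on `my_action_topic` reaches `stream_b` only when its action is `my_action`.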
The calling interface will not change, but what we call the "stream name" today will become the "topic":
mw.eventLog.submit( 'my_simple_topic', { message: 'Hello!' } );
We will also support an optional second argument that names the "action", for events that use one:
mw.eventLog.submit( 'my_action_topic', 'my_action', { message: 'Hello!' } );
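One way a client could tell the two call forms apart is by checking the type of the second argument. The sketch below is an assumption about how that could work, not the actual `mw.eventLog.submit` implementation; `normalizeSubmitArgs` is a hypothetical helper.

```javascript
// Hypothetical helper: normalizes submit( topic, data ) and
// submit( topic, action, data ) into one { topic, action, data } object.
function normalizeSubmitArgs( topic, actionOrData, maybeData ) {
  if ( typeof actionOrData === 'string' ) {
    // Three-argument form: the second argument is the action name.
    return { topic: topic, action: actionOrData, data: maybeData || {} };
  }
  // Two-argument form: the second argument is the event data.
  return { topic: topic, action: undefined, data: actionOrData || {} };
}
```

Under this sketch, both examples above resolve to the same data, and only the second one carries an action.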