As part of the work on T258183, we need to specify what exactly we will be measuring.
Developing a new schema seems like the best way to move forward. I've followed the guidelines for the new Event Platform and created a basic media-search-specific schema, using the Media Search Measurement Specifications document as a starting point. For now, the goal is to capture high-level user actions rather than record every click, mouse movement, etc.
Required event properties
Our schema describes the event object that is recorded whenever we log something. Currently, every event includes the following properties:
- session_id: generated using mw.user.generateRandomSessionId when the page JS loads. Multiple searches performed in the same browser tab by the same user in a single sitting can be associated using this field. If the user does a hard-refresh of the page or opens a new tab later, they will get a new session ID.
- skin: the skin the user is viewing the page with ("vector", "minerva", etc).
- language_code: Commons only has one content language, but we would probably like to know if the user has specified an interface language. Every event will include the first item in the user's language fallback chain as a code ("de", "es", etc).
- Basic event metadata (datetime and timezone info, the schema to be used, etc) is always included.
- action: Like the SearchSatisfaction schema, our schema will assign an "action" property to every event representing what the user was actually doing on the page. The action must belong to a pre-defined list, and additional event properties may be included based on the action. The actions we currently support are listed below.
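The required properties above could be assembled roughly as follows. This is only a sketch under stated assumptions: `BaseEvent`, `Env`, `buildBaseEvent` and the `timestamp` field are hypothetical names used for illustration, not part of the draft schema, and the environment values are passed in explicitly so that the snippet runs outside MediaWiki.

```typescript
// Hypothetical shape of the properties shared by every event.
// Property names follow the list above; the types are assumptions.
interface BaseEvent {
  session_id: string;
  skin: string;
  language_code: string;
  action: string;
  timestamp: string; // placeholder for the "basic event metadata"
}

// Values that would come from the page environment in a real deployment.
interface Env {
  sessionId: string; // in the browser: mw.user.generateRandomSessionId(), called once when the page JS loads
  skin: string; // e.g. "vector" or "minerva"
  languageFallbackChain: string[]; // e.g. ["de", "en"]
}

function buildBaseEvent(env: Env, action: string, now: Date = new Date()): BaseEvent {
  if (env.languageFallbackChain.length === 0) {
    throw new Error("language fallback chain is empty");
  }
  return {
    session_id: env.sessionId,
    skin: env.skin,
    // Only the first item of the fallback chain is recorded.
    language_code: env.languageFallbackChain[0],
    action,
    timestamp: now.toISOString(),
  };
}
```

Because the session ID is generated once per page load and reused for every event, a hard refresh or a new tab naturally produces a new `session_id`.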
Current draft schema action types:
Action | Description | Additional properties |
---|---|---|
search_new | User performs a new search for a given query | query, total result count, media type |
search_load_more | User "continues" an existing search, either within the same tab (by scrolling down the page) or on a new tab | query, total result count, media type |
search_clear | User clicks the "X" button and clears the query along with all results in all tabs | |
tab_change | User changes tabs (which correspond to file types) without changing the search query or clearing existing results | query, total result count, media type |
result_click | User clicks on a result within a given tab. This may or may not trigger a quickview, depending on the media type. | result pageid, media type, position (one-dimensional index value), whether or not quickview will be shown |
quickview_hide | User dismisses quickview using the "X" button or keyboard | |
quickview_more_details_click | User clicks the "more details" button, actually navigating to the result page | pageid of result |
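To illustrate how the action list and its per-action properties fit together, here is a minimal sketch of a lookup table plus a check for missing properties. The property names (`query`, `total_result_count`, `media_type`, `result_page_id`, `position`, `quickview_shown`) are assumed snake_case renderings of the table's descriptions, and `REQUIRED_PROPS` and `missingProps` are hypothetical helpers, not part of the draft schema.

```typescript
// Properties that accompany a search in progress (assumed names).
const SEARCH_PROPS = ["query", "total_result_count", "media_type"] as const;

// Maps each supported action to the additional properties it requires.
const REQUIRED_PROPS = new Map<string, readonly string[]>([
  ["search_new", SEARCH_PROPS],
  ["search_load_more", SEARCH_PROPS],
  ["search_clear", []],
  ["tab_change", SEARCH_PROPS],
  ["result_click", ["result_page_id", "media_type", "position", "quickview_shown"]],
  ["quickview_hide", []],
  ["quickview_more_details_click", ["result_page_id"]],
]);

// Returns the names of required properties that are absent from the event.
// Throws if the action is not in the pre-defined list.
function missingProps(event: Record<string, unknown>): string[] {
  const action = event.action;
  const required = typeof action === "string" ? REQUIRED_PROPS.get(action) : undefined;
  if (required === undefined) {
    throw new Error(`Unknown action: ${String(action)}`);
  }
  return required.filter((prop) => event[prop] === undefined);
}
```

For example, `missingProps({ action: "search_clear" })` returns an empty array, while an event whose action is not in the table is rejected with an error.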
Still to do
- Actions representing user interactions with the Quickview audio/video player
- Action representing use of the upcoming "copy text" button in the Quickview
- Actions representing user interactions with the "concept chips" elements
- Action representing user interactions with the search filter settings (changing a filter starts a new search under the hood; do we want to log it as such?)
- A property representing whether the user entered their search term by typing or by selecting an auto-complete suggestion
- A "check-in" action that is automatically logged every X amount of time (how long?)
Some of these will need to wait until the relevant feature is ready for instrumentation. What should be considered the minimum acceptance criteria for this particular task? We can add things to the schema incrementally in the future by updating it to a new version in a backwards-compatible way.