
[M] Instrument concept chips in MediaSearch
Open, Needs Triage, Public

Description

T258183 will instrument the MediaSearch results page for measurement, but it will not cover the concept chip functionality from T256431 because that is coming later. Once T256431 is done, this ticket is to instrument the following using the same method as in T258183:

  • measure when a user clicks on a concept chip

Event Timeline

CBogen renamed this task from Instrument concept chips in MediaSearch to [M] Instrument concept chips in MediaSearch. Sep 30 2020, 4:21 PM

I assume we probably want to instrument the autocomplete suggestions as well at some point.

Since clicking a concept chip and clicking an autocomplete suggestion both ultimately lead to a new search being performed, I think we could instrument both activities the same way: we could add a property to search_new events that indicates how the search originated (source: autocomplete or source: concept_chip, for example). This would make it easy to compare all new searches that users initiate and break them down into direct entry vs. autocomplete suggestions vs. concept chip suggestions.
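To make the idea concrete, here is a minimal sketch of tagging search_new events with a source and then breaking searches down by origin. The source values, the search_query field name, and both helper functions are assumptions for illustration only, not part of any existing schema or client code.

```python
from collections import Counter

# Hypothetical values for the proposed "source" property.
SEARCH_SOURCES = ("direct_entry", "autocomplete", "concept_chip")


def make_search_new_event(query, source):
    """Build a search_new event payload tagged with how the search originated."""
    if source not in SEARCH_SOURCES:
        raise ValueError(f"unknown search source: {source!r}")
    # "search_query" is an assumed field name, used only for this sketch.
    return {"action": "search_new", "search_query": query, "source": source}


def count_searches_by_source(events):
    """Count search_new events per source, ignoring any other actions."""
    return Counter(
        e["source"] for e in events if e.get("action") == "search_new"
    )


events = [
    make_search_new_event("cat", "direct_entry"),
    make_search_new_event("catalonia", "autocomplete"),
    make_search_new_event("kitten", "concept_chip"),
    make_search_new_event("dog", "direct_entry"),
]
breakdown = count_searches_by_source(events)
```

With a single property on one event type, the comparison between direct entry, autocomplete, and concept chips becomes a simple group-by rather than a join across several action types.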

Now that we have a schema for mediasearch_interaction, we'll need to update it to account for concept-chip-related actions. This means we'll need to update the schema. My question is, how do we version such an update? 1.1? 2.0? @nettrom_WMF any thoughts?

Right now the only action you can do with concept chips is to click on them. So we'll need to add a new type of action called concept_chip_click or similar.

In addition to logging that a concept chip has been clicked, I assume we'll want to log the term that it corresponds to (the "concept", which is just a string term name). Do we want to log other data here?

For example, when capturing search actions we could log whether or not concept chips are enabled; when they are, we could log how many get shown. I assume it may be useful to know things like "users tend to click on these more when there are at least 4 of them", etc. Knowing what percentage of enabled searches get chips at all likewise seems useful.

Are there any other bits of information related to concept chips that we should seek to capture?
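As a rough sketch of what these additions could look like, assuming field names that are not yet in the schema (concept, concept_chips_enabled, concept_chip_count, and search_query are all hypothetical), the click action and the extra search-level fields might be shaped like this:

```python
def make_concept_chip_click_event(concept):
    """Payload for the proposed concept_chip_click action.

    "concept" is the plain string term shown on the chip (assumed field name).
    """
    return {"action": "concept_chip_click", "concept": concept}


def make_search_event(query, chips_enabled, chips_shown=0):
    """Payload for a search that also records concept chip availability.

    The chip count is only meaningful when chips are enabled, so it is
    stored as None otherwise.
    """
    return {
        "action": "search_new",
        "search_query": query,  # assumed field name
        "concept_chips_enabled": chips_enabled,
        "concept_chip_count": chips_shown if chips_enabled else None,
    }


def share_of_enabled_searches_with_chips(events):
    """Fraction of chip-enabled searches that showed at least one chip."""
    enabled = [e for e in events if e.get("concept_chips_enabled")]
    if not enabled:
        return None
    with_chips = sum(1 for e in enabled if e["concept_chip_count"] > 0)
    return with_chips / len(enabled)


searches = [
    make_search_event("cat", chips_enabled=True, chips_shown=4),
    make_search_event("asdfgh", chips_enabled=True, chips_shown=0),
    make_search_event("dog", chips_enabled=True, chips_shown=2),
    make_search_event("bird", chips_enabled=False),
]
share = share_of_enabled_searches_with_chips(searches)
click = make_concept_chip_click_event("kitten")
```

Keeping the enabled flag separate from the count lets "chips were off" be distinguished from "chips were on but none were shown".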

Now that we have a schema for mediasearch_interaction, we'll need to update it to account for concept-chip-related actions. This means we'll need to update the schema. My question is, how do we version such an update? 1.1? 2.0? @nettrom_WMF any thoughts?

I think bumping it to 1.1.0 makes sense as we'll be adding a few things rather than substantially rewriting the schema. I see that SearchSatisfaction is now on version 1.3.0 after they added another field, for example.

Right now the only action you can do with concept chips is to click on them. So we'll need to add a new type of action called concept_chip_click or similar.

concept_chip_click sounds like a good name for that action!

In addition to logging that a concept chip has been clicked, I assume we'll want to log the term that it corresponds to (the "concept", which is just a string term name). Do we want to log other data here?

For example, when capturing search actions we could log whether or not concept chips are enabled; when they are, we could log how many get shown. I assume it may be useful to know things like "users tend to click on these more when there are at least 4 of them", etc. Knowing what percentage of enabled searches get chips at all likewise seems useful.

Are there any other bits of information related to concept chips that we should seek to capture?

Capturing the number of concept chips shown with a search result makes sense, so we can understand how often these show up. For example, I can see it being an issue if only a few searches have them. Similarly, there might be a threshold for clicking on them, as you mention. I see we have search_result_count in the schema. Maybe search_concept_count would work, although I'm not completely happy with that name?

I agree that it'll be useful to know what the concept was. Are those freeform strings, or do they map to Wikidata items? One thing to keep in mind for us is that if there's a language agnostic way to store these, then we'd like to use that. In the Growth team, we've been storing help links by their symbolic name so we can easily compare them across languages.

Also, I think it'll be meaningful to capture the position of the concept in the list of concepts. I suspect there's going to be bias towards the front of the list, but maybe we'll find that users click a different position and want to investigate.
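For illustration, a minimal sketch of logging the clicked chip's position and then checking for position bias. The position field, its zero-based indexing, and the helper names are assumptions made for this example, not decided schema details.

```python
from collections import Counter


def make_concept_chip_click_event(concept, position):
    """Click payload that also records where the chip sat in the list.

    "position" is assumed to be zero-based (0 = first chip shown).
    """
    if position < 0:
        raise ValueError("position must be non-negative")
    return {
        "action": "concept_chip_click",
        "concept": concept,
        "position": position,
    }


def clicks_by_position(events):
    """Count concept chip clicks per list position, ignoring other actions."""
    return Counter(
        e["position"]
        for e in events
        if e.get("action") == "concept_chip_click"
    )


clicks = [
    make_concept_chip_click_event("kitten", 0),
    make_concept_chip_click_event("feline", 1),
    make_concept_chip_click_event("pet", 0),
    make_concept_chip_click_event("tabby", 3),
]
position_counts = clicks_by_position(clicks)
most_clicked_position, most_clicked_count = position_counts.most_common(1)[0]
```

Comparing these counts against how often each position was actually displayed (using the chip count logged with the search) would show whether a skew towards the front is real bias or just reflects short lists.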

Lastly, I like the idea of capturing the source of a search, which I know SearchSatisfaction does too. Do we have three possible sources? 1) the user types in something and clicks "Search", 2) the user starts typing, autocomplete suggests something, and they click it, and 3) the user clicks on a concept chip?

I noticed that @CBogen asked about the things we want to measure regarding concept chips in the measurement specification, so I'm bringing that back up here so she can chime in too. Maybe @Ramsey-WMF has thoughts as well?

These are all good questions and we should pick this discussion back up later, but for now I'm moving this back to the backlog, since concept chips are on hold until we have a more performant way of calling Wikidata, and while we focus on image recommendations.