
Search Metrics - Successful searches
Open, High, Public

Description

See parent task for details.

We’d like to know how many users find the information they need from our search. The definition of success is probably different between autocomplete / go box and full text search.

This should only use high confidence signals of a satisfied search.

Details

Title | Reference | Author | Source Branch | Dest Branch
Cirrus metrics calculations | repos/search-platform/notebooks!4 | ebernhardson | search-metrics | main

Event Timeline

Historically, a satisfied search was defined by dwell time. The plan would be to reuse that metric if the source data points still hold.

@EBernhardson I updated the AC to indicate that this should only be specified where there is high confidence signaling.

(Updated previous comment. Do this in conjunction with the other tickets, not necessarily afterward.)

Started looking over this the other day. Some data we have available:

  • We can do survival analysis on fulltext clickthroughs. Historically we used 10s of dwell time as a "success" metric. A single day of search satisfaction data shows that ~81% of fulltext clickthroughs are satisfied, and on a per-session basis ~87% of sessions with a fulltext search clickthrough had at least one clickthrough with a dwell >= 10s. This could probably be improved by using the number of sessions doing fulltext searches as the denominator, instead of counting only the page views that clicked through, satisfied or not (essentially we are excluding abandoned searches here, which we know are a decent %).
  • Autocomplete should have the same dwell time information, although I haven't had a chance to look at it yet. Coming up next.
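
As a rough illustration of the two ratios quoted for fulltext clickthroughs, here is a minimal sketch. It is not the code from the notebook: the `(session_id, dwell_seconds)` input shape, the function name and the sample data are all hypothetical, and only the 10s threshold comes from the comment above.

```python
from collections import defaultdict

DWELL_THRESHOLD_S = 10  # historical "success" threshold mentioned above


def clickthrough_satisfaction(clicks):
    """Compute per-clickthrough and per-session satisfaction rates.

    clicks: iterable of (session_id, dwell_seconds), one per fulltext
    clickthrough (hypothetical input shape, not the real schema).
    Returns (share of clickthroughs with dwell >= threshold,
             share of sessions with at least one such clickthrough).
    """
    session_satisfied = defaultdict(bool)
    total = satisfied = 0
    for session_id, dwell in clicks:
        ok = dwell >= DWELL_THRESHOLD_S
        total += 1
        satisfied += ok
        session_satisfied[session_id] = session_satisfied[session_id] or ok
    if total == 0:
        return None, None
    sessions_ok = sum(session_satisfied.values())
    return satisfied / total, sessions_ok / len(session_satisfied)


# Made-up sample data, only to show the shape of the input.
sample = [("a", 3), ("a", 42), ("b", 12), ("c", 1)]
click_rate, session_rate = clickthrough_satisfaction(sample)
```

Note that both denominators only cover sessions that clicked through at least once, which is the limitation described above: sessions that searched and clicked nothing never appear in `clicks`.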

Both of these feel slightly awkward as measures of search satisfaction, but we don't have anything better currently. I tried to look into how our old user satisfaction KPI worked, but I haven't found the exact code that implemented it yet. It was trying to blend a few different factors, so I think I would avoid trying to reuse that definition anyway and stick with something simple.

I've worked through most of this and it is now calculating the last two months of metrics. For now, they can be found in ebernhardson.T358350 in Hive.

First some limitations:

  • We only have this information for desktop search, tracked by the SearchSatisfaction schema.

A few definitions I ended up using:

  • An autocomplete session is constrained to a single page load.
  • An autocomplete session is satisfied when the user selects and submits an autocomplete-provided completion.
    • The user manually typing an autocomplete-provided result doesn't count.
    • A more generous definition would be based on whether the user got a 'go' result, but we don't track that in this schema. We do know from overall request metrics that Special:Search issues ~75% redirects to users.
  • An autocomplete session is dissatisfied when the user does not submit any search query after seeing autocomplete results. This is perhaps too generous, but declaring all users sent to fulltext search as dissatisfied with autocomplete also seemed incorrect.
  • A fulltext search session uses the schema's definition. This resets the session after 10 minutes of not performing any search requests.
  • A satisfied fulltext clickthrough is defined as visiting a page and having a dwell time of at least 10 seconds. This is the threshold we used in the past, and it was re-confirmed by Bruno last summer: "The distribution of dwell times, the time spent in the answer page, has two modes (Figure 1), with the first one in less than a second and the second in around 7 seconds." https://wikiworkshop.org/2023/papers/WikiWorkshop2023_paper_37.pdf
    • Perhaps more granular dwells would be useful. I am currently using a very simplistic definition of dwell time via the checkin events, which doesn't provide more granularity than >= 10s.
  • A satisfied fulltext session is defined as a session with at least one satisfied clickthrough.
  • A dissatisfied fulltext session is defined as a session with no satisfied clickthroughs.
  • A fulltext session is abandoned if the user saw a result page and clicked no links.
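
The definitions above can be sketched as a pair of small classifiers. This is an illustration under stated assumptions rather than the notebook's implementation: the function names, the `Outcome` enum and the input shapes are hypothetical, and the "neither" label for autocomplete sessions that submit a manually typed query is my reading of the definitions, not something stated outright.

```python
from enum import Enum

SATISFIED_DWELL_S = 10  # dwell threshold from the definitions above


class Outcome(Enum):
    SATISFIED = "satisfied"
    DISSATISFIED = "dissatisfied"
    NEITHER = "neither"  # assumption: submitted, but not via a suggestion


def classify_fulltext_session(click_dwells):
    """Classify one fulltext session.

    click_dwells: dwell times in seconds for every clickthrough in the
    session (empty if a result page was seen and nothing was clicked).
    Returns (outcome, abandoned).
    """
    abandoned = len(click_dwells) == 0
    satisfied = any(d >= SATISFIED_DWELL_S for d in click_dwells)
    outcome = Outcome.SATISFIED if satisfied else Outcome.DISSATISFIED
    return outcome, abandoned


def classify_autocomplete_session(submitted, selected_suggestion):
    """Classify one autocomplete session (a single page load).

    submitted: the user submitted some search query.
    selected_suggestion: the submitted query was an autocomplete-provided
    completion that the user selected (typing it manually doesn't count).
    """
    if not submitted:
        return Outcome.DISSATISFIED
    return Outcome.SATISFIED if selected_suggestion else Outcome.NEITHER
```

Under this reading, an abandoned fulltext session has no satisfied clickthroughs and therefore also counts as dissatisfied, so the abandon rate can never exceed the dissatisfied rate.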

The top-level metrics aren't particularly rosy, but align with what we've seen in the past.

fulltext session sat | 37.6%
fulltext session dsat | 62.4%
autocomplete session sat | 57.1%
autocomplete session dsat | 3.8%

And a few related metrics:

fulltext session abandon | 55.6%
autocomplete session submit | 96.2%

These can be further broken down. The available dimensions are:

  • site (project/language/domain)
  • os_family
  • country
  • browser_family
  • user_edit_bucket
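
A breakdown along one of these dimensions amounts to a grouped ratio. The sketch below shows the idea in plain Python; the row shape (one dict per session with a boolean `satisfied` key) and the sample values are hypothetical and do not reflect the layout of the actual Hive table.

```python
from collections import defaultdict


def satisfaction_by(sessions, dimension):
    """Return {dimension value: share of satisfied sessions}.

    sessions: iterable of dicts holding the dimension key and a boolean
    'satisfied' flag (hypothetical row shape).
    """
    counts = defaultdict(lambda: [0, 0])  # value -> [satisfied, total]
    for row in sessions:
        bucket = counts[row[dimension]]
        bucket[0] += bool(row["satisfied"])
        bucket[1] += 1
    return {value: sat / total for value, (sat, total) in counts.items()}


# Made-up rows, only to show the call.
rows = [
    {"os_family": "Windows", "satisfied": True},
    {"os_family": "Windows", "satisfied": False},
    {"os_family": "Linux", "satisfied": True},
]
by_os = satisfaction_by(rows, "os_family")
```

Small buckets will give noisy ratios, so a real breakdown would probably also want to report the session count per bucket.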

Four tickets were combined into a single ticket with two calculations, which can be found in the patch above:

  • T358349 - number of searches
  • T358350 - successful searches
  • T358351 - read traffic generated by search
  • T358352 - number of user sessions using search