Search Metrics - Successful searches
Closed, Resolved (Public)

Assigned To

Authored By

	Gehel
	Feb 23 2024, 3:27 PM

Description

See parent task for details.

We’d like to know how many users find the information they need from our search. The definition of success is probably different between autocomplete / go box and full text search.

This should only use high confidence signals of a satisfied search.

Details

	Title	Reference	Author	Source Branch	Dest Branch
	Cirrus metrics calculations	repos/search-platform/notebooks!4	ebernhardson	search-metrics	main


Related Objects

		Status	Subtype	Assigned	Task
		Open		None	T358345 [Epic] Search metrics 2024
		Resolved		EBernhardson	T358350 Search Metrics - Successful searches

Event Timeline

Gehel created this task. Feb 23 2024, 3:27 PM

dr0ptp4kt triaged this task as High priority. Apr 8 2024, 3:43 PM

dr0ptp4kt edited projects, added Discovery-Search (Current work); removed Discovery-Search. Apr 8 2024, 3:46 PM

Historically, a satisfied search was defined by dwell time. The plan is to reuse that metric if the source data points still hold.

dr0ptp4kt updated the task description. Apr 12 2024, 8:10 PM

@EBernhardson I updated the AC to indicate that this should only be specified where there is high confidence signaling.

(Updated previous comment. Do this in conjunction with the other tickets, not necessarily afterward.)

dr0ptp4kt mentioned this in T358345: [Epic] Search metrics 2024. Apr 12 2024, 8:32 PM

Gehel moved this task from Incoming to Ready for Dev -- SWE on the Discovery-Search (Current work) board. Apr 15 2024, 3:38 PM

Started looking over this the other day. Some data we have available:

We can do survival analysis on fulltext clickthroughs. Historically we used 10s of dwell time as a "success" metric. In a single day of search satisfaction data, ~81% of fulltext clickthroughs are satisfied, and on a per-session basis ~87% of sessions with a fulltext search clickthrough had at least one clickthrough with a dwell >= 10s. This could probably be improved by using the number of sessions doing fulltext searches as the denominator, instead of counting only sessions that clicked through (essentially we are excluding abandoned searches here, which we know are a decent %).
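As a rough illustration of the two rates described above, here is a minimal sketch. It is not the notebook code; the input shape (pairs of a session id and a dwell time in seconds) and the function name are assumptions made for the example.

```python
from collections import defaultdict

DWELL_THRESHOLD_S = 10  # the historical "success" threshold mentioned above


def satisfaction_rates(clickthroughs):
    """Return (per-clickthrough rate, per-session rate).

    `clickthroughs` is a hypothetical list of (session_id, dwell_seconds)
    pairs, one per fulltext clickthrough. Sessions without any clickthrough
    are not represented, so abandoned searches are excluded here.
    """
    total = 0
    satisfied = 0
    session_satisfied = defaultdict(bool)
    for session_id, dwell in clickthroughs:
        ok = dwell >= DWELL_THRESHOLD_S
        total += 1
        satisfied += ok
        # A session counts as satisfied if any of its clickthroughs is.
        session_satisfied[session_id] = session_satisfied[session_id] or ok
    if total == 0:
        raise ValueError("no clickthroughs to evaluate")
    per_click = satisfied / total
    per_session = sum(session_satisfied.values()) / len(session_satisfied)
    return per_click, per_session


sample = [("a", 12), ("a", 3), ("b", 2), ("c", 30)]
per_click, per_session = satisfaction_rates(sample)
```

Swapping the per-session denominator for the count of all sessions that ran a fulltext search would bring abandoned searches into the measure.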

Autocomplete should have the same dwell time information, although I haven't had a chance to look at it yet. Coming up next.

Both of these feel slightly awkward as measures of search satisfaction, but we don't have anything better currently. I tried to look into how our old user satisfaction KPI worked, but I haven't found the exact code that implemented it yet. It was trying to blend a few different factors, so I think I would avoid reusing that definition anyway and stick with something simple.

I've worked through most of this and have it calculating the last two months of metrics now. They will be found, for now, in ebernhardson.T358350 in Hive.

First some limitations:

We only have this information for desktop search, tracked by the SearchSatisfaction schema.

A few definitions I ended up using:

An autocomplete session is constrained to a single page load.
An autocomplete session is satisfied when the user selects and submits an autocomplete-provided completion.
- The user manually typing an autocomplete-provided result doesn't count.
- A more generous definition would be based on whether the user got a 'go' result, but we don't track that in this schema. We do know from overall request metrics that Special:Search issues redirects for ~75% of requests.
An autocomplete session is dissatisfied when the user does not submit any search query after seeing autocomplete results. This is perhaps too generous, but declaring all users sent to fulltext search as dissatisfied with autocomplete also seemed incorrect.
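
The autocomplete definitions above could be sketched as follows. The field names are hypothetical and are not taken from the SearchSatisfaction schema, and the "other" label for sessions that match neither definition is an assumption of this example.

```python
from dataclasses import dataclass


@dataclass
class AutocompleteSession:
    """One page load's worth of autocomplete activity (hypothetical fields)."""
    saw_results: bool          # autocomplete suggestions were shown
    submitted_query: bool      # the user submitted some search query
    selected_completion: bool  # the submitted query was a selected suggestion


def classify_autocomplete(session):
    # Satisfied: the user selected and submitted a provided completion.
    if session.submitted_query and session.selected_completion:
        return "satisfied"
    # Dissatisfied: suggestions were shown but nothing was submitted.
    if session.saw_results and not session.submitted_query:
        return "dissatisfied"
    # Anything else, e.g. a manually typed query sent on to fulltext search,
    # is left unlabeled rather than counted as dissatisfied.
    return "other"
```

For example, `classify_autocomplete(AutocompleteSession(True, True, False))` yields "other", which matches the choice not to count users sent to fulltext search as dissatisfied.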

A fulltext search session uses the schema's definition. This resets the session after 10 minutes of not performing any search requests.
A satisfied fulltext clickthrough is defined as visiting a page and having a dwell time of at least 10 seconds. This is the threshold we used in the past, and was re-confirmed by Bruno last summer: "The distribution of dwell times, the time spent in the answer page, has two modes (Figure 1), with the first one in less than a second and the second in around 7 seconds." https://wikiworkshop.org/2023/papers/WikiWorkshop2023_paper_37.pdf
- Perhaps more granular dwells would be useful. I am currently using the very simplistic definition of dwell time via the checkin events, which doesn't provide more granularity than >= 10.
A satisfied fulltext session is defined as a session with at least one satisfied clickthrough.
A dissatisfied fulltext session is defined as a session with no satisfied clickthroughs.
A fulltext session is abandoned if the user saw a result page and clicked no links.
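
A minimal sketch of the fulltext definitions above, under stated assumptions: timestamps are plain seconds, a gap of more than 10 minutes between search requests starts a new session, and an abandoned session is reported separately from a dissatisfied one even though it also has no satisfied clickthroughs. The function names are invented for this example.

```python
SESSION_GAP_S = 10 * 60   # session resets after 10 minutes without a search
DWELL_THRESHOLD_S = 10    # minimum dwell for a satisfied clickthrough


def split_sessions(search_times):
    """Group search request timestamps (seconds) into sessions."""
    sessions = []
    for t in sorted(search_times):
        if sessions and t - sessions[-1][-1] <= SESSION_GAP_S:
            sessions[-1].append(t)   # still within the inactivity window
        else:
            sessions.append([t])     # gap too long: start a new session
    return sessions


def classify_fulltext_session(saw_results, click_dwells):
    """Label a session given the dwell times (seconds) of its clickthroughs."""
    if any(d >= DWELL_THRESHOLD_S for d in click_dwells):
        return "satisfied"
    if saw_results and not click_dwells:
        return "abandoned"           # result page seen, no links clicked
    return "dissatisfied"            # clicked, but no dwell reached 10s


sessions = split_sessions([0, 100, 800, 1500])
```

Whether abandoned sessions should also count toward the dissatisfied total is a reporting choice that the definitions above leave open.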

The top-level metrics aren't particularly rosy, but align with what we've seen in the past.