Literature review for query completion
Closed, Resolved · Public

Description

Goals:

  • Identify metrics we should use for offline and online evaluation of autocomplete. Previously when tuning autocomplete for Wikidata we used Mean Reciprocal Rank (MRR) and a metric we invented that estimates the number of characters that need to be typed for each user query (both sketched below).
  • Identify potential ranking methods. Previous work on Wikidata used Most Popular Completion (MPC) as a baseline to compare against, but there are likely better options. If we use MPC, we need to identify appropriate ways to balance recent popularity against long-term popularity (see the decayed-MPC sketch below). If we use something else it should be relatively simple. For example, personalization, even limited to single-session context, is likely more complex than we should initially consider.
  • Identify potential methods for deciding the set of queries that can be "released" as query completions. While we have not released query sets before, we expect the general concept to be around identifying repeated queries from multiple users (see the release-filter sketch below).
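
For the two offline metrics, a minimal sketch of how they might be computed. The `suggest(prefix)` function (returning a ranked list of completions) and the judgment format are assumptions, not an existing API:

```python
def mean_reciprocal_rank(judgments, suggest, k=10):
    """MRR over (prefix, target_query) pairs: 1/rank of the target in
    the top-k suggestions, contributing 0 when it does not appear."""
    total = 0.0
    for prefix, target in judgments:
        suggestions = suggest(prefix)[:k]
        if target in suggestions:
            total += 1.0 / (suggestions.index(target) + 1)
    return total / len(judgments)


def mean_chars_typed(queries, suggest, k=10):
    """The characters-typed metric: average number of characters a user
    must type before the full query first shows up in the top-k
    suggestions, falling back to the full query length otherwise."""
    total = 0
    for target in queries:
        typed = len(target)  # worst case: the user types everything
        for i in range(1, len(target)):
            if target in suggest(target[:i])[:k]:
                typed = i
                break
        total += typed
    return total / len(queries)
```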
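
For balancing recent against long-term popularity in MPC, an exponential time decay is one simple option. A sketch, where the half-life and the query-log format are assumptions:

```python
from collections import defaultdict


def decayed_mpc_scores(query_log, now, half_life_days=30.0):
    """Most Popular Completion with exponential decay: each occurrence
    of a query contributes 2**(-age / half_life), so a short half-life
    favors recent popularity while half_life -> infinity recovers raw
    occurrence counts."""
    scores = defaultdict(float)
    for query, timestamp_days in query_log:
        age = now - timestamp_days
        scores[query] += 2.0 ** (-age / half_life_days)
    return scores


def complete(prefix, scores, k=10):
    """Rank known queries extending the prefix by decayed popularity."""
    candidates = [q for q in scores if q.startswith(prefix)]
    return sorted(candidates, key=scores.get, reverse=True)[:k]
```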
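
And for the release question, the simplest reading of "repeated queries from multiple users" is a threshold filter; the specific thresholds below are placeholders:

```python
from collections import defaultdict


def releasable_queries(query_log, min_users=5, min_count=10):
    """Release a query as a completion candidate only if it was issued
    by at least min_users distinct users and at least min_count times
    overall, so rare (and potentially identifying) queries never leak
    out as suggestions."""
    users = defaultdict(set)
    counts = defaultdict(int)
    for query, user_id in query_log:
        users[query].add(user_id)
        counts[query] += 1
    return {q for q in counts
            if len(users[q]) >= min_users and counts[q] >= min_count}
```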

Event Timeline

A useful survey on query completion out of the University of Amsterdam in 2016: https://pure.uva.nl/ws/files/2716800/176802_fntir2016_qac.pdf

Time-Sensitive Query Auto-Completion: https://dl.acm.org/doi/pdf/10.1145/2348283.2348364

  • From 2012, proposes using time-series predictions instead of raw popularity. Seems well cited and accepted by later work
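
A rough sketch of the idea, not the paper's actual models (it evaluates several time-series methods); here a simplified level-plus-trend smoothing forecast over assumed per-query daily counts stands in:

```python
def forecast_next_count(daily_counts, alpha=0.5):
    """One-step-ahead forecast of a query's daily frequency using a
    simplified double-exponential smoothing (level + trend), rather
    than raw historical popularity."""
    level, trend = float(daily_counts[0]), 0.0
    for count in daily_counts[1:]:
        prev_level = level
        level = alpha * count + (1 - alpha) * (level + trend)
        trend = alpha * (level - prev_level) + (1 - alpha) * trend
    return max(level + trend, 0.0)


def rank_by_predicted_popularity(candidates, history):
    """Rank completion candidates by forecast rather than raw counts.
    `history` maps query -> list of daily counts (an assumed format)."""
    return sorted(candidates,
                  key=lambda q: forecast_next_count(history[q]),
                  reverse=True)
```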

Learning User Reformulation Behavior for Query Auto-Completion: https://dl.acm.org/doi/pdf/10.1145/2600428.2609614

  • Proposes exploiting query context, such as previous queries and click-through data from the same session.
  • Could be a possible way forward, but seems more complex than we might want for a first implementation (a crude approximation is sketched below).
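
A crude approximation of the idea (the paper learns reformulation features and uses click-through data; this just boosts candidates by term overlap with earlier queries in the session):

```python
def session_context_rerank(candidates, session_queries, base_scores,
                           weight=0.5):
    """Re-rank completion candidates by term overlap with previous
    queries in the same session, on top of a base popularity score."""
    context_terms = set()
    for q in session_queries:
        context_terms.update(q.split())

    def score(q):
        terms = set(q.split())
        overlap = len(terms & context_terms) / max(len(terms), 1)
        return base_scores.get(q, 0.0) * (1.0 + weight * overlap)

    return sorted(candidates, key=score, reverse=True)
```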

User Model-based Metrics for Offline Query Suggestion Evaluation: https://dl.acm.org/doi/pdf/10.1145/2484028.2484041

  • Proposes pSaved and eSaved metrics for evaluating autocomplete
  • Directly models user satisfaction, which other autocomplete metrics do not
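
A much simplified, deterministic reading of the two metrics (the paper defines them over a probabilistic user model of examination and clicking; here we just assume the user accepts the target as soon as it appears in the top k):

```python
def psaved_esaved(queries, suggest, k=10):
    """Deterministic simplification: pSaved ~ fraction of queries that
    appear in the top-k suggestions before being fully typed; eSaved ~
    average fraction of keypresses saved when that happens."""
    saved, effort_saved = 0, 0.0
    for target in queries:
        for i in range(1, len(target) + 1):
            if target in suggest(target[:i])[:k]:
                saved += 1
                effort_saved += 1.0 - i / len(target)
                break
    return saved / len(queries), effort_saved / len(queries)
```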

Online Spelling Correction for Query Completion: https://dl.acm.org/doi/pdf/10.1145/1963405.1963425

  • In addition to the primary focus of the work they propose a new autocomplete metric, Minimal Keystrokes (MKS)
  • Counts the total number of keystrokes, including arrow keys to select a completion and the enter key to submit, on a per-search basis
  • In this particular paper: "we consider superstrings of the target query as positive matches, too. That is, in the case that “important people” is suggested instead of “important”, we still treat it as a match"
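
A sketch of MKS under those rules; charging one down-arrow press per suggestion position plus one enter to submit is my reading, and `suggest` is the same assumed function as in the earlier sketches:

```python
def minimal_keystrokes(target, suggest, k=10):
    """Minimal Keystrokes: the cheapest way to submit `target`, where
    every typed character, every down-arrow needed to reach a
    suggestion, and the final enter each count as one keystroke.
    Per the quoted convention, superstrings of the target also match."""
    best = len(target) + 1  # baseline: type everything, press enter
    for i in range(1, len(target) + 1):
        for pos, s in enumerate(suggest(target[:i])[:k]):
            if s.startswith(target):  # exact or superstring match
                best = min(best, i + (pos + 1) + 1)
                break
    return best
```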

https://0x65.dev/blog/2019-12-08/how-do-you-spell-boscodictiasaur.html

  • While not strictly about completion search, this blog post from the (recently deceased) Cliqz search engine goes into some detail about how they used query-count indices to inform their spelling correction. It makes a reasonably strong argument that full-text search, spelling correction, and completion search are all heavily interrelated. In particular, they treat all incoming search queries, whether completions or full text, as partial search strings.