EPIC: Test if a reverse field can help to display more and better suggestions
Closed, Resolved, Public

Description

To test whether a reverse field is worth pursuing as a way to display more and better suggestions, here is a method based on Trey's idea of limiting the test to the subset of the data that the suggester needs to run.

The big picture is:

  • Dump data from production (title and redirect)
  • Build an Elasticsearch index in lab
  • Run a set of "phrase suggester" queries extracted from search logs against this index
  • Count and measure the results
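As an illustration of the third step, here is a minimal sketch of what a phrase suggester request body with an extra reverse-field generator could look like. The field names `suggest` and `suggest.reverse`, the suggestion name `did_you_mean`, and the `reverse` analyzer name are assumptions made for this example, not the actual CirrusSearch configuration; the `reverse` analyzer would have to be defined in the test index settings.

```python
def build_suggest_body(text, field="suggest", reverse_field="suggest.reverse",
                       prefix_length=2):
    """Build a phrase suggester body with a forward and a reverse generator.

    All field and analyzer names are hypothetical placeholders.
    """
    return {
        "size": 0,  # only the suggestions are of interest, not the hits
        "suggest": {
            "text": text,
            "did_you_mean": {
                "phrase": {
                    "field": field,
                    "direct_generator": [
                        # Forward generator: candidates must share the first
                        # prefix_length characters with the input term.
                        {
                            "field": field,
                            "suggest_mode": "always",
                            "prefix_length": prefix_length,
                        },
                        # Reverse generator: terms are reversed before lookup
                        # and reversed back afterwards, so the prefix
                        # constraint applies to the end of the word instead.
                        {
                            "field": reverse_field,
                            "suggest_mode": "always",
                            "prefix_length": prefix_length,
                            "pre_filter": "reverse",
                            "post_filter": "reverse",
                        },
                    ],
                }
            },
        },
    }
```

Running the same logged queries once with only the forward generator and once with both would give the two result sets to compare.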

This task is marked as EPIC because it needs prior work:

  • Add an option to filter a subset of fields to dumpIndex
  • Write a small script that runs phrase suggester queries
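The counting part of such a script could be sketched as follows. It assumes the responses have already been collected as parsed JSON in the standard suggest response shape, and that the suggestion is named `did_you_mean` (a hypothetical name chosen for this example). How the queries are sent is left out.

```python
def count_suggestions(response, name="did_you_mean"):
    """Return the number of suggestion options in one search response."""
    entries = response.get("suggest", {}).get(name, [])
    return sum(len(entry.get("options", [])) for entry in entries)


def summarize(responses, name="did_you_mean"):
    """Aggregate simple counts over a list of search responses."""
    counts = [count_suggestions(r, name) for r in responses]
    return {
        "queries": len(counts),
        "with_suggestion": sum(1 for c in counts if c > 0),
        "total_options": sum(counts),
    }
```

Comparing the `with_suggestion` figure between a run with and a run without the reverse field gives a first measure of "more" suggestions; judging "better" would still need a manual or click-based evaluation.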

If this task is validated I think it will be a nice method to test further enhancements we plan to make to "Did you mean" suggestions.

Event Timeline

dcausse raised the priority of this task from to Needs Triage.
dcausse updated the task description.
dcausse added subscribers: dcausse, TJones, EBernhardson.
dcausse set Security to None.
dcausse added a project: CirrusSearch.
Deskana subscribed.

Thanks for filing this task David! I've put it into the backlog; let's review it in a sprint planning meeting before we pull it into the sprint.

@dcausse Did we do this? If not, do you think it's still worth doing at some point? I'm not sure how to prioritise it.

dcausse claimed this task.

@Deskana yes, it was tested as part of the BM25 A/B test; this was the "Track typos in first 2 characters" bucket. Unfortunately, the analysis concluded that it had a bad impact on CTR. In the end we still do not have a sane solution to track typos in the first 2 chars. We could maybe try reducing the prefix length to 1. We were concerned about server load at that time, but with the new nodes we plan to add we could maybe reevaluate our position.
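To make the prefix length constraint concrete, here is a simplified illustration (not how the suggester is implemented): it only checks whether a candidate shares the required leading characters with the typed term, and ignores edit distance and scoring. The function name is made up for this example.

```python
def passes_prefix_check(typed, candidate, prefix_length, reverse=False):
    """Check whether candidate shares the first prefix_length chars with typed.

    With reverse=True both strings are reversed first, which mimics what a
    reverse field does: the constraint then applies to the end of the word.
    """
    if reverse:
        typed, candidate = typed[::-1], candidate[::-1]
    return typed[:prefix_length] == candidate[:prefix_length]


# Typo in the second character: "wk..." instead of "wi...".
print(passes_prefix_check("wkiipedia", "wikipedia", 2))                # False
print(passes_prefix_check("wkiipedia", "wikipedia", 2, reverse=True))  # True
print(passes_prefix_check("wkiipedia", "wikipedia", 1))                # True
```

A prefix length of 1 would still miss a typo in the very first character, which only the reversed comparison catches in this simplified model.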