
EPIC: Test if a reverse field can help to display more and better suggestions
Closed, Resolved · Public


To test whether a reverse field is worth pursuing as a way to display more and better suggestions, here is a method based on Trey's idea of limiting the test to the subset of data the suggester needs to run.

The big picture is:

  • Dump data from production (title and redirect)
  • Build an Elasticsearch index in lab
  • Run a set of "phrase suggester" queries extracted from search logs against this index
  • Count and measure the results
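As a rough sketch of the last two steps, the request body for a phrase suggester query with an optional reverse generator could be built as below, and the responses counted afterwards. The field names `suggest` and `suggest.reverse`, the analyzer name `reverse`, the suggestion name `did_you_mean` and both function names are assumptions made for illustration; they are not taken from the CirrusSearch configuration.

```python
def build_phrase_suggest_body(text, field="suggest", reverse_field=None,
                              prefix_length=2):
    """Build a search request body that only asks for a phrase suggestion.

    `field` and `reverse_field` are hypothetical index field names.
    """
    generators = [{
        "field": field,
        "suggest_mode": "always",
        "prefix_length": prefix_length,
    }]
    if reverse_field is not None:
        # The reverse generator analyzes the input with a reversing analyzer
        # (assumed here to be named "reverse" in the index settings) and
        # reverses the candidates back before they are scored.
        generators.append({
            "field": reverse_field,
            "suggest_mode": "always",
            "prefix_length": prefix_length,
            "pre_filter": "reverse",
            "post_filter": "reverse",
        })
    return {
        "size": 0,  # no search hits needed, only the suggestion
        "suggest": {
            "text": text,
            "did_you_mean": {
                "phrase": {
                    "field": field,
                    "size": 1,
                    "direct_generator": generators,
                },
            },
        },
    }


def count_queries_with_suggestion(responses, name="did_you_mean"):
    """Count the responses that contain at least one suggestion option."""
    hits = 0
    for response in responses:
        entries = response.get("suggest", {}).get(name, [])
        if any(entry.get("options") for entry in entries):
            hits += 1
    return hits
```

Running the same set of logged queries once with `reverse_field=None` and once with a reverse field, then comparing the two counts, would give a first measure of how many more suggestions the reverse field produces. It says nothing about whether those suggestions are better.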

This task is marked as EPIC because it needs prior work:

  • Add an option to filter a subset of fields to dumpIndex
  • Write a small script that runs phrase suggester queries
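The field filter could look roughly like the sketch below. It assumes the dump is available as one JSON document per line, which may not match the actual dumpIndex output, and the function name `filter_fields` is hypothetical rather than an existing option.

```python
import json


def filter_fields(dump_lines, keep=("title", "redirect")):
    """Yield each dumped document reduced to the fields listed in `keep`.

    Assumes one JSON document per line (an assumption about the dump format).
    """
    for line in dump_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        doc = json.loads(line)
        reduced = {key: doc[key] for key in keep if key in doc}
        yield json.dumps(reduced, sort_keys=True)
```

Keeping only title and redirect makes the dump small enough to index in a test environment while still giving the suggester the data it runs on.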

If this task is validated, I think it will be a good method for testing further enhancements we plan to make to "Did you mean" suggestions.

Event Timeline

dcausse created this task. Jul 30 2015, 2:47 PM
dcausse raised the priority of this task to Needs Triage.
dcausse updated the task description.
dcausse added subscribers: dcausse, TJones, EBernhardson.
Restricted Application added a subscriber: Aklapper. Jul 30 2015, 2:47 PM
dcausse set Security to None.
dcausse added a project: CirrusSearch.
Restricted Application added a project: Discovery. Jul 30 2015, 2:48 PM
Deskana added a subscriber: Deskana.

Thanks for filing this task David! I've put it into the backlog; let's review it in a sprint planning meeting before we pull it into the sprint.

Ironholds moved this task from Needs triage to Search on the Discovery board. Aug 4 2015, 8:17 AM
Deskana moved this task from Search to Product Epics on the Discovery board. Aug 13 2015, 8:47 PM
Restricted Application added a subscriber: StudiesWorld. Dec 31 2015, 5:02 AM
Deskana triaged this task as Normal priority. Dec 31 2015, 5:02 AM
Restricted Application added a project: Discovery-Search. Apr 12 2016, 10:13 PM

@dcausse Did we do this? If not, do you think it's still worth doing at some point? I'm not sure how to prioritise it.

dcausse closed this task as Resolved. Dec 16 2016, 10:15 AM
dcausse claimed this task.

@Deskana yes, it was tested as part of the BM25 A/B test; this was the "Track typos in first 2 characters" bucket. Unfortunately, the analysis concluded that it had a bad impact on CTR. In the end we still do not have a sane solution for tracking typos in the first 2 chars. We could maybe try reducing the prefix length to 1. We were concerned about server load at that time, but with the new nodes we plan to add we could maybe reevaluate our position.
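To illustrate why the prefix length matters here, the toy sketch below mimics candidate generation with a required common prefix. It is a simplified model written for this comment, not the Elasticsearch implementation: the vocabulary, the function names and the plain Levenshtein distance are all assumptions.

```python
def edit_distance(a, b):
    """Plain Levenshtein distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def candidates(term, vocabulary, prefix_length=2, max_edits=2, reverse=False):
    """Return the words that share the required prefix and are close enough.

    With reverse=True both strings are reversed first, so the prefix
    constraint applies to the end of the word instead of the start.
    """
    found = []
    for word in vocabulary:
        t, w = (term[::-1], word[::-1]) if reverse else (term, word)
        if t[:prefix_length] != w[:prefix_length]:
            continue  # the first prefix_length characters must match exactly
        if edit_distance(t, w) <= max_edits:
            found.append(word)
    return found


vocabulary = ["wikipedia", "wikidata"]
forward = candidates("wkipedia", vocabulary)                   # typo in the 2nd character
reversed_ = candidates("wkipedia", vocabulary, reverse=True)   # prefix check on the reversed word
shorter = candidates("wkipedia", vocabulary, prefix_length=1)  # prefix length reduced to 1
```

In this model the forward lookup finds nothing for "wkipedia" because "wk" differs from "wi", while both the reversed lookup and the lookup with a prefix length of 1 find "wikipedia". A shorter prefix lets more terms through to the edit-distance check, which is where the server load concern comes from.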