
Build out an API that exposes ElasticSearch suggester results for a given query
Closed, Resolved · Public

Description

For the first prototype I propose to:

  1. Design a script similar to inplace reindex that builds the completion suggester index
  2. Write a scoring function that uses only the data available in cirrus today plus the geo tags available in the db
  3. Implement about 4 suggestion variants at the same time (exact, stopwords, fuzzy, fuzzy+stopwords)
  4. Expose an API
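As an illustration of step 3, here is a minimal sketch of how the four variants could be bundled into one suggest request body. It is not the implementation used for this task. The field names `suggest` and `suggest_stop`, the variant keys, and the fuzziness value are all hypothetical; the body shape follows the `text` + `completion` layout of the Elasticsearch completion suggester as I understand it, and should be checked against the Elasticsearch version in use.

```python
import json


def build_suggest_request(text, field="suggest", stop_field="suggest_stop", fuzziness=1):
    """Return one request body holding four completion-suggester variants.

    `field` and `stop_field` are hypothetical index fields: the first is assumed
    to be analyzed as-is, the second with stopwords removed.
    """

    def completion(target_field, fuzzy):
        body = {"field": target_field}
        if fuzzy:
            # Assumed fuzziness setting; tune against real queries.
            body["fuzzy"] = {"fuzziness": fuzziness}
        return {"text": text, "completion": body}

    return {
        "exact": completion(field, fuzzy=False),
        "stopwords": completion(stop_field, fuzzy=False),
        "fuzzy": completion(field, fuzzy=True),
        "fuzzy_stopwords": completion(stop_field, fuzzy=True),
    }


if __name__ == "__main__":
    print(json.dumps(build_suggest_request("exampl"), indent=2))
```

Sending all four variants in one request keeps the round trip count at one; how the four result lists are merged and ranked is left open here.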

Further enhancements could be:

  • Use T44259 to include a new score component that reflects the popularity of a page
  • Think about how we could include some basic NLP functions to help with name suggestions (this will require studying the current architecture to decide where NLP analysis fits in the current process)

Event Timeline

Deskana created this task. Jul 13 2015, 9:26 PM
Deskana raised the priority of this task from to Normal.
Deskana updated the task description.
Deskana added a project: Discovery.
Deskana added a subscriber: Deskana.
Restricted Application added a subscriber: Aklapper. Jul 13 2015, 9:26 PM
dcausse claimed this task. Jul 17 2015, 7:50 AM
dcausse updated the task description.
dcausse set Security to None.
dcausse updated the task description. Jul 17 2015, 7:58 AM
TJones added a subscriber: TJones. Jul 22 2015, 3:33 PM
dcausse removed dcausse as the assignee of this task. Aug 7 2015, 1:44 PM
Deskana closed this task as Resolved. Aug 31 2015, 6:18 PM

Well, the API's there now, so this is resolved.

https://en.wikipedia.org/w/api.php?action=cirrus-suggest&text=example

It doesn't quite work yet though because the index hasn't been built. T110922 is for that.

Some initial tests of the efficacy of the suggestion API were done in T109729.
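For anyone scripting against the endpoint linked above, here is a small sketch that builds the request URL for an arbitrary query string. The `action` and `text` parameters are taken from that link; the helper name `build_suggest_url` is made up for this example, and the sketch only constructs the URL without checking whether the action is currently served or what it returns.

```python
from urllib.parse import urlencode

# Endpoint taken from the link in the comment above.
API_ENDPOINT = "https://en.wikipedia.org/w/api.php"


def build_suggest_url(text, endpoint=API_ENDPOINT):
    """Build a cirrus-suggest URL; urlencode escapes spaces and special characters."""
    params = {"action": "cirrus-suggest", "text": text}
    return endpoint + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_suggest_url("example"))
    # prints https://en.wikipedia.org/w/api.php?action=cirrus-suggest&text=example
```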