For the first prototype I propose to:
- Design a script, similar to the in-place reindex, that builds the completion suggester index
- Write a scoring function that uses only the data available in cirrus today plus the geo tags available in the db
- Implement about four suggestion variants at the same time (exact, stopwords, fuzzy, fuzzy+stopwords)
- Expose an API
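
To make the scoring function and the four variants in the list above more concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the inputs (`incoming_links`, `text_bytes`, `has_geo_tag`), the weights, the caps, the stopword list and the 0..1000 score range are hypothetical and not taken from the existing code. Fuzzy matching is assumed to be enabled at query time, so it appears here only as a flag.

```python
import math

# Hypothetical stopword list, for illustration only.
STOPWORDS = {"a", "an", "of", "the"}

# The four variants from the proposal: (strip stopwords?, allow fuzzy matching?)
VARIANTS = {
    "exact": (False, False),
    "stopwords": (True, False),
    "fuzzy": (False, True),
    "fuzzy+stopwords": (True, True),
}


def strip_stopwords(title):
    """Lowercase the title and drop stopwords; keep all tokens if nothing is left."""
    tokens = title.lower().split()
    kept = [t for t in tokens if t not in STOPWORDS]
    return " ".join(kept or tokens)


def suggestion_score(incoming_links, text_bytes, has_geo_tag,
                     max_links=1_000_000, max_bytes=500_000):
    """Combine per-page signals into an integer score in 0..1000.

    The signals and weights are assumptions: log-scaled incoming links
    (70%), log-scaled text size (20%) and a flat bonus for geo-tagged
    pages (10%).
    """
    links = min(math.log1p(incoming_links) / math.log1p(max_links), 1.0)
    size = min(math.log1p(text_bytes) / math.log1p(max_bytes), 1.0)
    geo = 1.0 if has_geo_tag else 0.0
    return round((0.7 * links + 0.2 * size + 0.1 * geo) * 1000)


def build_suggestion_inputs(title):
    """Return, per variant, the text to index and whether fuzzy matching applies."""
    plain = title.lower()
    stripped = strip_stopwords(title)
    return {
        name: {"input": stripped if strip else plain, "fuzzy": fuzzy}
        for name, (strip, fuzzy) in VARIANTS.items()
    }
```

The index-building script could call `build_suggestion_inputs` and `suggestion_score` once per page and store the results as the suggestion inputs and weight. The log scaling is a design choice to keep a few heavily linked pages from dominating the ranking; the real weights would need tuning against actual data.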
Further enhancements could be:
- Use T44259 to include a new score component that reflects the popularity of a page
- Investigate how basic NLP functions could help with name suggestions (this requires studying the current architecture first: where would NLP analysis fit in the current process?)
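
As a rough illustration of the popularity enhancement, the sketch below blends an existing score with a popularity signal. The function name, the `popularity` input (some per-page count, whatever T44259 ends up providing), the cap and the 30% weight are all hypothetical; it also assumes the base score is an integer in 0..1000.

```python
import math


def with_popularity(base_score, popularity,
                    max_popularity=1_000_000, weight=0.3):
    """Blend a 0..1000 base score with a log-scaled popularity signal.

    `weight` is the share of the final score given to popularity;
    both it and `max_popularity` are placeholder values.
    """
    pop = min(math.log1p(popularity) / math.log1p(max_popularity), 1.0)
    return round((1 - weight) * base_score + weight * pop * 1000)
```

Keeping popularity as a separate, weighted term means the first prototype's score can stay unchanged and the new component can be tuned or disabled independently.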