We have started collecting click logs for Wikidata item autocomplete, but are not yet sure how best to use them. We will need to review the relevant autocomplete literature to see how this has been tackled before. There are two main goals for this data:
- Perform offline evaluation of an autocomplete algorithm to be compared to the current production ranker
- Provide input data to a learning algorithm. Candidates include the TensorFlow-based Elasticsearch query optimizer, and potentially other approaches that turn up in the literature review.
Once the literature review is complete and multiple options have been identified, we will build one or more of the systems. For this epic, the offline evaluation is to be fully implemented and the data inputs for the learning algorithm should be generated; the full learning pipeline does not necessarily need to be implemented.
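
As a rough illustration of what the offline evaluation could look like, here is a minimal sketch that replays logged clicks against a ranker and scores it with mean reciprocal rank (MRR). Everything in it is an assumption for discussion: the `ClickEvent` fields, the `Ranker` signature, the toy data, and the choice of MRR itself, which the literature review may well replace with something better suited. Note also that clicks were collected under the production ranker, so a naive replay like this is likely biased toward items production already showed.

```python
from typing import Callable, Iterable, List, NamedTuple


class ClickEvent(NamedTuple):
    # Hypothetical log schema; the real click logs may look different.
    prefix: str       # text the user had typed when they clicked
    clicked_id: str   # item the user selected


# A ranker maps a typed prefix to an ordered list of item ids (assumed interface).
Ranker = Callable[[str], List[str]]


def mean_reciprocal_rank(events: Iterable[ClickEvent], ranker: Ranker, k: int = 10) -> float:
    """Average 1/rank of the clicked item within the top k results.

    Events whose clicked item is missing from the top k contribute 0.
    """
    total = 0.0
    count = 0
    for event in events:
        results = ranker(event.prefix)[:k]
        if event.clicked_id in results:
            total += 1.0 / (results.index(event.clicked_id) + 1)
        count += 1
    return total / count if count else 0.0


# Toy data and toy rankers, purely illustrative.
events = [ClickEvent("ber", "Q64"), ClickEvent("par", "Q90")]


def production(prefix: str) -> List[str]:
    return {"ber": ["Q1", "Q64"], "par": ["Q90"]}.get(prefix, [])


def candidate(prefix: str) -> List[str]:
    return {"ber": ["Q64", "Q1"], "par": ["Q90", "Q2"]}.get(prefix, [])


production_mrr = mean_reciprocal_rank(events, production)
candidate_mrr = mean_reciprocal_rank(events, candidate)
print(f"production={production_mrr:.2f} candidate={candidate_mrr:.2f}")
```

Keeping the ranker behind a plain function interface means the same replay can score the production ranker and any candidate on identical events, which is the comparison the first goal asks for.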