Based on the research in T171652, look at the following in more detail as possible candidates for Elasticsearch language analyzer plugins.
|Status|Assigned|Task|
|Invalid|None|T174065 [FY 2017-18 Objective] Improve support for searching in multiple languages|
|Open|None|T154511 [Tracking] Research, test, and deploy new language analyzers|
|Resolved|TJones|T171652 Language Analysis Morphological Library Research Spike|
|Open|None|T178923 Review Japanese Morphological Libraries|
- Mentioned In
- T317476: Filter and sort search results of Japanese kana search queries in accordance with how much of the query appears as a consecutive substring
- T171652: Language Analysis Morphological Library Research Spike
- Mentioned Here
- T171652: Language Analysis Morphological Library Research Spike
We've moved on to other tasks and aren't spending time looking at morphological libraries these days.
However, I may spend some 10% time reviewing Kuromoji, the Japanese analyzer endorsed by Elasticsearch, which we previously decided not to use. It may have improved since I last looked at it. Also, after working on Nori for Korean, I have a few new insights into likely sources of trouble and how to find them, and into fixing them with custom filters and/or upstream bug reports, so maybe Kuromoji can pass muster. Deploying it would be a lot less work than finding a third-party analyzer, porting it to an Elasticsearch plugin, and maintaining it.
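
For anyone picking this up, one way to make room for custom filters is to "unpack" the stock `kuromoji` analyzer into an equivalent custom analyzer and splice extra filters into the chain. The sketch below only builds the index settings as a Python dict; it does not talk to a cluster. The tokenizer and token filter names are the components the analysis-kuromoji plugin documentation lists for the `kuromoji` analyzer. Everything else is an assumption for illustration: the analyzer name `ja_unpacked`, the helper `unpacked_kuromoji_settings`, and the filter `ja_custom_fix` with its placeholder pattern are hypothetical, not anything we have deployed or tested.

```python
import json

# Token filters of the stock `kuromoji` analyzer, in order, as listed in the
# analysis-kuromoji plugin documentation. Its tokenizer is `kuromoji_tokenizer`.
KUROMOJI_FILTERS = [
    "kuromoji_baseform",
    "kuromoji_part_of_speech",
    "cjk_width",
    "ja_stop",
    "kuromoji_stemmer",
    "lowercase",
]


def unpacked_kuromoji_settings(name="ja_unpacked", insert_before=None, filter_defs=None):
    """Build index settings for a custom analyzer mirroring `kuromoji`.

    insert_before: maps a stock filter name to a custom filter name that
                   should run immediately before it (hypothetical names).
    filter_defs:   definitions for those custom filters, keyed by name.
    """
    insert_before = insert_before or {}
    chain = []
    for stock_filter in KUROMOJI_FILTERS:
        if stock_filter in insert_before:
            chain.append(insert_before[stock_filter])
        chain.append(stock_filter)
    return {
        "settings": {
            "analysis": {
                "filter": dict(filter_defs or {}),
                "analyzer": {
                    name: {
                        "type": "custom",
                        "tokenizer": "kuromoji_tokenizer",
                        "filter": chain,
                    }
                },
            }
        }
    }


# Hypothetical custom filter: strips zero-width spaces from tokens.
# The pattern is a placeholder, not a known Kuromoji problem.
custom_defs = {
    "ja_custom_fix": {
        "type": "pattern_replace",
        "pattern": "\u200b",
        "replacement": "",
    }
}

settings = unpacked_kuromoji_settings(
    insert_before={"ja_stop": "ja_custom_fix"},
    filter_defs=custom_defs,
)

if __name__ == "__main__":
    print(json.dumps(settings, indent=2, ensure_ascii=True))
```

The resulting JSON could be used as the body of an index-creation request on a test index with the analysis-kuromoji plugin installed, after which the `_analyze` API can compare `ja_unpacked` against the stock `kuromoji` analyzer on the same sample text.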