Review Japanese Morphological Libraries
Open, Medium, Public

Description

Based on research in T171652, look at the following in more detail as possible candidates for creating Elasticsearch language analyzer plugins.

Event Timeline

We've moved on to other tasks and aren't spending time looking at morphological libraries these days.

However, I may spend some 10% time reviewing Kuromoji, the Japanese analyzer endorsed by Elasticsearch, which we previously decided not to use. It may have improved since I last looked at it, and working on Nori for Korean gave me a few new insights into sources of trouble and how to find them (and how to create custom filters and/or open upstream bugs to fix them), so Kuromoji may now pass muster. Deploying it would be a lot less work than finding a third-party analyzer, porting it to an Elasticsearch plugin, and maintaining it.