While testing the ES5 plugin, I found a regression and a couple of obvious places for improvement.
- The strings j, n, and jn have the suffixes j, n, and jn removed, leaving an empty string! The command line stemmer didn't do that. Oops.
- Some obviously non-Esperanto words are losing a final j, n, or jn (mostly n), like barn, mann, heyn, djerdj, etc. In Esperanto words, the suffixes j, n, and jn generally follow a vowel.
- Numerals should be inflected with a dash (e.g., 1-oj, 1-a), but are not always, so we get 1a, 1960j, 1980an, etc. Those are easy to recognize, so we should do the right thing.
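
A rough sketch of how those three fixes could fit together, in Python rather than the plugin's actual code. The name `strip_jn` is made up for illustration, and treating a digit like a vowel is only one possible reading of "the right thing" for numerals; it does not normalize the missing dash.

```python
VOWELS = set("aeiou")  # Esperanto vowels


def strip_jn(token: str) -> str:
    """Strip a trailing jn, j, or n, but only when it looks like a real ending.

    Hypothetical helper, not the plugin's implementation. Assumptions:
      * never strip if nothing would be left (bare "j", "n", "jn");
      * only strip when the suffix follows a vowel;
      * a digit counts like a vowel, so undashed numerals such as
        "1960j" are handled too.
    """
    lowered = token.lower()
    # Try the longest suffix first so "jn" is not mistaken for a bare "n".
    for suffix in ("jn", "j", "n"):
        if not lowered.endswith(suffix):
            continue
        stem = token[: len(token) - len(suffix)]
        if not stem:
            return token  # would leave an empty string
        prev = stem[-1].lower()
        if prev in VOWELS or prev.isdigit():
            return stem
        return token  # suffix follows a consonant: probably not Esperanto
    return token


if __name__ == "__main__":
    for word in ["jn", "barn", "djerdj", "hundojn", "1-oj", "1960j", "1980an"]:
        print(word, "->", strip_jn(word))
```

With these rules the bare strings j, n, and jn pass through unchanged, barn, mann, heyn, and djerdj are left alone, and 1960j and 1980an lose their final j and n the same way 1-oj does.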