Investigate enabling Nynorsk Light Stemmer
Open, Normal, Public

Description

While looking into T147959, I noticed that both Bokmål (nb) and Nynorsk (nn) are explicitly configured to use the Norwegian language analyzer. Elastic has a Nynorsk light stemmer, which might do better for Nynorsk than the standard Norwegian analysis, which is listed as being for Bokmål.
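For illustration only, here is a rough sketch of what swapping the stemmer could look like at the Elasticsearch level. It is not the actual CirrusSearch configuration. It assumes a custom analyzer modeled on Elastic's documented breakdown of the built-in `norwegian` analyzer (standard tokenizer, lowercase, Norwegian stop words, stemmer), with the stemmer `language` set to `light_nynorsk`. The names `nynorsk_text`, `nynorsk_stemmer`, and `nynorsk_analysis_settings` are made up for this example.

```python
import json


def nynorsk_analysis_settings(stemmer_language="light_nynorsk"):
    """Build index settings for a custom analyzer that uses the given stemmer.

    Hypothetical helper: the analyzer and filter names are illustrative
    and are not taken from CirrusSearch.
    """
    return {
        "settings": {
            "analysis": {
                "filter": {
                    # Same stop word list the built-in Norwegian analyzer uses.
                    "norwegian_stop": {
                        "type": "stop",
                        "stopwords": "_norwegian_",
                    },
                    # The only change from the Bokmål-oriented chain:
                    # "norwegian" is replaced by "light_nynorsk".
                    "nynorsk_stemmer": {
                        "type": "stemmer",
                        "language": stemmer_language,
                    },
                },
                "analyzer": {
                    "nynorsk_text": {
                        "type": "custom",
                        "tokenizer": "standard",
                        "filter": [
                            "lowercase",
                            "norwegian_stop",
                            "nynorsk_stemmer",
                        ],
                    }
                },
            }
        }
    }


if __name__ == "__main__":
    # Print the JSON body that would be sent when creating a test index.
    print(json.dumps(nynorsk_analysis_settings(), indent=2, ensure_ascii=False))
```

Passing `"norwegian"` as `stemmer_language` yields the Bokmål-oriented chain, so the two settings bodies could be used to build a pair of test indexes and compare how the same Nynorsk text is analyzed.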

This would be a test of the differences caused by changing nn.wikipedia.org to the Nynorsk analyzer, probably including speaker review (unless it is obviously horrible). If it looks like an improvement, we would deploy and re-index.

Event Timeline

TJones created this task. (Oct 10 2017, 8:00 PM)
Restricted Application added subscribers: jeblad, Danmichaelo, jhsoby, Aklapper. (Oct 10 2017, 8:00 PM)

Changing the Bokmål Wikipedia to use the Nynorsk analyzer is probably a bad idea … :-)

jhsoby updated the task description. (Oct 10 2017, 8:05 PM)

@jhsoby—Ha! Thanks! Yeah, we want to test nn (Nynorsk) not no (Bokmål)—though presumably if we tested no it would become clear that it was a bad idea.

debt triaged this task as Normal priority. (Oct 12 2017, 5:08 PM)
debt moved this task from needs triage to This Quarter on the Discovery-Search board.