If I search for "dog" on Commons, here is the query I get
Notice that one of the match queries is on description.en with boost=0.019
If I search for "chó" (Vietnamese for "dog") on Commons, here is the query I get
Notice that there is no match query on description.vi, because we don't have stemming for Vietnamese. description.vi.plain is present in the query, but its boost is set to zero, so matches on the Vietnamese description contribute nothing to the score.
This means that search will be less accurate for languages for which we don't have stemming (e.g. Vietnamese, Cebuano, Bengali).
Proposed fix:
- if the boost for a language-aware non-stemmed (.plain) field is zero AND there is no stemmed version of that field, then set its boost to the value the stemmed field would have had
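
The rule above could be sketched roughly as follows. This is only an illustration in Python, not the actual search code: the function name `fix_plain_boosts`, the flat `field -> boost` dict, and the `stemmed_defaults` lookup (the boost a stemmed field of that kind would normally get, e.g. 0.019 for description) are all assumptions made for the example, as is the `description.en.plain` field name.

```python
from typing import Dict

PLAIN_SUFFIX = ".plain"


def fix_plain_boosts(
    field_boosts: Dict[str, float],
    stemmed_defaults: Dict[str, float],
) -> Dict[str, float]:
    """Return a copy of field_boosts with the proposed fallback applied.

    field_boosts: hypothetical map of query field name -> boost,
        e.g. {"description.vi.plain": 0.0}.
    stemmed_defaults: hypothetical map of base field name -> the boost a
        stemmed field would get, e.g. {"description": 0.019}.
    """
    fixed = dict(field_boosts)
    for field, boost in field_boosts.items():
        if not field.endswith(PLAIN_SUFFIX):
            continue  # only language-aware non-stemmed fields are affected
        # "description.vi.plain" -> "description.vi"
        stemmed_field = field[: -len(PLAIN_SUFFIX)]
        # "description.vi" -> "description"
        base_field = stemmed_field.split(".", 1)[0]
        has_stemmed_version = stemmed_field in field_boosts
        if boost == 0 and not has_stemmed_version and base_field in stemmed_defaults:
            fixed[field] = stemmed_defaults[base_field]
    return fixed


# Vietnamese: no stemmed field, so the plain field inherits the boost.
vi_boosts = fix_plain_boosts({"description.vi.plain": 0.0}, {"description": 0.019})

# English: a stemmed field exists, so the plain field is left at zero.
en_boosts = fix_plain_boosts(
    {"description.en": 0.019, "description.en.plain": 0.0},
    {"description": 0.019},
)
```

With this sketch, languages that already have a stemmed field are untouched, so the change would only affect languages such as Vietnamese where the plain field is the only language-aware field available.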