Fulltext matches and statement matches should both influence how relevant a file is for a search term. We should try to find a good balance where a matching depicts statement significantly impacts the score without overpowering the fulltext scores.
Short example:
- a file with depicts:cat is probably a better match than a file with only a mention of cat somewhere in the description ("Cat" could be the photographer's first name)
- a file with multiple mentions of cat all over the place (title, description, caption, ...) but without depicts:cat is probably still more relevant than one that only has depicts:cat
Things that make this hard:
- there is no consistency in scores across searches: they depend on how frequent the search terms are, both within each individual document and across all documents as a whole
- there is no consistency in full text scores: a search term consisting of multiple words will lead to bigger scores
- there is no consistency between full text & statement scores: while full text scores will grow with more terms, a statement is always just 1 term
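To make the scale mismatch concrete, here is a toy sketch in Python. It assumes a simplified BM25-style IDF sum (no term frequency or length normalization), and the document counts are invented for illustration; it is not the scoring the search backend actually uses.

```python
import math

# Hypothetical corpus statistics, invented for this illustration.
DOC_COUNT = 1_000_000
DOCS_WITH_TERM = {"cat": 50_000, "black": 120_000, "sleeping": 8_000}


def idf(docs_with_term, doc_count=DOC_COUNT):
    """BM25-style inverse document frequency for a single term."""
    return math.log(1 + (doc_count - docs_with_term + 0.5) / (docs_with_term + 0.5))


def toy_fulltext_score(query_terms):
    """Score of a document that matches every query term exactly once.

    Simplification: term frequency and field length are ignored, so the
    score is just the sum of the per-term IDF values.
    """
    return sum(idf(DOCS_WITH_TERM[term]) for term in query_terms)


def toy_statement_score(docs_with_statement):
    """A statement match is a single term, so it contributes a single IDF."""
    return idf(docs_with_statement)


one_word = toy_fulltext_score(["cat"])
three_words = toy_fulltext_score(["black", "sleeping", "cat"])
statement = toy_statement_score(docs_with_statement=20_000)
```

In this toy model `three_words` is larger than `one_word` purely because the query has more words, while `statement` stays a single-term score regardless of how the search was phrased.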
I believe we need to figure out a way to normalize full text scores & statement scores to a similar baseline (though we can't use either score as the baseline, because documents may exist where only the full text matches, or only statements match).
We can weight specific fields/queries relative to one another, but that doesn't help much until we bound their range (which varies with the search term input, which is beyond our control).
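One possible direction, sketched below in Python, is to squash each score into a fixed range before weighting. This is only an illustration: the saturation function, the pivot values and the weights are all assumptions chosen for the example, not something we have tuned or agreed on, and picking good pivots is itself an open question.

```python
def saturate(score, pivot):
    """Map a non-negative score into [0, 1); returns 0.5 when score == pivot."""
    if score <= 0:
        return 0.0
    return score / (score + pivot)


def combined_score(
    fulltext,
    statement,
    fulltext_pivot=10.0,   # hypothetical value
    statement_pivot=5.0,   # hypothetical value
    fulltext_weight=1.0,   # hypothetical value
    statement_weight=1.5,  # hypothetical value
):
    """Weighted sum of the two saturated scores.

    Each part is bounded by its weight, so neither side can grow without
    limit, and a document that matches on only one side still gets a score.
    """
    return (
        fulltext_weight * saturate(fulltext, fulltext_pivot)
        + statement_weight * saturate(statement, statement_pivot)
    )


# Raw scores below are made up to mirror the short example above.
weak_mention_only = combined_score(fulltext=2.0, statement=0.0)
depicts_only = combined_score(fulltext=0.0, statement=5.0)
many_mentions_no_depicts = combined_score(fulltext=40.0, statement=0.0)
```

With these made-up numbers the ordering follows the short example: the file with many mentions ranks above the depicts-only file, which in turn ranks above the file with a single weak mention. Once both ranges are bounded, the weights become meaningful knobs.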