
[M] Tweak impact of quality assessments on results score
Closed, Resolved, Public


Quality assessment templates are already being used to boost certain results more than others.
Let's make sure those boosts make sense compared to the rest of the scoring.

Event Timeline

CBogen renamed this task from Include quality assessments to tweak rank of results to [M] Include quality assessments to tweak rank of results. Jul 15 2020, 4:31 PM
CBogen updated the task description. (Show Details)

Note: I believe (some of) this already exists via Cirrus boosting usage of certain templates/categories, in which case we might simply be able to use that.

Yeah, I guess we don't need this any longer then.

Reopening to investigate the impact of those assessments, since I think they currently have too much power: they can multiply result scores by up to 9.84375.
A very good example where this is unwanted is a search for "king" - all of the top results got pushed there because they have positive assessments, but they're quite irrelevant.
Images include:
#1 image of a queen bee, because it has depicts:monarch (Q118) - a good result, but shouldn't be #1; simply has a massive boost
#3 Martin Luther King Jr. - makes sense to include, but should score lower than at least a few monarchs
#4 King's Cave - same as #2
#5 description contains "painter of the King" - should score lower
#6 and most of the irrelevant results after that: pictures by User:King_of_Hearts - ideally these shouldn't even be included, but we can't omit them if the name is part of the description... they should score much, much lower, though

If I exclude assessments, the top match is (which is a really good match - by title, statement, caption & description)
With current assessment boosts, 35 other poorer matches outscore it...

We should probably lower the impact of assessments. While a good signal for image quality (which certainly deserves a boost), it currently comes at the cost of significantly decreased relevance.

matthiasmullie renamed this task from [M] Include quality assessments to tweak rank of results to [M] Tweak impact of quality assessments on results score. Oct 2 2020, 1:30 PM

AIUI, the template boosts are defined here:
On their own, those numbers (somewhat) make sense (though I'd still be inclined to tweak them a little based on usage patterns, see below).
But: all those boosts are currently conditions of a bool query, so they all get multiplied together if an image has multiple assessments.
And that's exactly what we have:

  • Template:Picture_of_the_day images are chosen out of Template:Assessments/commons/featured files
  • Template:Assessments could include all, but definitely always includes Template:Assessments/commons/featured files (no community consensus for the other categories)

Every "picture of the day" image will get a 5.625 boost (POTD + featured + assessment)
And there is a pretty good amount of overlap between all these pictures in the first place, so if it's also a Valued and Quality image, it becomes a whopping 17.22... boost.
This basically means that such images easily rank first for any vaguely matching search term (see earlier example) - they massively overpower any other score.

Potential solutions:

  1. Use dis_max instead of a bool query; that would stop the compounding effect (but could be problematic for other wikis?)
  2. Lower the scores significantly
    1. Onwiki? (is there an established process for editing
    2. Ignore wiki config via $wgCirrusSearchIgnoreOnWikiBoostTemplates (as done for enwiki) and override numbers in config?

FYI: here are the current usage patterns:

Template    # current files with template

I'd propose changing the boosts for all of these templates to 1.25 (if we stick with multiplying them in a bool query) to keep the boosts somewhat reasonable when they're combined.

In isolation, that would basically mean that they'd already get these boosts:

Featured (because also assessment by definition): 1.56
POTD (because also featured + assessment by definition): 1.95

Combined, they could still rise to 3 (still a significant boost, but hey...)
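
The figures above can be checked with a short sketch. It assumes, as proposed, that every matching template contributes a flat 1.25 multiplier and that the multipliers compound. The helper name combined_boost is made up for illustration, and the count of five templates is an assumption based on the templates named in this task (assessment, featured, POTD, Valued, Quality).

```python
# Sketch: compounding of a flat per-template boost when boosts are multiplied.
# Assumption: each matching template multiplies the score by the same factor.

def combined_boost(per_template, matching_templates):
    """Return the total multiplier for a file matching N boosted templates."""
    return per_template ** matching_templates


if __name__ == "__main__":
    cases = [
        ("Featured (+ assessment)", 2),
        ("POTD (+ featured + assessment)", 3),
        ("All five templates (assumed count)", 5),
    ]
    for label, count in cases:
        print(f"{label}: {combined_boost(1.25, count):.2f}")
```

Two templates give roughly 1.56, three give roughly 1.95, and five give roughly 3.05, which lines up with the "could still rise to 3" estimate.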

Thoughts? How to proceed to change these boosts?

If the problem to solve is that pages are tagged with more than one of these templates, I'd suggest the simple approach you proposed (dismax), but done by setting score_mode = max in includes/Search/Rescore/BoostTemplatesFunctionScoreBuilder.php. Template boosting is rarely used, and I'm sure that most of the time the values have been adjusted with only one matching template in mind, so I'm sure this change would also benefit the rare other wikis using this feature.
If the problem is more about regaining control over template boosting, because the way the boosts are applied is not compatible with the ranking formula being implemented, I'd suggest setting up a dedicated rescore profile; this will give more flexibility to tune these settings. The issue is that wgCirrusSearchBoostTemplates and wgCirrusSearchIgnoreOnWikiBoostTemplates are global to all query builders.
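
To make the score_mode suggestion concrete, here is a rough sketch of how an Elasticsearch function_score query combines per-template weights under "multiply" (the Elasticsearch default) versus "max". The weights are placeholders taken from the values reported at the end of this task, the "template" field name is an assumption, and both helpers are hypothetical; this is not the CirrusSearch implementation.

```python
from math import prod

# Placeholder weights for illustration only (assumed, not read from the wiki).
TEMPLATE_BOOSTS = {
    "Picture of the day": 2.0,
    "Valued image": 1.5,
    "Quality image": 1.25,
}


def build_function_score(base_query, boosts, score_mode):
    """Build a function_score query body with one weight function per template.

    The "template" field name is an assumption made for this sketch.
    """
    return {
        "function_score": {
            "query": base_query,
            "functions": [
                {"filter": {"match": {"template": name}}, "weight": weight}
                for name, weight in boosts.items()
            ],
            "score_mode": score_mode,   # how matching functions are combined
            "boost_mode": "multiply",   # how the result is applied to the query score
        }
    }


def combine_weights(matching_templates, boosts, score_mode):
    """Mimic the combination step for the templates present on a page."""
    weights = [boosts[t] for t in matching_templates if t in boosts]
    if not weights:
        return 1.0  # no function matched: the score is left unchanged
    if score_mode == "multiply":
        return prod(weights)
    if score_mode == "max":
        return max(weights)
    raise ValueError(f"unsupported score_mode: {score_mode}")


if __name__ == "__main__":
    page = ["Picture of the day", "Valued image", "Quality image"]
    print(combine_weights(page, TEMPLATE_BOOSTS, "multiply"))
    print(combine_weights(page, TEMPLATE_BOOSTS, "max"))
```

With "multiply", a file carrying all three templates compounds every weight; with "max", only the single largest weight applies, so the combined boost can never exceed the highest configured value.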

Slightly related and might perhaps be worth reconsidering is T202339.

Using max would probably suffice here - will do that tomorrow.

If we wanted to actually change the scores currently defined on-wiki (just minor tweaks), how would we go about this?
Can we just go change those on-wiki? Do we need to consult anyone? Something else?

I don't think a formal process exists for changing these values on-wiki. My experience with these values has been:

  • on enwiki, I disabled them through wgCirrusSearchIgnoreOnWikiBoostTemplates because they were incompatible with the switch to BM25; at the time, only the original author of CirrusSearch had set them there, so as a CirrusSearch maintainer I took the liberty of disabling them
  • on wikitech, these values are actively maintained by wiki admins

On Commons, the history shows the values were changed twice. Since editing them requires some special permissions, I think changing this on Commons would require asking an admin to do the operation.

Change 632435 had a related patch set uploaded (by Matthias Mullie; owner: Matthias Mullie):
[mediawiki/extensions/CirrusSearch@master] Multiply score only by best matching template boost, not all

Alright thanks. I have a patch up to simply use the max score, which would resolve the problem in this ticket.

As for slightly tweaking the scores so that they make more sense, I've asked the person who made the previous 2 changes on-wiki to make sure there's no existing community involvement or reasons for the existing scores that we're not aware of.

Change 632435 abandoned by Matthias Mullie:
[mediawiki/extensions/CirrusSearch@master] Multiply score only by best matching template boost, not all

This isn't working out, going forward with a different solution

Change 632780 had a related patch set uploaded (by Matthias Mullie; owner: Matthias Mullie):
[mediawiki/extensions/WikibaseMediaInfo@master] Register custom rescore profile to limit impact of templates

Change 632780 merged by jenkins-bot:
[mediawiki/extensions/WikibaseMediaInfo@master] Register custom rescore profile to limit impact of templates

Patch (to only use max score instead of allowing them to combine to massive scores) merged.
On-wiki change of boosts also done.

Etonkovidova claimed this task.
Etonkovidova added a subscriber: Etonkovidova.

Checked in commons wmf.21 (it was not practical to check in betalabs)
function_score doesn't show values over 2, e.g.

description: "full_text search for 'king'"
path: "commonswiki/page/_search"

Quality image: 1.25
Valued image: 1.5
Picture of the day: 2