
Take another look at weighting of languages for search results
Closed, Declined · Public

Description

Based on the update that was accomplished with T68829, we should take another look at this comment:

We've noticed that the weight on incoming_links can, in some cases, outweigh the language-based rescore. So for the example query:
https://www.mediawiki.org/w/index.php?title=Special%3ASearch&profile=default&search=throttle+prefix%3AManual%3APywikibot&fulltext=Search&uselang=it&cirrusDumpResult&cirrusExplain=pretty
Pywikibot/Global Options

text based scoring: 0.05815
incoming links weight: 2.576
language:en weight: 2.5
final score: 0.37453112

Pywikibot/Global Options/it

text based scoring: 0.05815
incoming links weight: 0.90309
language:it weight: 5.0
final score: 0.26257026
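
To make the comparison concrete, here is a quick arithmetic check. It assumes the three factors are simply multiplied together; that is an inference from the numbers above (the products match the reported final scores to within rounding of the displayed values), not something confirmed from the scoring code. The function and variable names are made up for illustration.

```python
from math import isclose

def combined_score(text_score, links_weight, language_weight):
    """Hypothetical helper: assumes the final score is the plain product
    of the three factors listed in the explain output."""
    return text_score * links_weight * language_weight

# Values copied from the explain output above.
en = combined_score(0.05815, 2.576, 2.5)    # Pywikibot/Global Options
it = combined_score(0.05815, 0.90309, 5.0)  # Pywikibot/Global Options/it

# Under the multiplicative assumption, the language:it weight at which
# the /it page would tie with the English page (text scores are equal,
# so they cancel out).
break_even_it_weight = 2.576 * 2.5 / 0.90309

print(f"en product: {en:.5f} (reported 0.37453112)")
print(f"it product: {it:.5f} (reported 0.26257026)")
print(f"break-even language:it weight: {break_even_it_weight:.2f}")
```

If that assumption holds, the language:it weight of 5.0 is not enough here: with identical text scores, it would need to be above roughly 7.13 for the Italian page to outrank the English one for a user with uselang=it.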

Event Timeline

debt created this task. Oct 14 2016, 5:27 PM
Restricted Application added a subscriber: Aklapper. Oct 14 2016, 5:27 PM

Some of this will change soon with BM25, since the weights (like on incoming links) will be invalidated. This can be reviewed after that.

Also, this only affects mediawiki.org, so this isn't super pressing in terms of priority.

Nemo_bis added a comment. Edited Oct 30 2016, 8:00 PM

This report is a duplicate of T56832 btw. I mean: we could try to split this report into subproblems, if we don't want a single very generic one.

debt closed this task as Declined. Oct 26 2017, 5:12 PM

We don't think that the machine learning that we're doing on search would affect this, mostly because there isn't enough data on mediawiki. Declining this ticket.