Searching in non-English languages doesn't return expected results
Closed, Resolved (Public)

Description

For example

https://commons.wikimedia.org/w/index.php?search=chien&title=Special:MediaSearch&go=Go&type=image&uselang=fr

contains a lot of images with 'chien' in the title before we get any images of dogs, despite the uselang=fr parameter. Statement matches are rated more highly than title matches, so that is unexpected.
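To reproduce the comparison across interface languages, the query URL can be built programmatically. The following is a minimal sketch that assumes only the query parameters visible in the URL above; the helper name `media_search_url` is made up for illustration and is not part of MediaWiki.

```python
from urllib.parse import urlencode

BASE_URL = "https://commons.wikimedia.org/w/index.php"


def media_search_url(term, uselang):
    """Build a Special:MediaSearch image query for a given interface language.

    Hypothetical helper: it only mirrors the parameters seen in the example URL.
    """
    params = {
        "search": term,
        "title": "Special:MediaSearch",
        "go": "Go",
        "type": "image",
        "uselang": uselang,
    }
    # safe=":" keeps the colon in "Special:MediaSearch" unescaped,
    # so the result matches the URL form used in the description.
    return BASE_URL + "?" + urlencode(params, safe=":")


# Same search term, different interface languages, for a side-by-side check.
for lang in ("fr", "en"):
    print(media_search_url("chien", lang))
```

Opening both URLs side by side shows whether the uselang value actually changes the ranking.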

Event Timeline

Change 690403 had a related patch set uploaded (by Cparle; author: Cparle):

[mediawiki/extensions/WikibaseMediaInfo@master] Remove 'wikibase' profile, add lang to cache key

https://gerrit.wikimedia.org/r/690403
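The patch title mentions adding the language to the cache key. As a rough illustration of why that matters (this is not the extension's actual code, which is PHP; every name below is hypothetical), a cache keyed without the language would return the same cached results to French and English users for the same term:

```python
import hashlib

# Hypothetical in-memory cache standing in for whatever cache the extension uses.
_cache = {}


def search_cache_key(term, media_type, lang):
    """Derive a key from every input that can change the results, including lang."""
    raw = "\x1f".join(("mediasearch", term, media_type, lang))
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


def cached_search(term, media_type, lang, backend):
    """Return cached results for (term, media_type, lang), calling backend on a miss."""
    key = search_cache_key(term, media_type, lang)
    if key not in _cache:
        _cache[key] = backend(term, media_type, lang)
    return _cache[key]
```

With the language in the key, a search for 'chien' under uselang=fr and the same search under uselang=en are cached separately.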

Comparison of results from the analysis tool before and after this change

before change:

F1 Score      | 0.59321010027788
Precision@10  | 0.82402707275804
Precision@25  | 0.79368213228036
Precision@50  | 0.76342624065262
Precision@100 | 0.72581429265921
Recall        | 0.53000863557858

after change:

F1 Score      | 0.59456193353474
Precision@10  | 0.82993197278912
Precision@25  | 0.79860418743769
Precision@50  | 0.76543209876543
Precision@100 | 0.72519455252918
Recall        | 0.53108808290155

... so all scores are slightly improved except Precision@100, which drops marginally (0.7258 to 0.7252)
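The before/after comparison can be checked by subtracting the two sets of figures. A small sketch using the values reported above (the function name `metric_deltas` is illustrative and not part of the analysis tool):

```python
# Figures copied from the analysis tool output above.
before = {
    "F1 Score": 0.59321010027788,
    "Precision@10": 0.82402707275804,
    "Precision@25": 0.79368213228036,
    "Precision@50": 0.76342624065262,
    "Precision@100": 0.72581429265921,
    "Recall": 0.53000863557858,
}
after = {
    "F1 Score": 0.59456193353474,
    "Precision@10": 0.82993197278912,
    "Precision@25": 0.79860418743769,
    "Precision@50": 0.76543209876543,
    "Precision@100": 0.72519455252918,
    "Recall": 0.53108808290155,
}


def metric_deltas(before, after):
    """Return after - before for every metric present in 'before'."""
    return {name: after[name] - before[name] for name in before}


for name, delta in metric_deltas(before, after).items():
    direction = "up" if delta > 0 else "down" if delta < 0 else "unchanged"
    print(f"{name:<14} {delta:+.5f} ({direction})")
```

With these figures, Precision@100 is the only metric with a negative delta; the other five increase.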

Change 690403 merged by jenkins-bot:

[mediawiki/extensions/WikibaseMediaInfo@master] Remove 'wikibase' profile, add lang to cache key

https://gerrit.wikimedia.org/r/690403

Etonkovidova added a subscriber: Etonkovidova.

For comparative testing on commons wmf.9

The improvement is noticeable for the search term 'chien' when French is set as the UI language. All other cases were checked for regressions, and all seem to be fine.