
[M] Create new search profile for commons that uses weighted_tags
Closed, Resolved, Public

Description

NOTE: T286562 must be done first

Create a new search profile in WikibaseMediaInfo that uses the new weighted_tags data added in T286562

Probably the best approach will be similar to how we currently do a statement_keywords search, i.e. query Wikidata for matches for the search string, and then add a set of match queries against weighted_tags using the returned item IDs.
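
A minimal sketch of the second half of that approach, assuming the Wikidata lookup has already returned a list of item IDs. The function name, the tag prefix and the boost values are hypothetical and only illustrate the shape of an Elasticsearch bool query with one match clause per item ID and prefix; this is not the code from the actual patch.

```python
from typing import Dict, List


def build_weighted_tags_query(item_ids: List[str], tag_prefixes: Dict[str, float]) -> dict:
    """Build a bool query with one match clause per (item ID, tag prefix) pair.

    item_ids:     Wikidata item IDs returned for the search string, e.g. ["Q146"].
    tag_prefixes: hypothetical mapping of weighted_tags prefix -> boost.
    """
    should = []
    for item_id in item_ids:
        for prefix, boost in tag_prefixes.items():
            should.append({
                "match": {
                    "weighted_tags": {
                        # Assumed tag layout: "<prefix>/<item id>"
                        "query": f"{prefix}/{item_id}",
                        "boost": boost,
                    }
                }
            })
    # At least one of the tag clauses has to match for a document to be returned.
    return {"bool": {"should": should, "minimum_should_match": 1}}


query = build_weighted_tags_query(["Q146", "Q144"], {"example.prefix": 2.0})
```

Passing the prefixes and boosts in as data keeps the per-prefix weights tunable without touching the query-building logic.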

The profile should be triggerable via a URL parameter.
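
As a rough illustration of what "triggerable via a URL parameter" could look like, here is a sketch that picks a profile from the query string and falls back to a default. The parameter name `searchProfile` and the profile names are assumptions made for this example, not the names used by WikibaseMediaInfo or CirrusSearch.

```python
from urllib.parse import parse_qs, urlparse

PROFILE_PARAM = "searchProfile"  # hypothetical parameter name
DEFAULT_PROFILE = "default"      # hypothetical profile name
KNOWN_PROFILES = {"default", "weighted_tags"}


def select_profile(url: str) -> str:
    """Return the requested search profile, or the default if absent or unknown."""
    params = parse_qs(urlparse(url).query)
    # parse_qs returns a list per key; use the last value if the key is repeated.
    requested = params.get(PROFILE_PARAM, [DEFAULT_PROFILE])[-1]
    return requested if requested in KNOWN_PROFILES else DEFAULT_PROFILE
```

Falling back to the default for unknown values means a mistyped parameter does not break search.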

Event Timeline

CBogen renamed this task from Create new search profile for commons that uses weighted_fields to [M] Create new search profile for commons that uses weighted_fields. Jul 14 2021, 4:44 PM
Cparle renamed this task from [M] Create new search profile for commons that uses weighted_fields to [M] Create new search profile for commons that uses weighted_tags. Sep 29 2021, 2:23 PM
Cparle updated the task description.

Change 732956 had a related patch set uploaded (by Cparle; author: Cparle):

[mediawiki/extensions/WikibaseMediaInfo@master] Add weighted queries for mediasearch

https://gerrit.wikimedia.org/r/732956

Note that the score returned from this query can be used as a confidence score. Here are the scores for the labeled data we have. Scores below 40 tend to overestimate the likelihood that an image is good, but it's unlikely anyone will be interested in seeing images where we think there's a <40% chance of a good match, so IMO this confidence score is usable.

| Score range | Proportion of images in score range that are good |
| ----- | ----- |
| 0-20 | 0.0704 |
| 20-30 | 0.1673 |
| 30-40 | 0.2635 |
| 40-50 | 0.4068 |
| 50-60 | 0.5278 |
| 60-70 | 0.6628 |
| 70-80 | 0.7358 |
| 80-90 | 0.8989 |
| 90-100 | 0.9512 |
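
One way to use the table above is as a lookup from score to the observed proportion of good images. The sketch below hard-codes the figures from the table; the function name is hypothetical, and treating each range as lower-inclusive (so a score of exactly 20 falls in 20-30) is an assumption, since the table does not say where boundary values belong.

```python
from bisect import bisect_right

# Upper edges of every bin except the last, taken from the table above.
BIN_EDGES = [20, 30, 40, 50, 60, 70, 80, 90]
# Proportion of good images per bin: 0-20, 20-30, ..., 90-100.
PROPORTION_GOOD = [0.0704, 0.1673, 0.2635, 0.4068, 0.5278, 0.6628, 0.7358, 0.8989, 0.9512]


def observed_proportion_good(score: float) -> float:
    """Return the observed proportion of good images for the bin containing score."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    # bisect_right puts a score equal to an edge into the higher bin (assumption).
    return PROPORTION_GOOD[bisect_right(BIN_EDGES, score)]
```

For example, a score of 35 maps to 0.2635, which shows how scores under 40 overstate the chance of a good match.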

Change 732956 merged by jenkins-bot:

[mediawiki/extensions/WikibaseMediaInfo@master] Add weighted_tags queries for mediasearch

https://gerrit.wikimedia.org/r/732956

This can't be tested until T296814 is done, so resolving