
[M] Add 'custommatch' params to commons config for searching media files using wikidata ids
Open, Needs Triage, Public

Description

NOTE: T296814 must be completed before this task.

User story

As a user, I want to be able to get an image suggestion (with a confidence score) for a particular image id.
As a developer, I want to be able to store suggested image data with confidence scores.
To make this possible, I need to:
a) have appropriate data in the commonswiki search index (T296814 and related tasks)
b) configure Commons media search to make use of the new data (this ticket)


Once the weighted_tags field in the Commons search index is populated with data from Wikidata, we need to enable searching for images via Wikidata ids. The following will need to be added to the Commons config:

$wgMediaInfoCustomMatchFeature = [
	'depicts_or_linked_from' => [
		'fields' => [
			'statement_keywords' => [
				[ 'prefix' => 'P180=', 'boost' => 0.06800689749434177 ], // depicts
				[ 'prefix' => 'P6243=', 'boost' => 0.0001 ], // digital representation of (arbitrary small value)
			],
			'weighted_tags' => [
				[ 'prefix' => 'image.linked.from.wikidata.p18/', 'boost' => 0.9862993091599952 ],
				[ 'prefix' => 'image.linked.from.wikidata.p373/', 'boost' => 7.190793838918551 ],
				[ 'prefix' => 'image.linked.from.wikidata.sitelink/', 'boost' => 5.031161363459293 ],
			],
		],
		// logistic function
		'functionScore' => [
			'scriptCode' => '100 / ( 1 + exp( -1 * ( _score + intercept ) ) )',
			'params' => [ 'intercept' => -1.3459675572537635 ],
		],
	],
];
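
To illustrate how the logistic function in this config maps a raw score onto a 0-100 confidence value, here is a minimal Python sketch. It assumes, as a simplification, that _score is just the sum of the boosts of the clauses that match; the real value is computed by Elasticsearch and may also depend on per-tag weights stored in the index. The names raw_score and confidence are hypothetical and do not exist in the codebase.

```python
import math

# Boosts and intercept copied from the proposed config above.
BOOSTS = {
    "P180=": 0.06800689749434177,
    "P6243=": 0.0001,
    "image.linked.from.wikidata.p18/": 0.9862993091599952,
    "image.linked.from.wikidata.p373/": 7.190793838918551,
    "image.linked.from.wikidata.sitelink/": 5.031161363459293,
}
INTERCEPT = -1.3459675572537635


def raw_score(matched_prefixes):
    """ASSUMPTION: _score is the plain sum of boosts for the matched prefixes."""
    return sum(BOOSTS[prefix] for prefix in matched_prefixes)


def confidence(score, intercept=INTERCEPT):
    """Mirror of the scriptCode: 100 / (1 + exp(-1 * (_score + intercept)))."""
    return 100 / (1 + math.exp(-1 * (score + intercept)))
```

Under this simplification, a file matched only through the p18 tag would score about 41, while one matched through the p373 tag would score above 99, which reflects how much more heavily the p373 and sitelink signals are weighted.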

When this is in place, users will be able to search for images by Wikidata id. For example, custommatch:depicts_or_linked_from=Q146 would find images of cats.
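
As a rough illustration, such a query could be sent through the standard MediaWiki search API. The sketch below only builds the request URL and does not send it. It assumes the custommatch keyword is enabled on Commons with the config from this ticket, and the helper name build_search_url is hypothetical.

```python
from urllib.parse import urlencode

API_ENDPOINT = "https://commons.wikimedia.org/w/api.php"


def build_search_url(profile, qid, limit=10):
    """Build a list=search request for a custommatch query (hypothetical helper)."""
    params = {
        "action": "query",
        "list": "search",
        "srsearch": f"custommatch:{profile}={qid}",
        "srnamespace": 6,  # File namespace, where media files live
        "srlimit": limit,
        "format": "json",
    }
    return f"{API_ENDPOINT}?{urlencode(params)}"


url = build_search_url("depicts_or_linked_from", "Q146")
```

The colon and equals sign in the search string are percent-encoded by urlencode, so the keyword reaches the search backend intact.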

Event Timeline

Cparle updated the task description.
Cparle updated the task description.
CBogen renamed this task from Add 'custommatch' params to commons config for searching media files using wikidata ids to [M] Add 'custommatch' params to commons config for searching media files using wikidata ids. Jan 26 2022, 5:32 PM

We've changed our approach to calculating confidence scores, and are now estimating them before storing image suggestions. This ticket is therefore no longer necessary for image suggestions, as we don't have another use case for getting images for a particular Q-id.

It might, however, prove useful for some other kind of image search in the future, so I'm not closing it.