
Investigate ranking of search results for a multi-lingual caption search
Closed, Resolved, Public

Description

Investigate how search result scoring works, and how it can be tuned so that results are ranked in the order we would reasonably expect.

Multi-lingual captions are stored in the opening_text field in Elasticsearch.

Event Timeline

Cparle triaged this task as Medium priority. Apr 19 2018, 10:05 AM
Cparle created this task.

OK, here's how to tune the search parameters.

I added the following to settings.d/10-cirrus.php:

$wgCirrusSearchFullTextQueryBuilderProfile = 'commons_profile';

$wgCirrusSearchFullTextQueryBuilderProfiles['commons_profile'] = [
	'builder_class' => \CirrusSearch\Query\FullTextSimpleMatchQueryBuilder::class,
	'settings' => [
		'default_min_should_match' => '1',
		'default_query_type' => 'most_fields',
		'default_stem_weight' => 3.0,
		'fields' => [
			'title' => 0.3,
			'redirect.title' => [
				'boost' => 0.27,
				'in_dismax' => 'redirects_or_shingles'
			],
			'suggest' => [
				'is_plain' => true,
				'boost' => 0.20,
				'in_dismax' => 'redirects_or_shingles',
			],
			'category' => 0.05,
			'heading' => 0.05,
			'text' => [
				'boost' => 0.6,
				'in_dismax' => 'text_and_opening_text',
			],
			'opening_text' => [
				'boost' => 0.888,
				'in_dismax' => 'text_and_opening_text',
			],
			'auxiliary_text' => 0.05,
			'file_text' => 0.5,
		],
		'phrase_rescore_fields' => [
			// very low (don't forget it's multiplied by 10 by default)
			// Use the all field to avoid loading positions on another field,
			// score is roughly the same when used on text
			'all' => 0.06,
			'all.plain' => 0.1,
		],
	],
];
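
To experiment with how much weight captions carry, a single boost can be overridden after the profile is defined instead of editing the whole array. This is a minimal sketch, assuming the commons_profile above has already been set in the same file; the value 1.5 is an arbitrary illustration, not a recommended setting.

```php
// Assumes $wgCirrusSearchFullTextQueryBuilderProfiles['commons_profile']
// has already been defined as above.
// 1.5 is a hypothetical value chosen only to illustrate the override;
// the right number has to be found by re-running test queries.
$wgCirrusSearchFullTextQueryBuilderProfiles['commons_profile']['settings']['fields']['opening_text']['boost'] = 1.5;
```

Changing one weight at a time and re-running the same queries makes it easier to see which field is responsible for a change in ranking.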

Something similar would need to be added to wmf-config (https://github.com/wikimedia/operations-mediawiki-config/tree/master/wmf-config) to apply the same tuning on labs or production.