On the cloudelastic Elasticsearch replica we now have a learning-to-rank model trained on the labeled image data we've collected (see https://phabricator.wikimedia.org/T271803#6823874).
The next step is to write a new search profile in WikibaseMediaInfo, activated by a URL parameter, that uses the model. See https://elasticsearch-learning-to-rank.readthedocs.io/en/latest/searching-with-your-model.html for how to search using a model.
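Following the plugin documentation linked above, the query such a profile needs to produce is a rescore that wraps an `sltr` query. A rough sketch of the request body, printed by a small shell function so it can be posted with curl. The function name, the model name, the `keywords` parameter, the `text` field, the window size and the `commonswiki_file` index name are all placeholders or assumptions, not the actual values used by the trained model:

```shell
#!/bin/sh
# Hypothetical helper: print an Elasticsearch request body that rescores the
# top hits of a simple match query with a stored LTR model.
#   $1 = search keywords (NOT JSON-escaped here; keep to plain words)
#   $2 = name of the stored model (placeholder)
ltr_rescore_body() {
    keywords=$1
    model=$2
    cat <<EOF
{
  "query": { "match": { "text": "$keywords" } },
  "rescore": {
    "window_size": 100,
    "query": {
      "rescore_query": {
        "sltr": {
          "params": { "keywords": "$keywords" },
          "model": "$model"
        }
      }
    }
  }
}
EOF
}

# Example (assumes the tunnel described in this task is up and that the index
# is called commonswiki_file):
# ltr_rescore_body 'cat' 'my_model' | curl -s -H 'Content-Type: application/json' \
#     -d @- 'https://cloudelastic1001.wikimedia.org:9243/commonswiki_file/_search'
```

The new profile would need to build the equivalent structure in PHP; the shell version is only meant for trying the model by hand.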
We probably shouldn't merge the profile for now (the model is not in production). Instead, test it by pointing a local dev environment's search URL at cloudelastic (setup instructions below) and running the AnalyzeResults script from https://github.com/cormacparle/media-search-signal-test against the local search API.
Acceptance criteria:
- a new search profile that uses the trained model
- a set of results from the AnalyzeResults script (see T271801)
To set up your local environment to search using cloudelastic:
outside vagrant:
ssh -n -L0.0.0.0:9243:cloudelastic1001.wikimedia.org:9243 mwdebug1002.eqiad.wmnet "sleep 36000"
inside vagrant:
sudo sh -c 'echo "10.0.2.2 cloudelastic1001.wikimedia.org" >> /etc/hosts'
in LocalSettings.php:
<?php
$wgCirrusSearchClusters = [
    'default' => [
        [
            'host' => 'cloudelastic1001.wikimedia.org',
            'port' => 9243,
            'transport' => 'Https',
        ],
    ],
];
// Activate devel options useful for relforge
$wgCirrusSearchDevelOptions = [
    'morelike_collect_titles_from_elastic' => true,
    'ignore_missing_rev' => true,
];
$wgCirrusSearchIndexBaseName = 'commonswiki';
$wgCirrusSearchNamespaceMappings[ NS_FILE ] = 'file';
// Undo global config that includes commons files in other wikis search results
unset( $wgCirrusSearchExtraIndexes[ NS_FILE ] );
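Once the tunnel, the hosts entry and the LocalSettings.php changes are in place, one quick way to confirm that the VM can reach cloudelastic is to request the cluster health endpoint from inside vagrant. A minimal sketch, assuming curl is available in the VM; `cloudelastic_url` is a hypothetical helper, and curl's `-k` flag may be needed if the certificate does not validate through the tunnel:

```shell
#!/bin/sh
# Hypothetical helper: build a URL on the tunnelled cloudelastic endpoint.
# Host and port match the ssh tunnel and the /etc/hosts entry.
#   $1 = path on the Elasticsearch server, without a leading slash
cloudelastic_url() {
    printf 'https://%s:%s/%s\n' 'cloudelastic1001.wikimedia.org' '9243' "$1"
}

# Run inside vagrant; a JSON response containing a "status" field suggests
# the tunnel and hosts entry are working.
# curl -s "$(cloudelastic_url '_cluster/health')"
```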