
[L] Compare accuracy of MediaSearch using the new weighted_fields data to data returned from Image Matching Algorithm
Closed, Resolved · Public

Description

NOTE: T286563 must be done first

Tune the new search profile created in T286563:

  1. Create a new dataset for training elasticsearch using the steps described here
  2. Tune the elasticsearch scores used in the new search profile
  3. Run AnalyzeResults.php from https://github.com/cormacparle/media-search-signal-test/ against both the newly-tuned search profile and the old profile
  4. Paste the two sets of scores here (the sketch below illustrates the metrics those scores report)
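For reference, the scores pasted below are standard information-retrieval metrics (F1, precision@k, recall, average precision). The following is only an illustrative sketch of how such figures are computed, not the actual code in AnalyzeResults.php; the sample relevance judgements and totals are invented.

```php
<?php
// Illustrative only: how precision@k, recall, F1 and average precision are
// typically computed from one query's ranked, human-judged results.
// This is NOT AnalyzeResults.php; the sample data below is invented.

function precisionAtK( array $relevance, int $k ): float {
	$top = array_slice( $relevance, 0, $k );
	return count( $top ) ? array_sum( $top ) / count( $top ) : 0.0;
}

function recall( array $relevance, int $totalRelevant ): float {
	return $totalRelevant ? array_sum( $relevance ) / $totalRelevant : 0.0;
}

function f1( float $precision, float $recall ): float {
	return ( $precision + $recall ) > 0
		? 2 * $precision * $recall / ( $precision + $recall )
		: 0.0;
}

function averagePrecision( array $relevance, int $totalRelevant ): float {
	$hits = 0;
	$sum = 0.0;
	foreach ( array_values( $relevance ) as $i => $isRelevant ) {
		if ( $isRelevant ) {
			$hits++;
			$sum += $hits / ( $i + 1 );
		}
	}
	return $totalRelevant ? $sum / $totalRelevant : 0.0;
}

// One query's ranked results: 1 = judged relevant, 0 = not.
// Assume 8 relevant images exist overall for this search term.
$relevance = [ 1, 1, 0, 1, 1, 0, 1, 1, 1, 0 ];
$totalRelevant = 8;

$p10 = precisionAtK( $relevance, 10 );
$r = recall( $relevance, $totalRelevant );
printf( "Precision@10: %.3f\n", $p10 );
printf( "Recall: %.3f\n", $r );
printf( "F1: %.3f\n", f1( $p10, $r ) );
printf( "Average precision: %.3f\n", averagePrecision( $relevance, $totalRelevant ) );
```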

Event Timeline

CBogen renamed this task from Compare accuracy of MediaSearch using the new weighted_fields data to data returned from Image Matching Algorithm to [L] Compare accuracy of MediaSearch using the new weighted_fields data to data returned from Image Matching Algorithm. Jul 14 2021, 4:46 PM

The first iteration of this adds a query on each of the new weighted_tags fields for each wikidata item we have found corresponding to the search term (the same set of wikidata ids we use for statement matching).

See https://gerrit.wikimedia.org/r/c/mediawiki/extensions/WikibaseMediaInfo/+/732956/4/
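The rough shape of that addition is sketched below (assumed structure only, not the actual patch: the tag prefixes, boosts and example ids are placeholders; the real values are in the Gerrit change above):

```php
<?php
// Sketch of the kind of clause iteration 1 adds: one match per weighted_tags
// prefix, per wikidata item already matched to the search term.
// Prefixes, boosts and example ids are placeholders, not the real values.

$wikidataIds = [ 'Q146', 'Q42604' ]; // the ids also used for statement matching
$weightedTagsPrefixes = [
	'image.linked.from.wikidata.p18' => 5.0,  // hypothetical prefix => boost
	'image.linked.from.wikidata.p373' => 2.0,
];

$should = [];
foreach ( $wikidataIds as $id ) {
	foreach ( $weightedTagsPrefixes as $prefix => $boost ) {
		$should[] = [
			'match' => [
				'weighted_tags' => [
					'query' => "$prefix/$id",
					'boost' => $boost,
				],
			],
		];
	}
}

// OR-ed into the existing MediaSearch bool query as extra scoring signals.
$extraClauses = [ 'bool' => [ 'should' => $should ] ];

echo json_encode( $extraClauses, JSON_PRETTY_PRINT ), "\n";
```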

Analysis:
Control

F1 Score: 0.65022753730554
Precision@10: 0.8750981932443
Precision@25: 0.84593190998269
Precision@50: 0.81521739130435
Precision@100: 0.78602620087336
Recall: 0.58093797276853
Average precision: 0.4848566817268

Using weighted tags iteration 1

F1 Score: 0.65741220346006
Precision@10: 0.86902927580894
Precision@25: 0.840820854132
Precision@50: 0.80975185023944
Precision@100: 0.77704576976422
Recall: 0.59890524726312
Average precision: 0.49781306577307

So either the new data is not useful, or I'm not using it in a useful way in the new search profile. Created T296309 to try a different approach.

Iteration 2 (see https://gerrit.wikimedia.org/r/c/mediawiki/extensions/WikibaseMediaInfo/+/732956/6)

Using weighted tags iteration 2

F1 Score: 0.643027356782
Precision@1: 0.95616438356164
Precision@3: 0.9271978021978
Precision@10: 0.89629068887207
Precision@25: 0.85527034407428
Precision@50: 0.82389380530973
Precision@100: 0.77945205479452
Recall: 0.5817523283891
Average precision: 0.48841066947357

So roughly a 1% increase in precision@25 over the control (0.855 vs 0.846), which is probably good enough to make use of in production. Might try one more iteration though...

Iteration 3

Using weighted_tags iteration 3

F1 Score: 0.68837871507386
Precision@1: 0.97822931785196
Precision@3: 0.95238095238095
Precision@10: 0.91096271563717
Precision@25: 0.86794019933555
Precision@50: 0.834179357022
Precision@100: 0.78722237455443
Recall: 0.65648336727766
Average precision: 0.57191605106942

So a little over a 2% improvement in precision@25 over the control (0.868 vs 0.846), with all other scores improved too. A bit disappointing tbh, but enough to proceed with weighted_tags in production.