**Note**: this ticket has been rewritten to reflect new data gathered in Feb 2021
@Miriam and @AikoChou have run a logistic regression on the data gathered by https://media-search-signal-test.toolforge.org/ to transform the score for each search signal into a probability that an image is a good image for the search term.
We need to translate this into an initial Elasticsearch search profile, with a query builder that computes the probability of an image being good from the results of the logistic regression and returns **that** as the score.
Implementation
---
The probability of an image being good, based on the Elasticsearch score for each search field, is:
```
1 / ( 1 + exp( -1 * ( ( coefficient_for_field_A * score_for_field_A ) + ( coefficient_for_field_B * score_for_field_B ) + ... + intercept ) ) )
```
| field | coefficient |
| ----- | ----- |
| descriptions.plain | -0.02159655 |
| descriptions | 0.04977869 |
| title | 0.05276279 |
| title.plain | 0.04668993 |
| category | 0.02615154 |
| category.plain | 0.02615154 |
| redirect.title | 0.00636762 |
| redirect.title.plain | 0.01282327 |
| suggest | -0.02016103 |
| auxiliary_text | -0.02192281 |
| auxiliary_text.plain | 0.02003289 |
| text | -0.04702737 |
| text.plain | 0.03902806 |
| statements | 0.10792615 |
The `statements` score in this case is a `dis_max` over matches on the `statement_keywords` field, where `P180=` and `P6243=` are each concatenated with each of the top 50 entity matches (via `MediaSearchEntitiesFetcher`), **without** any additional boost/decay.
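
As a sanity check on the formula and coefficients above, here is a minimal Python sketch of the calculation outside Elasticsearch. The function name `probability_of_good_image` is made up for illustration, and the intercept is left as a parameter because its value is not listed in this ticket.

```python
import math

# Coefficients copied from the table in this ticket.
COEFFICIENTS = {
    "descriptions.plain": -0.02159655,
    "descriptions": 0.04977869,
    "title": 0.05276279,
    "title.plain": 0.04668993,
    "category": 0.02615154,
    "category.plain": 0.02615154,
    "redirect.title": 0.00636762,
    "redirect.title.plain": 0.01282327,
    "suggest": -0.02016103,
    "auxiliary_text": -0.02192281,
    "auxiliary_text.plain": 0.02003289,
    "text": -0.04702737,
    "text.plain": 0.03902806,
    "statements": 0.10792615,
}


def probability_of_good_image(scores, intercept):
    """Apply the logistic formula to per-field Elasticsearch scores.

    scores: dict mapping field name -> score for that field.
            Fields that did not match can simply be left out (they add 0).
    intercept: the regression intercept (NOT given in this ticket, so the
               caller has to supply it).
    """
    linear = intercept + sum(
        COEFFICIENTS[field] * score for field, score in scores.items()
    )
    return 1.0 / (1.0 + math.exp(-1.0 * linear))
```

With no matching fields and an intercept of 0 the result is exactly 0.5; a positive-coefficient field such as `statements` pushes it above 0.5, and a negative-coefficient field such as `text` pushes it below.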
This will need to be implemented using `function_score` or similar queries in Elasticsearch.
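
One possible shape for the `function_score` part, sketched in Python as a plain query body. This assumes the inner query has already been built so that its `_score` equals the weighted sum of field scores (coefficient times score, summed over fields); how to build that inner query is not covered here. The helper name `wrap_with_logistic` and the `intercept` parameter name are hypothetical, and the intercept value is not given in this ticket.

```python
def wrap_with_logistic(inner_query, intercept):
    """Wrap a query in function_score so the final score is a probability.

    inner_query: an Elasticsearch query body (dict) whose _score is ASSUMED
                 to already be the weighted sum of the per-field scores.
    intercept:   the regression intercept, supplied by the caller.
    """
    return {
        "function_score": {
            "query": inner_query,
            "script_score": {
                "script": {
                    # Same logistic formula as above, applied to _score.
                    "source": "1 / (1 + Math.exp(-1 * (_score + params.intercept)))",
                    "params": {"intercept": intercept},
                }
            },
            # Replace the inner score with the script result instead of
            # multiplying the two together (the default).
            "boost_mode": "replace",
        }
    }
```

For example, `wrap_with_logistic({"match_all": {}}, 0.0)` returns a body that can be placed under the `query` key of a search request. The logistic output always lies between 0 and 1, so the script never returns a negative score.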
Testing
---
See T271801 for how to test each profile that we construct in this way and decide whether it is better or worse.