Boost search results with exact phrase match
Closed, Resolved · Public

Description

Reproduce: https://en.wikipedia.org/w/index.php?search=Ingo+Heinrich&title=Special:Search&go=Go
Expected: https://en.wikipedia.org/wiki/Chris_Gueffroy appears at the top of the results, since it contains the exact search term (and is the only article that does).
Actual: The page is not within the top 100 results; to find it easily we need to put double quotes around the search term.

Event Timeline

Restricted Application added a subscriber: Aklapper.

@Aklapper I think this is slightly different.

Technical notes: the MLR profile pushed the expected result down to around rank #160, while the retrieval query ranks it at #30. This query already ranked poorly before MLR (the expected result was not in the top 10).
I don't think we include any phrase match feature in the MLR feature set. Without digging too deeply into this case, I don't see any other obvious feature that could help this particular query.
@EBernhardson would it be possible to include phrase match on text and/or text.plain as part of the feature evaluation you're running?
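For illustration only, here is a minimal sketch of what a phrase match feature captures, and of the kind of Elasticsearch `match_phrase` query body that could back it on `text` and `text.plain`. The helper names (`tokenize`, `phrase_match_count`, `phrase_feature_queries`) and the toy tokenizer are assumptions made for this example; they are not the actual CirrusSearch feature definitions, and the sample documents are made up.

```python
import re


def tokenize(text):
    """Toy analyzer (assumption): lowercase and split on non-word characters."""
    return re.findall(r"\w+", text.lower())


def phrase_match_count(query, document):
    """Count positions where the query tokens occur consecutively and in order."""
    q, d = tokenize(query), tokenize(document)
    if not q:
        return 0
    # Slide a window of len(q) tokens over the document and compare.
    return sum(1 for i in range(len(d) - len(q) + 1) if d[i:i + len(q)] == q)


def phrase_feature_queries(query, fields=("text", "text.plain")):
    """Build one match_phrase query body per field (hypothetical feature layout)."""
    return {field: {"match_phrase": {field: {"query": query}}} for field in fields}


# Made-up documents: both contain both terms, only the first contains the phrase.
doc_with_phrase = "An article that mentions Ingo Heinrich exactly once."
doc_without_phrase = "Heinrich appears here, and Ingo appears somewhere else."

print(phrase_match_count("Ingo Heinrich", doc_with_phrase))
print(phrase_match_count("Ingo Heinrich", doc_without_phrase))
print(phrase_feature_queries("Ingo Heinrich"))
```

A bag-of-words feature cannot tell the two sample documents apart, since both contain both terms. The phrase count can, which is the signal being requested here.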

dcausse renamed this task from Boost search results with exact search term to Boost search results with exact phrase match. Feb 14 2018, 1:45 PM

I can add a phrase match, but I'm not sure a phrase match will be enough to push this single occurrence all the way to the top. For this specific page a redirect containing the name might do the trick, but that's not as generalizable. Longer term I think named entity extraction has some potential here. A phrase match and a named entity match might be enough, but that needs to be evaluated.

I have a test model up that currently pushes Chris to the top: https://en.wikipedia.org/wiki/Special:Search?search=Ingo+Heinrich&fulltext=1&cirrusMLRModel=20180118-query_explorer-enwiki-v2

Edit: Updated the model to one that is significantly less expensive; we should be able to A/B test this one after the next cluster restart.
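To try other queries against a test model, the link above can be rebuilt with a different search term. A small sketch follows; the `search`, `fulltext` and `cirrusMLRModel` parameters are taken from that link, while the helper name `mlr_search_url` is made up for this example.

```python
from urllib.parse import urlencode

# Base page and model name as they appear in the test link above.
SEARCH_PAGE = "https://en.wikipedia.org/wiki/Special:Search"
TEST_MODEL = "20180118-query_explorer-enwiki-v2"


def mlr_search_url(query, model=TEST_MODEL):
    """Build a full-text search URL that requests a specific MLR model (hypothetical helper)."""
    params = {"search": query, "fulltext": 1, "cirrusMLRModel": model}
    # urlencode encodes spaces as '+', matching the form of the original link.
    return SEARCH_PAGE + "?" + urlencode(params)


print(mlr_search_url("Ingo Heinrich"))
```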

EBjune triaged this task as Medium priority. Feb 22 2018, 6:18 PM
EBjune moved this task from needs triage to Current work on the Discovery-Search board.
debt subscribed.

Nice!