"San Lorenzo (quartiere di Napoli)" not first match when searching the words in different order
Closed, Resolved · Public

Description

https://it.wikipedia.org/w/index.php?title=Speciale:Ricerca&search=quartiere+san+lorenzo+napoli&go=Vai

I'd expect a 100% title match (apart from stopwords, word order and punctuation) to be first.

Restricted Application added projects: Discovery, Discovery-Search. · Jun 29 2016, 10:36 PM
Restricted Application added subscribers: Zppix, Aklapper.
EBernhardson added a subscriber: EBernhardson. · Edited · Jun 29 2016, 10:46 PM

See https://phabricator.wikimedia.org/T125083#2055892 for what is wrong and why it's hard to fix. Discovery's next quarterly goal is to switch to BM25 and remove the referenced "all" field, which will finally make it possible to address issues like this.

The short answer: search, as it was implemented a couple of years ago, boosts the weight of title matches by copying the title multiple times into a field called the "all" field and searching against that field. This completely prevents optimizations such as ranking a page higher when all query words match its title.
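To make the limitation concrete, here is a rough toy sketch in Python. It is not the actual CirrusSearch or Elasticsearch code: the function names, the copy count of 3, the term-count scoring and the two sample documents are all assumptions for illustration. It only shows how a single combined field can let a long page that repeats the query words outscore the page whose title contains every query word, and why that signal is only recoverable when the title is kept as its own field.

```python
import re


def tokenize(text):
    # Lowercase and split on non-word characters, so punctuation is ignored.
    return re.findall(r"\w+", text.lower())


def build_all_field(doc, title_copies=3):
    # Hypothetical emulation of the "all" field: the title is copied several
    # times into one combined bag of tokens. The copy count is an assumption.
    return tokenize(doc["title"]) * title_copies + tokenize(doc["body"])


def all_field_score(query, doc):
    # Toy scoring: count query-term occurrences in the combined field.
    # Field boundaries are gone, so title hits and body hits look alike.
    tokens = build_all_field(doc)
    return sum(tokens.count(term) for term in tokenize(query))


def all_terms_in_title(query, doc):
    # Only possible when the title is available as a separate field.
    title_tokens = set(tokenize(doc["title"]))
    return all(term in title_tokens for term in tokenize(query))


# Made-up sample documents; the second body stands in for a long article
# that mentions the query words many times.
district = {
    "title": "San Lorenzo (quartiere di Napoli)",
    "body": "zona del centro storico",
}
city = {
    "title": "Napoli",
    "body": "napoli napoli napoli napoli quartiere quartiere san san lorenzo lorenzo",
}

query = "quartiere san lorenzo napoli"

for doc in (district, city):
    print(doc["title"], all_field_score(query, doc), all_terms_in_title(query, doc))
```

With separate fields, a ranker could add an explicit bonus when `all_terms_in_title` holds; with only the combined field, that information is not available at query time.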

debt triaged this task as Normal priority. · Jul 1 2016, 4:40 PM
debt moved this task from Needs triage to This Quarter on the Discovery-Search board.

Thanks for the triage.

Deskana closed this task as Resolved. · Nov 10 2016, 3:26 AM
Deskana claimed this task.
Deskana added a subscriber: Deskana.

Now that we've switched over to BM25, this is fixed!

Looks so indeed (at least for the original example), nice!