
MySQL-Search should sort results by relevance
Open, Lowest, Public, Feature

Description

The MySQL-based search engine used by default does not appear to sort results in
any meaningful way. I have written a small extension that extends SearchMySQL4
to sort by relevance (attachment follows), but the data set on my
personal test wiki is not well suited to testing it.

I'm a bit confused about MySQL fulltext search, and thus this extension may be
completely pointless. The relevant documentation is at
http://dev.mysql.com/doc/refman/4.1/en/fulltext-search.html. A few observations:

  • SearchMySQL4 uses the IN BOOLEAN MODE modifier
    (http://dev.mysql.com/doc/refman/4.1/en/fulltext-boolean.html). This appears
    to cause MySQL to report the relevance as 1.0 for anything that matches,
    making my patch pointless. The documentation confirms this behaviour: "They
    do not automatically sort rows in order of decreasing relevance". This also
    confirms the problem this bug report tries to address.

  • After some testing, the way to get a weighted search result with boolean
    matching appears to be this:

SELECT page_id, page_namespace, page_title,
       MATCH(si_text) AGAINST('Quux') AS rank
FROM page, searchindex
WHERE page_id = si_page
  AND MATCH(si_text) AGAINST('Quux' IN BOOLEAN MODE)
  AND page_is_redirect = 0
  AND page_namespace IN (0)
ORDER BY rank DESC

  • For some reason, though, this "sometimes" gives a rank of zero (but still a
    boolean match) on entries that contain the search string (maybe a word-length
    limit? That seems unlikely for the things I've tried). Consequently, not
    using the BOOLEAN modifier at all causes some matches (the ones with rank 0)
    not to show.
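
The query from the observations above can be sketched as a small builder, which makes it easier to see that the search term is bound twice: once in natural-language mode for the score and once in boolean mode for the filter. This is only an illustration, not the attached extension. The function name build_ranked_search, the page_title tie-breaker, and the alias "relevance" are assumptions; the alias replaces "rank" only because RANK is a reserved word in MySQL 8.0. The %s placeholders follow the style used by common Python MySQL drivers. One possible cause of the zero scores, going by the MySQL manual, is the 50% threshold: natural-language mode on MyISAM ignores words found in half or more of the rows, while boolean mode does not apply it, which could matter on a small test wiki.

```python
def build_ranked_search(term, namespaces=(0,)):
    """Return (sql, params) for a boolean-mode match ordered by relevance.

    Hypothetical helper: it keeps the boolean MATCH in the WHERE clause so
    that rows scoring zero in natural-language mode are still returned,
    and sorts them after the rows with a positive score.
    """
    if not namespaces:
        raise ValueError("at least one namespace is required")

    # One placeholder per namespace, e.g. "%s, %s" for two namespaces.
    ns_placeholders = ", ".join(["%s"] * len(namespaces))

    sql = (
        "SELECT page_id, page_namespace, page_title,\n"
        "       MATCH(si_text) AGAINST(%s) AS relevance\n"
        "FROM page, searchindex\n"
        "WHERE page_id = si_page\n"
        "  AND MATCH(si_text) AGAINST(%s IN BOOLEAN MODE)\n"
        "  AND page_is_redirect = 0\n"
        f"  AND page_namespace IN ({ns_placeholders})\n"
        # Assumed tie-breaker so equal scores come back in a stable order.
        "ORDER BY relevance DESC, page_title ASC"
    )

    # The term is bound twice: once for scoring, once for filtering.
    params = (term, term) + tuple(namespaces)
    return sql, params
```

With a DB-API cursor this would be used as cursor.execute(*build_ranked_search('Quux')). Passing the term as a parameter also avoids quoting it by hand.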

As I said, I'm a bit confused, but this is probably worth looking into. The
search feature would be vastly more useful with decent ranking.


Version: 1.7.x
Severity: enhancement

Details

Reference
bz5992

Event Timeline

bzimport raised the priority of this task to Lowest. Nov 21 2014, 9:13 PM
bzimport added a project: MediaWiki-Search.
bzimport set Reference to bz5992.
bzimport added a subscriber: Unknown Object (MLST).

Created attachment 1761
extension modifying SearchMySQL4 to order by rank (not really functional, see initial comment)


Can this very old report now be seen in the light of Cirrus Search?

Nope, Cirrus has nothing to do with the core database-backed search implementation. It's up to core to implement this if it's still desired.

It's not even a problem in Cirrus/MWSearch world at all.

Restricted Application added a subscriber: Aklapper.
Deskana subscribed.

Removing Discovery-Search; we do not support the SQL search in core as it is not used in production on Wikimedia wikis.

Aklapper changed the subtype of this task from "Task" to "Feature Request". Feb 4 2022, 11:02 AM