
Search can't find existing entries
Closed, Invalid
Public

Description

Author: espici_2

Description:
I have tried numerous searches for entries which should (and do!)
exist in Wikipedia, but, for some bizarre reason, the search
mechanism can't find them.

For example, I entered "your show of shows" (the classic TV comedy
show, starring Sid Caesar, et al.), both with double-quotes and
without, yet the mechanism fails to find it! However, when I
enter "sid caesar", lo and behold, "Your Show of Shows" appears
in the list of results.

This isn't the only example.

Try it with "all in the family", "big world little adam" (I found
it once, but can't find it now!), etc.


Version: unspecified
Severity: normal
OS: Windows XP
Platform: PC

Details

Reference
bz8104

Event Timeline

bzimport raised the priority of this task to Normal. Nov 21 2014, 9:33 PM
bzimport set Reference to bz8104.
bzimport added a subscriber: Unknown Object (MLST).

jdcrunchman wrote:

I'm having the same problem. This totally SUCKS, because my use of the wiki depends on a reliable search. My searches only succeed about 75% of the time. This is unacceptable to me.

Is there any effort to fix this? Are there other wikis out there that have more reliable search functions?

brion added a comment. Feb 4 2008, 7:29 PM

Regarding the original complaint, most likely the issue was that the reporter looked only for the exact-match "go" results, which wouldn't turn up exact matches under the old case-sensitive search. They should have returned regular search results, however, unless the search server was down or broken at the time (which is possible).

Currently we have a case-insensitive exact match, which returns results for all the above examples except the last (which doesn't appear to exist on English Wikipedia at the moment).
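To illustrate the difference between the old case-sensitive "go" match and a case-insensitive one, here is a minimal Python sketch. The function `go_match` and the small title set are hypothetical and exist only for this example; they are not MediaWiki code, and the real lookup may normalize titles differently.

```python
def go_match(query, titles, case_sensitive=False):
    """Return the stored title that exactly matches the query, or None.

    Hypothetical helper: models only the exact-match "go" step,
    not the regular full-text search that runs when it fails.
    """
    if case_sensitive:
        # Old behaviour as described above: the query must equal the title exactly.
        return query if query in titles else None
    # Case-insensitive behaviour: compare case-folded forms.
    lookup = {title.casefold(): title for title in titles}
    return lookup.get(query.casefold())


# Illustrative titles taken from the report.
titles = {"Your Show of Shows", "All in the Family", "Sid Caesar"}

print(go_match("your show of shows", titles, case_sensitive=True))  # None
print(go_match("your show of shows", titles))  # Your Show of Shows
```

With the case-sensitive variant, a lowercase query misses the title even though the article exists, which matches the symptom in the original report.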

As for this second comment, another likely cause is the 4-character minimum word length in MySQL's default configuration (see the FAQ) and the MySQL stopword list, which ignores many short common words and numbers.
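As a rough sketch of why short or common words can make a query come up empty under those defaults, the Python below drops words shorter than the minimum length and words on a stopword list. The value 4 corresponds to MySQL's default `ft_min_word_len`; the `STOPWORDS` set here is a small illustrative subset chosen for the example, not the actual MySQL list, and `indexable_terms` is a hypothetical helper that does not reproduce MySQL's real tokenizer.

```python
import re

MIN_WORD_LEN = 4  # MySQL's default ft_min_word_len
# Assumption: a tiny illustrative subset, not the full MySQL stopword list.
STOPWORDS = {"all", "in", "the", "of", "your"}


def indexable_terms(query, min_len=MIN_WORD_LEN, stopwords=STOPWORDS):
    """Return the query words that would survive length and stopword filtering."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    return [w for w in words if len(w) >= min_len and w not in stopwords]


print(indexable_terms("all in the family"))  # ['family']
print(indexable_terms("php faq"))            # []
```

A query whose words are all filtered out, such as the second one, leaves nothing to match against the index, so the search returns no results even when relevant pages exist.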

I'm going to go ahead and mark this bug INVALID, as it's not really targeting anything specific at the moment.

Change 215229 had a related patch set uploaded (by Ori.livneh):
Don't rely on strip marker uniqueness

https://gerrit.wikimedia.org/r/215229

Change 215239 had a related patch set uploaded (by Ori.livneh):
Don't rely on strip marker uniqueness

https://gerrit.wikimedia.org/r/215239

Change 215240 had a related patch set uploaded (by Ori.livneh):
Don't rely on strip marker uniqueness

https://gerrit.wikimedia.org/r/215240

Change 215239 merged by jenkins-bot:
Don't rely on strip marker uniqueness

https://gerrit.wikimedia.org/r/215239

Change 215240 merged by jenkins-bot:
Don't rely on strip marker uniqueness

https://gerrit.wikimedia.org/r/215240

Change 215229 merged by jenkins-bot:
Don't rely on strip marker uniqueness

https://gerrit.wikimedia.org/r/215229