
Searches that are complex and/or ask for a large number of results falsely return 'no results'
Closed, Declined (Public)

Description

Author: matthew.britton

Description:
Search seems to persistently fail for certain namespace selections, but not others, even others that include those selections.

Example: searching on en.wikipedia for 'huggle'

(article) - works
user only - works
user talk only - fails
project only - fails
project talk only - fails
user + any of [user talk, project, project talk] - succeeds and returns pages from all the selected namespaces

By 'fails', I mean displays the "Note: unsuccessful searches are often caused ..." message, even though results clearly exist.

This has occurred every time I've tried in the last two weeks; reloading, clearing cache, using a different browser, machine or IP address does not fix the problem.

The problem is compounded by the fact that I am unable to use an external search engine to obtain many of these results.


Version: unspecified
Severity: major

Details

Reference
bz16236

Event Timeline

bzimport raised the priority of this task to High. Nov 21 2014, 10:21 PM
bzimport set Reference to bz16236.

rainman wrote:

Cannot reproduce. Was anyone able to reproduce this?

matthew.britton wrote:

OK after some testing it seems to be related to your search preferences.

In particular increasing the "hits per page" option... I had mine at 100 rather than the default of 20, before, but whack it up to something like 1000 and this bug occurs in many more cases.

I guess MediaWiki doesn't like queries that big... it should do something less confusing than just returning the "no matches" message, though.

matthew.britton wrote:

Oh, note also that clicking one of the higher links in the (20 | 50 | 100 | 250 | 500) line will give the same result, and is somewhat quicker than setting your preferences.
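For anyone trying to reproduce this, a minimal sketch of building such a search URL directly is below. It assumes Special:Search accepts a `limit` parameter and per-namespace `nsN=1` flags (namespace 3 being User talk); the helper name `search_url` is hypothetical and only for illustration.

```python
from urllib.parse import urlencode


def search_url(base, term, limit=20, namespaces=(0,)):
    """Build a full-text search URL asking for `limit` hits in the given namespace IDs."""
    params = [
        ("search", term),
        ("title", "Special:Search"),
        ("fulltext", "1"),
        ("limit", str(limit)),  # assumed parameter: hits per page
    ]
    # Assumed convention: one nsN=1 flag per selected namespace.
    params += [(f"ns{n}", "1") for n in namespaces]
    return f"{base}?{urlencode(params)}"


# User talk only (namespace 3), 500 hits per page.
print(search_url("https://en.wikipedia.org/w/index.php", "huggle", limit=500, namespaces=(3,)))
```

Raising `limit` or adding namespaces in the call above should approximate the preference change without touching account settings.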

rainman wrote:

This is likely due to the way results are fetched on the backend. Results are fetched in a single thread, linearly from many index parts, so for large queries the time accumulates to just over 3 seconds, which is the default timeout in MWSearch. Possible solutions would be:

  1. increase the timeout in the MWSearch Http::get call to e.g. 6 seconds - obviously this would produce longer waiting times for the user
  2. fetch results on the backend in parallel - this would add synchronization and slow down typical queries, possibly bogging down searchers with synchronization
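To illustrate the trade-off between the two options, here is a rough sketch, not the actual backend code. Each index part is simulated with a sleep; the function names and the latencies are made up for the example. Sequential fetching costs roughly the sum of the part latencies, while parallel fetching costs roughly the slowest part plus thread overhead.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fetch_part(delay_s):
    """Stand-in for querying one index part; sleeps to simulate its latency."""
    time.sleep(delay_s)
    return [f"hit from a part with {delay_s}s latency"]


def fetch_sequential(delays):
    """Query the parts one after another in a single thread."""
    hits = []
    for delay in delays:
        hits.extend(fetch_part(delay))
    return hits


def fetch_parallel(delays):
    """Query all parts at once; pool.map keeps the original part order."""
    hits = []
    with ThreadPoolExecutor(max_workers=len(delays)) as pool:
        for part_hits in pool.map(fetch_part, delays):
            hits.extend(part_hits)
    return hits


def timed(fetch, delays):
    """Return (hits, elapsed seconds) for one fetch strategy."""
    start = time.perf_counter()
    hits = fetch(delays)
    return hits, time.perf_counter() - start


delays = [0.05, 0.1, 0.15, 0.1]  # made-up per-part latencies
for fetch in (fetch_sequential, fetch_parallel):
    hits, elapsed = timed(fetch, delays)
    print(f"{fetch.__name__}: {len(hits)} hits in {elapsed:.2f}s")
```

The sketch ignores the synchronization cost on a busy searcher, which is the real concern with option 2.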

matthew.britton wrote:

(In reply to comment #5)

This is likely due to the way results are fetched on the backend. Results are fetched in a single thread, linearly from many index parts, so for large queries the time accumulates to just over 3 seconds, which is the default timeout in MWSearch. Possible solutions would be:

  1. increase the timeout in the MWSearch Http::get call to e.g. 6 seconds - obviously this would produce longer waiting times for the user
  2. fetch results on the backend in parallel - this would add synchronization and slow down typical queries, possibly bogging down searchers with synchronization

I'm not too bothered about the actual queries timing out, just the way it's presented to the user. A "search timed out" error message would be fine, just something other than giving the impression there are no results for the query.

rainman wrote:

*** Bug 16522 has been marked as a duplicate of this bug. ***

Bumping -- I think I saw some adjustments for the timeouts recently. Does this take care of it?

matthew.britton wrote:

(In reply to comment #8)

Bumping -- I think I saw some adjustments for the timeouts recently. Does this take care of it?

Some queries (such as the ones in comment #1) that used to fail now no longer do, so to some extent yes.

It is still possible to get a time-out by asking for even more results or using more complex search terms/namespace filters, but as I stated in comment #6, it isn't really the timeout that's the problem, it's that no indication is given to the user that this has happened, and they are led to believe there are no results.

It would be more helpful to tell them to ask for fewer results per page and/or try a less complex search.

There is a difference between the output from a search that returns no results and a timed-out search. A search with no results has a "No page text matches" <h2>, followed by a "Note: Choosing the right search terms..." message. A timed-out search does not have the <h2>, but does have the additional message. Since this difference exists MediaWiki can presumably distinguish the two states and inform the user.
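A minimal sketch of what distinguishing the two states could look like is below. This is not MediaWiki code: the `SearchStatus` and `SearchOutcome` types and the message wording are hypothetical, chosen only to show a timed-out search being reported differently from an empty one.

```python
from dataclasses import dataclass, field
from enum import Enum


class SearchStatus(Enum):
    OK = "ok"
    NO_RESULTS = "no_results"
    TIMED_OUT = "timed_out"


@dataclass
class SearchOutcome:
    """Hypothetical result wrapper that keeps the status separate from the hits."""
    status: SearchStatus
    hits: list = field(default_factory=list)


def user_message(outcome):
    """Pick the message shown to the user based on the status, not on an empty hit list."""
    if outcome.status is SearchStatus.TIMED_OUT:
        return ("The search timed out. Try asking for fewer results per page "
                "or a less complex search.")
    if outcome.status is SearchStatus.NO_RESULTS:
        return "No page text matches."
    return f"{len(outcome.hits)} result(s) found."


print(user_message(SearchOutcome(SearchStatus.TIMED_OUT)))
print(user_message(SearchOutcome(SearchStatus.NO_RESULTS)))
```

The point of the design is that an empty hit list alone never decides the message; the timeout has to be carried through as its own state.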

rainman wrote:

*** Bug 17944 has been marked as a duplicate of this bug. ***

Tested; doesn't happen anymore.

sumanah wrote:

This is happening intermittently but more often in the last 3 weeks or so. Reopening, marking "high" and asking some of our Java volunteers to take a look if they have a moment.

An example query is welcome in order to reproduce...

sumanah wrote:

A search for "sandwich" gets zero results

I'm trying to reproduce it and it's intermittent. It JUST happened now regarding the word "sandwich".

Attached is a screenshot where you see that the search turned up zero results. I had used the search box in the upper right hand corner and
https://www.mediawiki.org/w/index.php?search=sandwich&title=Special%3ASearch&fulltext=1 gave me "There were no results matching the query."

Attached:

Screenshot_at_2012-12-31_11:59:31.png (744×1 px, 167 KB)

sumanah wrote:

A search for "sandwich" gets TWO results less than a minute later

I then searched again (I think it was via that same search box in the upper right hand corner, since as you can see in the screenshot the search term has been trapped there) and got 2 results -- see screenshot. The URL was different this time:

https://www.mediawiki.org/w/index.php?search=sandwich&button=&title=Special%3ASearch

Results:

https://www.mediawiki.org/wiki/Mobile_mockups_for_Android_style

Mobile mockups for Android style
The action bar on Honeycomb and Ice Cream Sandwich tablets is similar to on ICS phones, and could have more or fewer items on it. ...
3 KB (516 words) - 10:40, 10 October 2012

https://www.mediawiki.org/wiki/Language_tools/Daily

Language tools/Daily
Stand-up meeting 2011-12-23 Skype call initiated by _ (08:00 UTC, _ minutes): plan on buying an ice cream sandwich. need to finish document on ...
605 KB (67,542 words) - 14:04, 29 October 2012

Attached:

Screenshot_at_2012-12-31_11:59:50.png (744×1 px, 180 KB)

I'm closing this again as WORKSFORME as this seems to be better handled in bug 42423.