ES2.x fuzzy search no longer works as described in the browser tests
Closed, Resolved · Public

Description

Scenario Outline: Searching for <text>~<number between 0 and 1> activates fuzzy search # features/fuzzy_api.feature:19
  When I api search for ffnonesensewor~<number>                                        # features/step_definitions/search_steps.rb:30
  Then Two Words is the first api search result                                        # features/step_definitions/search_steps.rb:312

  Examples: 
    | number |
    | .8     |
    expected [nil] to include "Two Words"
    | 0.8    |
    expected [nil] to include "Two Words"
    | 1      |

Basically, the ~1 postfix still works as before, but ~.8 and ~0.8 return no results where results were expected.

Broken queries:

Details

Related Gerrit Patches:
mediawiki/extensions/CirrusSearch : es2.x : Remove support for float fuzziness

Event Timeline

Restricted Application added a subscriber: Aklapper. Apr 26 2016, 10:32 PM
Restricted Application added projects: Discovery, Discovery-Search. Apr 26 2016, 10:32 PM

Per https://lucene.apache.org/core/5_5_0/queryparser/org/apache/lucene/queryparser/classic/package-summary.html#Fuzzy_Searches:

"Previously, a floating point value was allowed here. This syntax is considered deprecated and will be removed in Lucene 5.0"
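
For anyone who still needs to translate an old float value into the integer edit distance the parser now expects, here is a rough sketch. It is an assumption, not something taken from this task or from the CirrusSearch patch: the formula is modeled on how older Lucene versions are understood to have mapped a similarity to a number of edits, and the names float_to_edits and MAX_EDITS are made up for illustration.

```python
# Hypothetical helper: map a legacy float "minimum similarity" to an integer
# edit distance. The formula is an assumption modeled on older Lucene behavior.

MAX_EDITS = 2  # Lucene fuzzy queries support at most 2 edits


def float_to_edits(min_similarity: float, term_len: int) -> int:
    """Return the number of edits implied by a legacy fuzziness value."""
    if min_similarity >= 1.0:
        # Values of 1 or more are already edit distances; cap them.
        return min(int(min_similarity), MAX_EDITS)
    if min_similarity == 0.0:
        # Zero is treated as an exact match in this sketch.
        return 0
    # Fractions scale with term length: a lower similarity allows more edits.
    return min(int((1.0 - min_similarity) * term_len), MAX_EDITS)


if __name__ == "__main__":
    term = "ffnonesensewor"  # the term used in the failing browser test
    for value in (0.8, 1.0):
        print(value, "->", float_to_edits(value, len(term)))
```

Under these assumptions, short terms with a high similarity end up with zero edits, so a float suffix on a short word behaves like an exact match.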

I checked our search logs, and usage of this feature is minimal: always fewer than 100 requests per day, often much lower. I manually reviewed the queries from one day and not a single one looked like an intentional use of the feature. They all had a floating point number attached to the very last word, and none made sense as a fuzziness setting. It might just be a bot. Either way, I'm going to remove this feature from the browser tests.

hive (wmf_raw)> select day, count(1) from wmf_raw.cirrussearchrequestset where year=2016 and month=4 and requests[0].query REGEXP '\\~0*\\.[0-9]+($| )' group by day;
day     _c1
1       64
2       16
3       7
4       9
5       10
6       20
7       3
8       9
9       3
10      10
11      6
12      7
13      11
14      6
15      4
16      9
17      10
18      28
19      64
20      63
21      26
22      7
23      6
24      13
25      2
26      1
27      3
28      72
29      19
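
To spot-check individual queries outside Hive, the same pattern can be tried in Python. This is only a sketch: the function name uses_float_fuzziness is hypothetical, and it assumes Hive's REGEXP matches anywhere in the string, which is why re.search is used rather than a full match.

```python
import re

# Same pattern as the Hive query above, with the extra backslash escaping
# that the Hive string literal needs removed.
FLOAT_FUZZY = re.compile(r"~0*\.[0-9]+($| )")


def uses_float_fuzziness(query: str) -> bool:
    """Return True if the query has a ~.N or ~0.N suffix on some word."""
    return FLOAT_FUZZY.search(query) is not None


if __name__ == "__main__":
    for q in ("ffnonesensewor~.8", "ffnonesensewor~0.8", "ffnonesensewor~1"):
        print(q, uses_float_fuzziness(q))
```

Note that the pattern only catches values below 1 (an optional run of zeros followed by a decimal point), so something like ~1.5 is not counted.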

Change 286511 had a related patch set uploaded (by EBernhardson):
Remove support for float fuzziness

https://gerrit.wikimedia.org/r/286511

Change 286511 merged by jenkins-bot:
Remove support for float fuzziness

https://gerrit.wikimedia.org/r/286511

Deskana closed this task as Resolved. May 11 2016, 10:45 PM
Deskana triaged this task as Medium priority.