
Evaluate (and maybe fix) suggestion feature result changes in ES 2.x
Closed, Resolved · Public

Description

A couple of our browser tests related to suggestions (the "did you mean" feature) are failing in 2.x:

Scenario: Customize max term freq did you mean suggestions                    # features/did_you_mean_api.feature:72
  When I set did you mean suggester option cirrusSuggMaxTermFreq to 0.0000001 # features/step_definitions/search_steps.rb:20
  And I set did you mean suggester option cirrusSuggConfidence to 1           # features/step_definitions/search_steps.rb:20
  And I api search for grammo                                                 # features/step_definitions/search_steps.rb:30
  Then there is no api suggestion                                             # features/step_definitions/search_steps.rb:452
    expected: nil
         got: "grammy" (RSpec::Expectations::ExpectationNotMetError)
Scenario: Customize min doc freq did you mean suggestions                    # features/did_you_mean_api.feature:78
  When I set did you mean suggester option cirrusSuggMode to popular         # features/step_definitions/search_steps.rb:20
  And I set did you mean suggester option cirrusSuggMinDocFreq to 0.99999999 # features/step_definitions/search_steps.rb:20
  And I api search for noble prize                                           # features/step_definitions/search_steps.rb:30
  Then there is no api suggestion                                            # features/step_definitions/search_steps.rb:452
    expected: nil
         got: "nobel prize" (RSpec::Expectations::ExpectationNotMetError)

The tests appear to be written specifically to exclude the suggestions we are now getting. Fix this if possible.

Event Timeline

Restricted Application added a subscriber: Aklapper.

Scenario: Customize max term freq did you mean suggestions # features/did_you_mean_api.feature:72

Scenario: Customize min doc freq did you mean suggestions # features/did_you_mean_api.feature:78

I'm not sure what to do with this. I thought perhaps the min_doc_freq parameter had changed how it works, but I wrote a small test script and neither 1.7 nor 2.3 does what I expect (they both return "nobel prize"). FWIW, I don't see any test cases in Elasticsearch that deal with min_doc_freq in the phrase suggester.

#!/bin/sh

# Test setup:

curl -s -XDELETE localhost:9200/my_test | jq -c .
curl -s -XPUT localhost:9200/my_test -d '{
  "settings": {
    "number_of_shards": 1
  },
  "mappings": {
    "my_type": {
      "properties": {
        "suggest": {
          "type": "string"
        }
      }
    }
  }
}' | jq -c .
curl -s -XPOST localhost:9200/my_test/my_type -d '{"suggest": "cat pictures"}' | jq -c .
curl -s -XPOST localhost:9200/my_test/my_type -d '{"suggest": "nobel prize"}' | jq -c .
curl -s -XPOST localhost:9200/my_test/my_type -d '{"suggest": "other things"}' | jq -c .
curl -s -XPOST localhost:9200/my_test/my_type -d '{"suggest": "test cases are fun"}' | jq -c .

curl -s -XPOST localhost:9200/my_test/_flush | jq -c .

# Search query:
curl -s -XGET localhost:9200/my_test/my_type/_search -d '{
  "suggest": {
    "text": "noble prize",
    "suggest": {
      "phrase": {
        "field": "suggest",
        "direct_generator": [ {
          "field": "suggest",
          "suggest_mode": "popular",
          "min_doc_freq": 0.999999
        } ]
      }
    }
  }
}' | jq .suggest.suggest
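
For reference, here is a rough sketch of why "nobel prize" might be expected to be filtered out in the script above. It assumes the documented reading of min_doc_freq: a value below 1 is a fraction of the number of documents, and a value of 1 or more is an absolute document count. The helper name min_doc_freq_threshold is hypothetical and not part of Elasticsearch, and the exact rounding Elasticsearch applies internally is not modelled here.

```shell
#!/bin/sh

# Hypothetical helper (not an Elasticsearch API): turn a min_doc_freq setting
# into a document-count threshold, ASSUMING values below 1 are a fraction of
# the total document count and values of 1 or more are absolute counts.
# No rounding is applied; Elasticsearch may round or truncate differently.
min_doc_freq_threshold() {
  awk -v f="$1" -v n="$2" 'BEGIN {
    if (f >= 1) printf "%d\n", f       # absolute document count
    else        printf "%.6f\n", f * n # fraction of the n indexed documents
  }'
}

# The test index holds 4 documents, and "nobel" occurs in only one of them.
min_doc_freq_threshold 0.999999 4
```

Under that assumption the threshold is close to 4 documents, well above the single document containing "nobel", which is why a suggestion was not expected.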

I remember it being a huge pain to get these tests to pass. I suppose they mostly depend on index/shard state; I will check.

Change 287061 had a related patch set uploaded (by DCausse):
Remove phrase suggester tests based on term frequencies

https://gerrit.wikimedia.org/r/287061

Deskana triaged this task as Medium priority.