
Test ElasticSearch suggester to see if it meets user needs better than PrefixSearch
Closed, ResolvedPublic

Description

David wrote a cool document about using the ElasticSearch suggester: https://docs.google.com/document/d/1pn64e9Tb_ZBbR470K9dofr79yINz9Xzs8t6bVUb1yUo/edit#

There's now a demo here: https://suggesty.wmflabs.org/suggest.html

We should run a test to see if this suggester meets user needs better than the prefixsearch in the search box at the top right of pages on wikis.

Event Timeline

Deskana created this task.Jul 13 2015, 9:14 PM
Deskana raised the priority of this task from to Normal.
Deskana updated the task description. (Show Details)
Deskana added a project: Discovery.
Deskana added a subscriber: Deskana.
Restricted Application added a subscriber: Aklapper. · View Herald TranscriptJul 13 2015, 9:14 PM

> meets user needs

What's the status of this metric?

> What's the status of this metric?

Our goal in Q1 2015-16 is to reduce the zero results rate, so that is the objective here.

Open questions:

  1. What wikis do we test this on?
  2. Do we offer users an opt out?
  3. Do we test it with some small percentage of users and compare them to the baseline?
  4. Do we test it 50/50 with existing users (i.e. an A/B test) and use some data collection to figure out which is best?
  5. How quickly can we turn it off if it's not working?

More far-reaching questions:

  1. Why don't we just try this in the Android app? We could create this as an API and try it there.

Here's an example of A/B testing in the Android app: https://gerrit.wikimedia.org/r/#/c/218983/

dcausse added a subscriber: dcausse.EditedJul 14 2015, 9:26 AM

Just to add a bit more context to this task.

The completion suggester described in the document won't work well out of the box, and what I did is a rough prototype.
I think there are several tasks to do before running a live test.

1. Scoring
Suggestions are sorted by a score computed at index time. Today I just computed a very basic composite score from data available in the dump:

  • number of incoming links
  • number of external links
  • page size
  • number of headings
  • number of redirects
  • penalty factor on disambiguation pages

Unfortunately, none of these signals lets us score pages "correctly". Weighting the number of incoming links highest seems to be the best trade-off, but some pages (e.g. dates) have a very high number of incoming links and do not deserve to be ranked so high. I think we should investigate adding a new score component based on pageview statistics (https://dumps.wikimedia.org/other/pagecounts-raw/).
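
The composite score described above could be sketched roughly like this. The weights, the log damping, and the disambiguation penalty value are all illustrative assumptions; the comment only lists the signals, not the exact formula:

```python
import math

# Hypothetical weights -- the task lists the signals but not how they combine.
WEIGHTS = {
    "incoming_links": 10.0,  # best single signal per the comment above
    "external_links": 1.0,
    "page_size": 0.5,
    "headings": 1.0,
    "redirects": 2.0,
}
DISAMBIGUATION_PENALTY = 0.1  # assumed penalty factor for disambiguation pages

def composite_score(doc, pageviews=0):
    """Index-time score: log-damped sum of signals, so pages with huge
    incoming-link counts (e.g. date pages) do not dominate outright."""
    score = sum(w * math.log1p(doc.get(k, 0)) for k, w in WEIGHTS.items())
    # Proposed extra component from the pagecounts-raw dumps.
    score += 5.0 * math.log1p(pageviews)
    if doc.get("is_disambiguation"):
        score *= DISAMBIGUATION_PENALTY
    return score
```

Log damping is one way to keep the link-count signal useful without letting heavily linked pages swamp everything; the actual damping and weighting would need to be tuned empirically.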

2. Analyzers
The prototype was configured with a very basic analysis chain. We should configure it appropriately for each language.
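
For illustration, a "very basic analysis chain" for the completion field might look like the sketch below. The tokenizer and filter names are standard Elasticsearch ones, but the field name, mapping shape, and the idea of passing per-language extra filters are assumptions, not the prototype's actual configuration:

```python
# Sketch of an index body for a completion field with a generic analysis
# chain (standard tokenizer + lowercase + asciifolding). Picking
# language-specific filters is exactly the open work this point describes.
def completion_index_body(extra_filters=()):
    return {
        "settings": {
            "analysis": {
                "analyzer": {
                    "suggest_analyzer": {
                        "type": "custom",
                        "tokenizer": "standard",
                        "filter": ["lowercase", "asciifolding", *extra_filters],
                    }
                }
            }
        },
        "mappings": {
            "page": {
                "properties": {
                    "suggest": {
                        "type": "completion",
                        # Elasticsearch 1.x-era parameter names.
                        "index_analyzer": "suggest_analyzer",
                        "search_analyzer": "suggest_analyzer",
                    }
                }
            }
        },
    }
```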

3. Multiple suggestions
Suggestions are returned according to their computed weight. If you enable fuzzy matching, suggestions that match with a lot of typos won't be scored lower than the ones that match exactly. The solution I tested is to run multiple suggestions at the same time:

  • exact
  • fuzzy with a penalty factor of 0.2
  • exact (stop words filtered) with a penalty factor of 0.3
  • fuzzy (stop words filtered) with a penalty factor of 0.1

The multiple suggestions are aggregated on the backend (the HTML page in the prototype) and the top 10 suggestions are sent back to the client.
There is also some work here: evaluating the best combination of suggestions and penalty factors. Another nice thing would be a kind of "cutoff": when I make a typo, I may want to see only the "best" suggestions and filter out pages with a very low score.

4. Redirects
Some pages have a lot of redirects (https://en.wikipedia.org/wiki/United_States?action=cirrusdump):

  • If I type "Un" I think United States can be in the top 10 suggestions because of its high score.
  • But if I type "Ya" I think it's not fair to suggest "Yankee land" in the top 10 (yankee land redirects to United States).

What I tried is to group the redirects of a single page into "similar" groups and apply a penalty factor if the group is "far" (by Levenshtein distance) from the official page name.
This is very fragile and deserves a lot more work.
Another problem with redirects on existing popular wikis is that they already contain typos:

  • Jurrassic Park redirects to Jurassic Park
  • Airton Senna redirects to Ayrton Senna

Redirects seem to be a tool the community uses today to allow "fuzzy suggestions".
I really don't know how to deal with that; suggesting something with a typo is not ideal... Can we curate this data with Wikidata?
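
The distance-based penalty idea above can be sketched minimally as follows. The distance threshold and penalty value are illustrative assumptions; the actual prototype grouped redirects, which this simplification skips:

```python
def levenshtein(a, b):
    """Plain dynamic-programming edit distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,          # deletion
                           cur[j - 1] + 1,       # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]

def redirect_weight(redirect, canonical, base_weight, max_close=3, penalty=0.2):
    """Redirects close to the canonical title (e.g. case or spelling
    variants) keep the full weight; far ones (e.g. "Yankee land" ->
    "United States") get a penalty. Threshold and penalty are assumed."""
    dist = levenshtein(redirect.lower(), canonical.lower())
    return base_weight if dist <= max_close else base_weight * penalty
```

Under this scheme "Un" would still surface United States at full weight, while "Ya" would only show "Yankee land" if nothing better scores above the penalized weight.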

5. It's not scrollable
Prefix search results are scrollable (I think), but suggestions are not scrollable by nature. Google shows only the top 4, and I think there are good reasons for that.

You can have a look here: https://github.com/nomoa/suggester-prototype/
(Sorry this is very quick and dirty)

Deskana updated the task description. (Show Details)Sep 1 2015, 6:57 PM
Deskana set Security to None.
EBernhardson closed this task as Resolved.Sep 29 2016, 4:10 PM
EBernhardson claimed this task.
EBernhardson added a subscriber: EBernhardson.

We built, tested, and deployed this feature. Seems resolved.