Analysis of Method 1 Suggestion results
Closed, Resolved (Public)

Assigned To
	TJones

Authored By
	TJones
	Sep 12 2019, 4:50 PM

Description

Gather suggestion output from Elastic-based suggestions and Method 1 suggestions for a collection of data, and analyze the results.

When we did this for M0, we used 2 months of enwiki data to build the model and evaluated the results on 1 month of enwiki data. Something similar would be fine this time, too.

Analysis will include counting how often Elastic-based suggestions are made, how often Method 1 suggestions are made, how often both are made, and a manual review of a sample when both are made to see which does better—which is the same as what we did for M0.
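The counting step could be sketched roughly as below. This is only an illustration: the row layout (query, Elastic suggestion, Method 1 suggestion, with `None` meaning no suggestion), the function name `tally_suggestions`, and the sample rows are all assumptions, not the actual format of the collected data.

```python
from collections import Counter
from typing import Iterable, Optional, Tuple

# Hypothetical row layout: (query, Elastic suggestion, Method 1 suggestion).
Row = Tuple[str, Optional[str], Optional[str]]


def tally_suggestions(rows: Iterable[Row]) -> Counter:
    """Count how often each suggester fires, and how often both do."""
    counts: Counter = Counter()
    for _query, elastic, m1 in rows:
        if elastic and m1:
            counts["both"] += 1
            # Rows where the two differ are the candidates for manual review.
            counts["both_agree" if elastic == m1 else "both_differ"] += 1
        elif elastic:
            counts["elastic_only"] += 1
        elif m1:
            counts["m1_only"] += 1
        else:
            counts["neither"] += 1
    return counts


# Made-up sample rows, for illustration only.
sample_rows = [
    ("query one", "suggestion a", "suggestion a"),
    ("query two", "suggestion b", "suggestion c"),
    ("query three", "suggestion d", None),
    ("query four", None, "suggestion e"),
    ("query five", None, None),
]
print(tally_suggestions(sample_rows))
```

The `both_differ` bucket is the one a manual review sample would be drawn from.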

There is some concern that Method 1 results may be lower quality for shorter queries. If that looks to be a problem, whether because of a high volume of Method 1 suggestions for shorter queries, their lower quality, or both, we may look into shorter queries more carefully.

Related Objects

Status	Assigned	Task
Open	None	T212884 [EPIC] Improve Search Suggestions with NLP (Did You Mean / Glent)
Open	None	T212889 [EPIC-ish][Milestone 1] Implement NLP Search Suggestion Method 1 for 10 languages
Duplicate	None	T235828 [Epic] Enhance search suggestions to allow for easier access to results
Invalid	None	T235830 Glent method 1 (comparison to other users' queries) offline tested, tuned, A/B tested and possibly deployed end of Q2
Resolved	TJones	T232760 Analysis of Method 1 Suggestion results

Event Timeline

TJones renamed this task from Analysis of M1 results to Analysis of M1 Suggestion results. Sep 12 2019, 4:50 PM

TJones renamed this task from Analysis of M1 Suggestion results to Analysis of Method 1 Suggestion results.

TJones created this task.

TJones moved this task from needs triage to elastic / cirrus on the Discovery-Search board.

TJones updated the task description.

Samples are available here: notebook1004:/home/dcausse/phrase_suggester_vs_glent_m1.csv

TJones claimed this task. Oct 11 2019, 7:24 PM

TJones edited projects, added Discovery-Search (Current work); removed Discovery-Search.

I completed my analysis of Method 1, and it performs significantly worse than the current production DYM. I think we should improve Method 1 before considering an A/B test. Full details on MediaWiki.

Summary:

- When Method 1 and production DYM disagreed, Method 1 was better than prod 7% of the time, both were good about 26% of the time, and prod was better 40% of the time.
- Analyzed alone, prod suggestions were poor 45% of the time (which is why we are here!), but Method 1 suggestions were poor 72% of the time; when the two agreed, only 23% of suggestions were poor.

Method 1 Anti-Patterns:

over-emphasis of result counts:
- creating negated queries, like fogus to -ous, which gets 5.9M results.
- changing letters or adding spaces to create a very common word (cf gene to a gene) or duplicated word (rattle battle to battle battle).

overly drastic changes:
- edit distance limits should be per-token, not per string (cf gene to a gene again)
- changing a letter to space should have a higher cost (abbys to a b s)
- changing the first letter of a word/token should have a higher cost (cia assassinations to mi6 assassinations)

using weird stemming edge cases to increase result counts:
- e.g., godness stems to god so it beats goddess; hering stems to here so it replaced herring in red herring
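
To make the per-token point concrete, here is a minimal sketch of a per-token edit distance check. It is not Glent's code: the function names, the length-based thresholds in `allowed_edits`, and the choice to reject any change in token count are all assumptions made for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Plain unit-cost Levenshtein distance (two-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute, or free match
            ))
        prev = curr
    return prev[-1]


def allowed_edits(token: str) -> int:
    """Hypothetical length-scaled limit: shorter tokens tolerate fewer edits."""
    if len(token) <= 2:
        return 0
    if len(token) <= 5:
        return 1
    return 2


def within_per_token_limit(query: str, suggestion: str) -> bool:
    """True if every query token is within its own edit limit."""
    q_tokens, s_tokens = query.split(), suggestion.split()
    # Assumption: a changed token count is rejected outright here; a real
    # system would more likely penalize it than forbid it.
    if len(q_tokens) != len(s_tokens):
        return False
    return all(
        levenshtein(q, s) <= allowed_edits(q)
        for q, s in zip(q_tokens, s_tokens)
    )


print(levenshtein("cia assassinations", "mi6 assassinations"))
print(within_per_token_limit("cia assassinations", "mi6 assassinations"))
print(within_per_token_limit("abbys", "a b s"))
print(within_per_token_limit("red hering", "red herring"))
```

Measured over the whole string, cia assassinations to mi6 assassinations is only two edits in eighteen characters. Measured per token, it is two edits in a three-letter word, which this sketch rejects. Rejecting all token-count changes would also block legitimate word splits, so that rule in particular is a simplification.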

Reinforcing Positive Method 1 Patterns:

- Edit distance cost should be decreased for double-letter to single-letter change (or vice versa)
- Edit distance cost should be decreased for swapped letters, possibly including swapped with a letter in between (levasimole vs levamisole)

I realize that I've assumed that edit distance plays a role in the weighting of suggestions, but I'm not sure that's the case. If not, it probably should be, rather than letting result count reign supreme.
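
As a rough sketch of what the cost adjustments listed above might look like together, here is a weighted edit distance. It is illustrative only and not how Glent computes anything: every cost value, constant name, and function name is an assumption, and the numbers would need tuning against real data.

```python
# All cost values below are hypothetical placeholders.
SWAP_COST = 0.5       # two adjacent letters swapped
GAP_SWAP_COST = 0.75  # two letters swapped around one unchanged letter
DOUBLE_COST = 0.5     # doubled letter <-> single letter
DEFAULT_COST = 1.0    # ordinary insert, delete, or substitute
HEAVY_COST = 2.0      # first letter of a token changed, or a space involved


def indel_cost(s: str, k: int) -> float:
    """Cost of inserting or deleting the character s[k]."""
    if s[k] == " ":
        return HEAVY_COST
    if k > 0 and s[k] == s[k - 1]:
        return DOUBLE_COST  # s[k] repeats its left neighbor
    return DEFAULT_COST


def sub_cost(src: str, i: int, dst: str, j: int) -> float:
    """Cost of replacing src[i] with the different character dst[j]."""
    if src[i] == " " or dst[j] == " ":
        return HEAVY_COST
    if i == 0 or src[i - 1] == " ":
        return HEAVY_COST  # src[i] is the first letter of a token
    return DEFAULT_COST


def weighted_distance(src: str, dst: str) -> float:
    """Edit distance with the adjusted costs above (restricted transpositions)."""
    n, m = len(src), len(dst)
    d = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        d[i][0] = d[i - 1][0] + indel_cost(src, i - 1)
    for j in range(1, m + 1):
        d[0][j] = d[0][j - 1] + indel_cost(dst, j - 1)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if src[i - 1] == dst[j - 1]:
                best = d[i - 1][j - 1]  # characters match, no cost
            else:
                best = d[i - 1][j - 1] + sub_cost(src, i - 1, dst, j - 1)
            best = min(
                best,
                d[i - 1][j] + indel_cost(src, i - 1),  # delete from src
                d[i][j - 1] + indel_cost(dst, j - 1),  # insert from dst
            )
            # Adjacent swap: "ab" -> "ba".
            if (i > 1 and j > 1
                    and src[i - 1] == dst[j - 2] and src[i - 2] == dst[j - 1]):
                best = min(best, d[i - 2][j - 2] + SWAP_COST)
            # Swap around an unchanged middle letter: "sim" -> "mis".
            if (i > 2 and j > 2
                    and src[i - 1] == dst[j - 3] and src[i - 3] == dst[j - 1]
                    and src[i - 2] == dst[j - 2]):
                best = min(best, d[i - 3][j - 3] + GAP_SWAP_COST)
            d[i][j] = best
    return d[n][m]


for pair in [("levasimole", "levamisole"), ("godess", "goddess"),
             ("cia", "mi6"), ("abbys", "a b s")]:
    print(pair, weighted_distance(*pair))
```

Under these placeholder costs, the levasimole/levamisole swap and a doubled-letter fix come out cheaper than a single ordinary edit, while cia to mi6 and abbys to a b s come out more expensive than their plain edit distance of 2. A score like this could then be weighed against result count rather than letting result count decide alone.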

Gehel added a parent task: T235830: Glent method 1 (comparison to other users' queries) offline tested, tuned, A/B tested and possibly deployed end of Q2. Oct 18 2019, 9:22 AM

TJones moved this task from Needs review to Needs Reporting on the Discovery-Search (Current work) board. Oct 23 2019, 3:15 PM

Gehel closed this task as Resolved. Oct 29 2019, 5:51 PM

TJones mentioned this in T238151: Tune Glent Method 1 algorithm. Nov 12 2019, 9:14 PM
