
Compare ZRR for query features across other search engines
Closed, Resolved · Public · 10 Estimated Story Points

Description

@JustinOrmont had a neat idea based on T128118 wherein we could take a sample of queries exhibiting particular features (and/or combinations of features) and then compare our ZRR with Google's/Bing's/site:wikipedia.org/etc. to see which high-ZRR features have significantly lower ZRR on other search engines.

This could highlight certain query categories for us and help us prioritize our work on improving ZRR.

Event Timeline

debt triaged this task as Low priority. May 31 2016, 8:26 PM
debt moved this task from Needs triage to Later on the Discovery-Analysis board.
mpopov set the point value for this task to 10. Oct 18 2016, 8:32 PM
mpopov moved this task from Backlog to In progress on the Discovery-Analysis (Current work) board.

Like halfway done with this: https://github.com/wikimedia-research/Discovery-Search-Adhoc-SearchEngineComparison

Should be done by tomorrow or most likely Monday since I can see it taking all weekend to run the queries.

The bots encountered problems performing some searches, so the table below is incomplete, but it should provide a good starting point for discussion. Because not every query was searched successfully, each ZRR value gives the fraction of zero-result SERPs out of the successfully searched queries for that engine. The "Proportion" column is the percentage of full-text searches performed on EnWiki by U.S.-based desktop visitors in September and October 2016 that had the given feature set.
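To make the per-engine calculation concrete, here is a minimal sketch of how a ZRR cell could be computed when some searches fail. The encoding (`None` for a failed search, an integer result count otherwise) and the function name are assumptions for illustration, not taken from the linked repository.

```python
from typing import Optional, Sequence, Tuple


def zero_result_rate(result_counts: Sequence[Optional[int]]) -> Tuple[int, int, float]:
    """Return (zero_result_serps, successful_searches, zrr) for one engine.

    Hypothetical encoding: each entry is the number of results a query
    returned, or None if the bot could not complete the search.
    """
    # Failed searches are excluded from the denominator.
    successful = [n for n in result_counts if n is not None]
    if not successful:
        raise ValueError("no successful searches for this engine")
    zeros = sum(1 for n in successful if n == 0)
    return zeros, len(successful), zeros / len(successful)


# Made-up outcomes: 5 queries attempted, 1 failed, 2 returned nothing.
outcomes = [0, 12, None, 0, 3]
zeros, total, zrr = zero_result_rate(outcomes)
print(f"{zrr:.0%} ({zeros}/{total})")  # prints "50% (2/4)"
```

This is why the denominators in the table differ between engines for the same feature set.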

Note: there is a copy of this table on GitHub that has prettier formatting.

Features | Proportion | Cirrus ZRR | Google ZRR | Yahoo ZRR | Bing ZRR | DDG ZRR
[is simple] | 91.417227% | 12% (12/100) | 1% (1/100) | 3% (3/100) | 2% (2/100) | 2% (2/100)
[has even double quotes] | 5.888506% | 78% (78/100) | 20% (20/100) | 27% (27/100) | 25% (25/100) | 41% (30/74)
[ends with ?, has wildcard] | 0.569473% | 9% (9/100) | 0% (0/100) | 0% (0/100) | 0% (0/99)
[has wildcard] | 0.088912% | 62% (62/100) | 41% (41/100) | 42% (40/95) | 18% (16/90)
[has one double quote, has odd double quotes] | 0.059571% | 45% (45/100) | 6% (6/100) | 8% (8/99) | 7% (7/100) | 21% (21/98)
[has logic inversion (-)] | 0.042736% | 20% (20/100) | 6% (6/100) | 6% (6/100) | 11% (11/97)
[has wildcard, has even double quotes] | 0.009066% | 43% (43/100) | 43% (43/100) | 41% (40/98) | 36% (29/80)
[has logic inversion (-), has even double quotes] | 0.007691% | 46% (46/100) | 26% (26/100) | 25% (25/100) | 43% (40/92)
[ends with ?, has wildcard, has even double quotes] | 0.007300% | 34% (34/100) | 4% (1/25) | 7% (7/100) | 6% (6/100) | 13% (13/100)
[has logic inversion (!)] | 0.004586% | 14% (14/100) | 7% (7/100) | 6% (6/100) | 5% (5/94)
[has odd double quotes] | 0.003391% | 65% (65/100) | 28% (28/99) | 44% (44/99) | 44% (44/99) | 59% (49/83)
[has wildcard, has one double quote, has odd double quotes] | 0.002089% | 65% (65/100) | 35% (35/100) | 51% (49/97) | 45% (45/100) | 64% (54/84)
[ends with ?, has wildcard, has one double quote, has odd double quotes] | 0.001306% | 51% (51/100) | 10% (10/99) | 10% (10/100) | 36% (35/96)
[has logic inversion (-), has wildcard] | 0.000575% | 60% (40/67) | 33% (22/67) | 32% (21/65) | 16% (8/51)
[ends with ?, has logic inversion (-), has wildcard] | 0.000379% | 43% (39/91) | 8% (7/91) | 20% (17/87) | 19% (17/91) | 3% (3/89)
[has quot, has even double quotes] | 0.000302% | 100% (74/74) | 0% (0/1) | 5% (4/74) | 5% (4/74) | 23% (6/26)
[has logic inversion (-), has wildcard, has even double quotes] | 0.000261% | 13% (8/63) | 5% (3/63) | 5% (3/61) | 16% (10/63)
[has wildcard, has odd double quotes] | 0.000220% | 76% (41/54) | 56% (30/54) | 57% (31/54) | 56% (30/54) | 79% (38/48)
[has quot] | 0.000200% | 17% (8/48) | 0% (0/48) | 2% (1/48) | 0% (0/48)
[has logic inversion (!), has wildcard] | 0.000175% | 56% (24/43) | 42% (18/43) | 53% (23/43) | 58% (25/43) | 18% (7/39)
[has logic inversion (!), has even double quotes] | 0.000151% | 57% (21/37) | 27% (10/37) | 24% (9/37) | 24% (8/34)
[has logic inversion (-), has one double quote, has odd double quotes] | 0.000135% | 67% (22/33) | 30% (10/33) | 33% (11/33) | 58% (18/31)
[ends with ?] | 0.000073% | 84% (16/19) | 100% (19/19) | 100% (19/19) | 100% (12/12)
[has logic inversion (!), has one double quote, has odd double quotes] | 0.000073% | 50% (9/18) | 12% (2/17) | 28% (5/18) | 24% (4/17) | 53% (9/17)
[ends with ?, has wildcard, has odd double quotes] | 0.000069% | 94% (16/17) | 24% (4/17) | 24% (4/17) | 24% (4/17) | 82% (14/17)
[has logic inversion (-), has odd double quotes] | 0.000057% | 57% (8/14) | 21% (3/14) | 36% (5/14) | 36% (5/14) | 42% (5/12)
[ends with ?, has logic inversion (!), has wildcard] | 0.000020% | 20% (1/5) | 20% (1/5) | 20% (1/5) | 20% (1/5) | 0% (0/5)
[has logic inversion (!), has wildcard, has one double quote, has odd double quotes] | 0.000020% | 80% (4/5) | 20% (1/5) | 60% (3/5) | 60% (3/5) | 0% (0/3)
[has logic inversion (!), has wildcard, has even double quotes] | 0.000016% | 75% (3/4) | 75% (3/4) | 75% (3/4) | 75% (3/4) | 50% (2/4)
[ends with ?, has logic inversion (-), has wildcard, has even double quotes] | 0.000012% | 67% (2/3) | 33% (1/3) | 33% (1/3) | 67% (2/3)
[ends with ?, has logic inversion (-), has wildcard, has one double quote, has odd double quotes] | 0.000012% | 33% (1/3) | 33% (1/3) | 0% (0/3) | 0% (0/3) | 67% (2/3)
[has logic inversion (-), has wildcard, has one double quote, has odd double quotes] | 0.000012% | 33% (1/3) | 33% (1/3) | 33% (1/3) | 67% (2/3)
[has logic inversion (!), has odd double quotes] | 0.000012% | 67% (2/3) | 67% (2/3) | 67% (2/3) | 33% (1/3) | 67% (2/3)
[ends with ?, has wildcard, has quot] | 0.000004% | 100% (1/1) | 100% (1/1) | 100% (1/1) | 100% (1/1) | 0% (0/1)
[has logic inversion (-), has logic inversion (!), has even double quotes] | 0.000004% | 100% (1/1) | 100% (1/1) | 100% (1/1)
[has logic inversion (-), has wildcard, has odd double quotes] | 0.000004% | 100% (1/1) | 100% (1/1) | 100% (1/1) | 100% (1/1) | 100% (1/1)
[has wildcard, has quot] | 0.000004% | 100% (1/1) | 100% (1/1) | 100% (1/1) | 100% (1/1)

Taking Proportion * ZRR as a measure of impact for each feature set, it looks like double quotes are the feature to look at.

Other than "simple", it's the worst feature for all the search engines, but its impact is only 1.18% (Google) to 2.41% (DDG) for the others, versus 4.59% for Cirrus. Searching again without quotes seems like an obvious approach.
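For anyone who wants to reproduce the impact figures, here is a small sketch of the Proportion * ZRR arithmetic using the values from the [has even double quotes] row. The function name is made up for illustration, and the calculation assumes the ZRR measured on the sampled queries is representative of all queries with that feature set.

```python
# Values copied from the "[has even double quotes]" row of the table.
PROPORTION_PCT = 5.888506  # % of full-text searches with this feature set
ZRR_PCT = {"Cirrus": 78, "Google": 20, "Yahoo": 27, "Bing": 25, "DDG": 41}


def impact(proportion_pct: float, zrr_pct: float) -> float:
    """Estimated % of all searches that have the feature set AND get zero results."""
    return proportion_pct * zrr_pct / 100


for engine, zrr in ZRR_PCT.items():
    print(f"{engine}: {impact(PROPORTION_PCT, zrr):.2f}%")
# Cirrus: 4.59%
# Google: 1.18%
# Yahoo: 1.59%
# Bing: 1.47%
# DDG: 2.41%
```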

In other news, we may want to rethink how we classify ?, since we did change its behavior in Cirrus. \? is the wildcard now, and simple ? is ignored (though the posing of an apparent ?-final question is still a decent predictor of poor search performance).
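One possible way to reclassify question marks along those lines is sketched below. This is only an illustration: the feature names mirror the table, but the function, the regex, and the choice to also count `*` as a wildcard are assumptions, not the classifier actually used for the report.

```python
import re

# Assumption: only an escaped question mark (\?) acts as a wildcard,
# per the Cirrus behavior described above.
ESCAPED_QMARK = re.compile(r"\\\?")


def classify_question_marks(query: str) -> dict:
    """Separate wildcard use from an apparent question-final '?'."""
    q = query.strip()
    has_escaped_qmark = bool(ESCAPED_QMARK.search(q))
    # A trailing "?" that is not part of "\?" looks like a natural-language question.
    ends_with_bare_qmark = q.endswith("?") and not q.endswith("\\?")
    return {
        # Assumption: "*" is also counted as a wildcard.
        "has wildcard": has_escaped_qmark or "*" in q,
        "ends with ?": ends_with_bare_qmark,
    }


print(classify_question_marks("who was napoleon?"))
print(classify_question_marks(r"colo\?r"))
```

Under this sketch a plain question like the first example would be flagged only as "ends with ?", while the second would be flagged only as "has wildcard".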

Cool stuff!

Quite interesting. To add to TJones' comments, have you looked at what the ZRR rate would be if you dropped the double quotes from properly quoted ZRR queries? I.e., simulate the automatic removal of double quotes if no (or few) results were found.

--justin

... have you looked at what the ZRR rate would be if you dropped the double quotes from properly quoted ZRR queries?

Yep. I did that back when Mikhail first did his Zero to Hero report, looking at quotes and question marks. For quotes, replacing them with spaces (to deal with "this"kind of thing) dropped the ZRR by almost half, putting it into DDG territory.

We went with question marks first—the problem was bigger and the solution even easier—but it's no surprise that quotes are the next biggest thing.
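As a rough sketch of the quote fallback discussed above (retry with quotes replaced by spaces when the quoted query finds nothing), something like the following could work. The function names and the `search` callable are hypothetical stand-ins, not CirrusSearch code, and the whitespace collapsing is an extra normalization step added here.

```python
from typing import Callable, List


def strip_double_quotes(query: str) -> str:
    """Replace double quotes with spaces, then collapse runs of whitespace."""
    return " ".join(query.replace('"', " ").split())


def search_with_quote_fallback(
    query: str, search: Callable[[str], List[str]]
) -> List[str]:
    """Run the query as typed; if it returns nothing and contains quotes, retry without them."""
    results = search(query)
    if results or '"' not in query:
        return results
    return search(strip_double_quotes(query))


# Replacing with spaces (rather than deleting) keeps adjacent words apart.
print(strip_double_quotes('"this"kind of thing'))  # prints "this kind of thing"
```

A real implementation would also need to decide how to tell the user that the results come from the rewritten query.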

debt subscribed.

Thanks, all, resolving this ticket!