
Testing needed for adding DEFAULTSORT keys to wiki search autocomplete
Open, Stalled, Low, Public

Description

We've added all the necessary code to experiment with DEFAULTSORT data. We now need to build some experimental indices before activating it on production wikis.

We know of some possible drawbacks:

  • bad suggestions when DEFAULTSORT is set to text that does not represent the title
  • too many suggestions may be hidden

A test index should help uncover obvious problems, but replaying queries from search logs, to check whether the result the user chose is now hidden, should show any negative impact more reliably.
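The log-replay check described above could be sketched roughly as follows. This is only an illustration: `LoggedQuery`, `hidden_choices`, the `suggest` callable and the sample data are all hypothetical and do not correspond to any existing tool or log format.

```python
from typing import Callable, Iterable, List, NamedTuple


class LoggedQuery(NamedTuple):
    """One hypothetical search-log entry."""
    prefix: str        # what the user typed
    chosen_title: str  # the suggestion the user picked at the time


def hidden_choices(
    log: Iterable[LoggedQuery],
    suggest: Callable[[str], List[str]],
    limit: int = 10,
) -> List[LoggedQuery]:
    """Return the log entries whose chosen title is no longer among the
    first `limit` suggestions returned by the index under test."""
    hidden = []
    for entry in log:
        if entry.chosen_title not in suggest(entry.prefix)[:limit]:
            hidden.append(entry)
    return hidden


# Toy data and a fake suggester, made up for the example.
sample_log = [
    LoggedQuery("calvin c", "Calvin Coolidge"),
    LoggedQuery("st. mo", "St. Motel"),
]


def fake_suggest(prefix: str) -> List[str]:
    # Stand-in for a call to the experimental completion index.
    if prefix.startswith("calvin"):
        return ["Calvin Coolidge"]
    return ["Greg Stotelmyer"]


lost = hidden_choices(sample_log, fake_suggest)
print(f"{len(lost)} of {len(sample_log)} chosen results are now hidden")
```

In practice `suggest` would query the experimental index, and the share of hidden choices would be compared against the same replay on the current index.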

Recent check-ins (for reference):
Added DEFAULTSORT to search index field data
Added support for FLAG_SOURCE_DATA and defaultsort in completion suggester

Event Timeline

Note on size for enwiki:

with default sort    without
1,073,416,174        938,385,588

That's a 14% increase. I believe this feature is interesting only for Wikipedias (but I could be wrong). We should wait for T144387 to get a better understanding of the current state.
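For reference, the increase can be recomputed from the two enwiki figures in the note above. The variable names are illustrative, and the unit of the figures is not stated in the note.

```python
# Size figures for enwiki as reported in the note (unit not stated there).
with_defaultsort = 1_073_416_174
without_defaultsort = 938_385_588

# Relative increase of the index with DEFAULTSORT data over the one without.
increase = (with_defaultsort - without_defaultsort) / without_defaultsort
print(f"relative increase: {increase:.1%}")
```

This comes out at a little over 14%, in line with the figure quoted.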

Next step on this ticket is to run some relforge tools to measure the difference between the two approaches.
Small demo for enwiki: http://mw-sug-subpages-relforge.wmflabs.org/w/default_sort_demo.html

Preliminary results on a 10k queries sample:

Baseline: ./relevance/queries/normal/results

Metrics:

Query Count: 10001
Zero Results Rate: 23.8%
Poorly Performing Percentage: 36.9%
Top 3 Sorted Results Differ: 19.5%
Top 3 Unsorted Results Differ: 18.9%
Top 5 Sorted Results Differ: 25.3%
Top 5 Unsorted Results Differ: 24.4%
Top 20 Sorted Results Differ: 31.7%
Top 20 Unsorted Results Differ: 30.3%

Delta: ./relevance/queries/defaultsort/results

Metrics:

Query Count: 10001
   Num TotalHits Changed: μ: 0.05; σ: 0.38; median: 0.00; range: [-1, 8]
   Pct TotalHits Changed: μ: 1.8%; σ: 16.5%; median: 0.0%; range: [-10.00%, 400.00%]

Zero Results Rate: 23.6% (-0.2%)
Poorly Performing Percentage: 36.4% (-0.4%)
Top 3 Sorted Results Differ: 19.5%
Top 3 Unsorted Results Differ: 18.9%
Top 5 Sorted Results Differ: 25.3%
Top 5 Unsorted Results Differ: 24.4%
   Num Top 5 Results Changed: μ: 0.46; σ: 0.96; median: 0.00; range: [0, 5]
   Pct Top 5 Results Changed: μ: 10.0%; σ: 23.0%; median: 0.0%; range: [0.00%, 400.00%]

Top 20 Sorted Results Differ: 31.7%
Top 20 Unsorted Results Differ: 30.3%
   Num Top 20 Results Changed: μ: 0.83; σ: 1.65; median: 0.00; range: [0, 10]
   Pct Top 20 Results Changed: μ: 9.6%; σ: 23.1%; median: 0.0%; range: [0.00%, 400.00%]

Default sort had a positive impact on the zero-result rate. Most of the queries that now return a result thanks to default sort seem to be interesting:

query                                defaultsort result
dunbar richards                      Richard Dunbar
rednic mircea                        Mircea Rednic
moscu elena                          Elena Moșuc
CANALE DOMENICO                      Domenico Canale
coolidge calvi                       Calvin Coolidge
okereke stefhan                      Stephanie Okereke Linus
strathmore and perthshire cricket    Strathmore & Perthshire Cricket Union

Not so great new results:

query         result
Cannon Jug    Hughie Cannon
st. motel     Greg Stotelmyer

I'll continue to investigate. The difference between the two strategies is large: 30.3% of the queries returned a different set of results.

This is still in progress with no movement since September. Is there anything else we want to do here?

dcausse changed the task status from Open to Stalled. Jan 31 2017, 7:20 PM

Code is deployed. The problem right now is that we need some testing to make sure these new suggestions won't cause any major problems, and I'm not sure how to do this.
This feature is applied at index time, so we can't make it configurable as a user preference.
Options would be

  • A/B testing (this may require some specific code: build an index with defaultsort enabled in codfw, while the control group runs in eqiad)
  • if some wikis are willing to test this new feature, we could activate it there, wait for feedback, and slowly enable it on other wikis if it proves useful.

In any case, I'd suggest waiting for the es5 upgrade because completion is being refactored.

debt lowered the priority of this task from Medium to Low. Edited Mar 21 2017, 5:26 PM
debt edited projects, added Discovery-Search; removed Discovery-Search (Current work).

We'll need to do some more testing on this by enabling it on a wiki for a week or so and doing the analysis. We need to make sure that this won't cause more problems than it solves.

We've implemented the flag in the code but haven't turned it on, because we're not sure whether it will cause issues rather than fix them.

Moving to the backlog board for now.

Different languages use DEFAULTSORT differently.

Examples:
*Chinese is not a phonetic writing system and there's a tradition on zh.wp to use the first few letters of the phonetic transcription of the title as a sort key. 北京時間 (Pinyin: Beijing Shijian) gets {{DEFAULTSORT:B}}. It would not be useful to use DEFAULTSORT as a keyword for search results because it's an incomplete fragment of the phonetic transcription.
*Japanese uses a diacritic to mark voiced consonants but dictionaries sort topics regardless of the consonant. 動物 (どうぶつ doubutsu) is sorted together with とうふつ toufutsu, but a Japanese speaker with a Japanese keyboard wouldn't search とうふつ expecting to find どうぶつ / 動物.
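The Japanese case can be illustrated with a small sketch that removes the voicing marks from kana, which is roughly what a dictionary-order sort key does. The function name is made up, and this is not how DEFAULTSORT keys are produced on any wiki (editors write them by hand); it only shows why the sort key differs from what a user would type.

```python
import unicodedata

# Combining voiced (dakuten) and semi-voiced (handakuten) sound marks.
VOICING_MARKS = {"\u3099", "\u309A"}


def strip_voicing_marks(text: str) -> str:
    """Remove dakuten/handakuten from kana via canonical decomposition."""
    decomposed = unicodedata.normalize("NFD", text)
    kept = "".join(ch for ch in decomposed if ch not in VOICING_MARKS)
    return unicodedata.normalize("NFC", kept)


title_reading = "どうぶつ"  # reading of 動物
sort_key = strip_voicing_marks(title_reading)
print(sort_key)  # とうふつ
```

A completion index built from such keys would match the prefix とうふつ, which is not something a Japanese speaker looking for 動物 would type.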

My general feeling as a multilingual Wikimedian is that the use of DEFAULTSORT is wiki-specific. In many cases where the DEFAULTSORT keys are useful search keys, a redirect would already have been created. I think it would be useful to have the option to turn on DEFAULTSORT as a search metric, but the specific language or wiki must be consulted before assigning non-zero weight to a particular wiki.