Update Elastica library to 5.0.0 and get CirrusSearch working with it
Closed, ResolvedPublic

Assigned To

Authored By

	EBernhardson
	Jan 18 2017, 9:53 PM

Description

As part of the 5.x upgrade we should update our client library to the latest release, currently 5.0.0, as well. Work through the various problems and get cindy passing the browser test suite with this new library version.

Details

Subject	Repo	Branch	Lines +/-
Update search mappings for elasticsearch 5.x	mediawiki/extensions/GeoData	es5	+1 -13
Update browsertest search profile for es5	mediawiki/extensions/CirrusSearch	es5	+4 K -4 K
Mapping updates for ES 5.x	mediawiki/extensions/CirrusSearch	es5	+48 -26

Related Objects

Status	Assigned	Task
Resolved	debt	T151324 [epic] System level upgrade for cirrus / elasticsearch
Resolved	• Deskana	T154501 [Epic, Q3 Goal] Upgrade search systems to Elasticsearch 5
Resolved	EBernhardson	T155671 Update Elastica library to 5.0.0 and get CirrusSearch working with it

Event Timeline

EBernhardson created this task. Jan 18 2017, 9:53 PM

EBernhardson moved this task from Incoming to not in use - please delete on the Discovery-Search (Current work) board.

Change 333128 had a related patch set uploaded (by EBernhardson):
Mapping updates for ES 5.x

https://gerrit.wikimedia.org/r/333128

Change 333129 had a related patch set uploaded (by EBernhardson):
Update search mappings for elasticsearch 5.x

https://gerrit.wikimedia.org/r/333129

It took a bit of reconfiguring my local vagrant (a new instance for es5) to get it happy, but it seems all tests not tagged @expect_failure are passing now except one. The failing one is a relevancy test that was pretty flaky the last time we upgraded es versions as well. To get an equal comparison I dumped cirrustestwiki_content from cirrus-browser-bot and imported it locally (otherwise docFreqs and such differ). Basically compare:

es2: http://cirrustest-cirrus-browser-bot.wmflabs.org/w/api.php?action=query&format=json&list=search&srsearch=Relevancyclosetest+Foo&srqiprofile=classic_noboostlinks&cirrusDumpResult&cirrusExplain=pretty
es5: http://cirrustest.wiki.local.wmftest.net:8080/w/api.php?action=query&format=json&list=search&srsearch=Relevancyclosetest+Foo&srqiprofile=classic_noboostlinks&cirrusDumpResult&cirrusExplain=pretty
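
For anyone repeating the comparison with other queries, here is a minimal sketch of how the two explain URLs can be built. The hosts and parameters are copied from the links above; the helper name `explain_url` and the constant names are made up for illustration, and the sketch only builds the URLs without fetching them.

```python
from urllib.parse import urlencode

# API endpoints taken from the two links above.
ES2_API = "http://cirrustest-cirrus-browser-bot.wmflabs.org/w/api.php"
ES5_API = "http://cirrustest.wiki.local.wmftest.net:8080/w/api.php"


def explain_url(api_base, query, profile="classic_noboostlinks"):
    """Build a search URL that dumps the result with a pretty-printed explain.

    Hypothetical helper: it only reproduces the parameter layout of the
    links in this task.
    """
    params = [
        ("action", "query"),
        ("format", "json"),
        ("list", "search"),
        ("srsearch", query),
        ("srqiprofile", profile),
    ]
    # cirrusDumpResult appears as a bare flag in the links above,
    # so it is appended without a value.
    return api_base + "?" + urlencode(params) + "&cirrusDumpResult&cirrusExplain=pretty"


if __name__ == "__main__":
    for base in (ES2_API, ES5_API):
        print(explain_url(base, "Relevancyclosetest Foo"))
```

Diffing the two explain outputs is only meaningful when both clusters index the same documents, which is why the content index was copied over first.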

From es2 the pages are ordered: 'Relevancytest foo' (691.8987), 'Relevancyclosetest Foô' (254.56299), 'Foo Relevancyclosetest' (233.97305)
For es5 the last two flip: 'Relevancytest foo' (853.5483), 'Foo Relevancyclosetest' (509.2177), 'Relevancyclosetest Foô' (502.59995)

Note that all of these have a *10 applied for language boost, so the difference between the last two is fairly small in both cases. I'm not too sure about the large change in scores, but it's likely because the 0.5 coord factor disappeared, along with a difference in the content of the two dbs.

Looking over the two explains, the scores break down as roughly:

'Relevancyclosetest Foô'
es2: 0.5 (coord) * (24.794167 (title) + 4.7421784 (suggest) + 4.4098306 (text)) + 8.483211 (phrase) = 25.46, then * 10 (lang) = 254.6
es5: 25.698328 (title) + 10.148193 (suggest) + 4.9270577 (text) + 9.486414 (phrase) = 50.26, then * 10 (lang) = 502.6
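
As a sanity check, the breakdown above can be recomputed with a few lines of Python. This is only a sketch of the arithmetic as read from the two explains, not how Elasticsearch computes scores internally; the function names are made up for illustration.

```python
LANG_BOOST = 10  # the *10 language boost applied to every result


def es2_score(title, suggest, text, phrase, coord=0.5):
    """es2 shape: coord applies to the summed main-query clauses, not to phrase."""
    return (coord * (title + suggest + text) + phrase) * LANG_BOOST


def es5_score(title, suggest, text, phrase):
    """es5 shape: no coord factor, all parts are simply summed."""
    return (title + suggest + text + phrase) * LANG_BOOST


# Values for 'Relevancyclosetest Foô' copied from the explains.
# Note the es2 suggest value (4.7421784) is the already-halved one.
es2 = es2_score(24.794167, 4.7421784, 4.4098306, 8.483211)
es5 = es5_score(25.698328, 10.148193, 4.9270577, 9.486414)

if __name__ == "__main__":
    print(f"es2: {es2:.3f}")
    print(f"es5: {es5:.3f}")
```

Both results agree with the reported scores (254.56299 and 502.59995) to within rounding, so the breakdown accounts for the full score on each side.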

On es2, suggest had the coord factor applied twice: the original suggest score was 9.484357 and it was cut in half to 4.74, then after summing the parts it was cut in half again. Getting the exact same scoring will be difficult, because the coordination factor isn't a static value we can just multiply our weights by. We should probably consider that:

Suggest weight should probably be lowered, to account for it no longer having the coordination factor applied.
Phrase rescore may need a higher weight, since the other portion of the query previously had a coordination factor applied and no longer does.
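
To put rough numbers on those two points, the sketch below uses only the values quoted above for 'Relevancyclosetest Foô'. It is a back-of-the-envelope calculation for this one document, not a proposal for the actual weights; the variable names are made up for illustration.

```python
COORD = 0.5

# es2: raw suggest score before any coord factor was applied.
raw_suggest_es2 = 9.484357
# Halved once inside the clause, then again after the parts were summed.
effective_suggest_es2 = raw_suggest_es2 * COORD * COORD

# es5: suggest contributes at full weight.
suggest_es5 = 10.148193
suggest_ratio = suggest_es5 / effective_suggest_es2

# Share of the pre-language-boost score that comes from the phrase rescore.
pre_boost_es2 = COORD * (24.794167 + 4.7421784 + 4.4098306) + 8.483211
pre_boost_es5 = 25.698328 + 10.148193 + 4.9270577 + 9.486414
phrase_share_es2 = 8.483211 / pre_boost_es2
phrase_share_es5 = 9.486414 / pre_boost_es5

if __name__ == "__main__":
    print(f"effective es2 suggest: {effective_suggest_es2:.4f}")
    print(f"es5 / es2 suggest contribution: {suggest_ratio:.2f}x")
    print(f"phrase share es2: {phrase_share_es2:.1%}, es5: {phrase_share_es5:.1%}")
```

For this document suggest contributes roughly 4.3 times as much on es5 as it effectively did on es2, while the phrase rescore drops from about a third of the pre-boost score to under a fifth. Since the coord factor varies per document, these ratios are only indicative.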

I'm not certain of the exact values, though; that will probably need some relforge testing (which means getting es 5.x onto the relforge cluster first).

Change 333128 merged by DCausse:
Mapping updates for ES 5.x

https://gerrit.wikimedia.org/r/333128

Change 333988 had a related patch set uploaded (by EBernhardson):
Update browsertest search profile for es5

https://gerrit.wikimedia.org/r/333988

EBernhardson moved this task from not in use - please delete to Needs review on the Discovery-Search (Current work) board. Jan 24 2017, 9:54 PM

Change 333988 merged by DCausse:
Update browsertest search profile for es5

https://gerrit.wikimedia.org/r/333988

one last patch and we can move this to done: https://gerrit.wikimedia.org/r/#/c/333129/
This is now happy since the other patches to es5 branches have merged

Change 333129 merged by jenkins-bot:
Update search mappings for elasticsearch 5.x

https://gerrit.wikimedia.org/r/333129

EBernhardson moved this task from Needs review to Needs Reporting on the Discovery-Search (Current work) board. Feb 1 2017, 4:03 PM

• Deskana closed this task as Resolved. Feb 3 2017, 4:50 PM
