Properly handle new elasticsearch offset limit
Closed, Resolved (Public)

Description

Since we upgraded to Elasticsearch 2.3.3, we have started to see new errors in the CirrusSearch logs:

2016-06-03 12:30:45 [] mw1245 jawiki 1.28.0-wmf.4 CirrusSearch WARNING: Search backend error during full_text search for 'query text' after 97: query_phase_execution_exception: Result window is too large, from + size must be less than or equal to: [10000] but was [90360]. See the scroll api for a more efficient way to request large data sets. This limit can be set by changing the [index.max_result_window] index level parameter. {"queryType":"full_text","query":"query text","limit":20,"suggestion":null,"took":97,"message":"query_phase_execution_exception: Result window is too large, from + size must be less than or equal to: [10000] but was [90360]. See the scroll api for a more efficient way to request large data sets. This limit can be set by changing the [index.max_result_window] index level parameter."}

We should block such queries in MediaWiki rather than send them to Elasticsearch.

Event Timeline

Change 295310 had a related patch set uploaded (by EBernhardson):
Adjust Searcher maximum result depth

https://gerrit.wikimedia.org/r/295310

It looks like Elasticsearch 2.x added a new limit on from + size that defaults to 10,000. Until now we have been using our own limit of 100,000 in CirrusSearch. We could set index.max_result_window on all indices to keep the 100,000 item limit, but 10k already seems plenty deep for any reasonable use case. As such, I've pushed a patch that lowers the limit in Cirrus to match the 10k set by Elasticsearch.
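
For illustration only, the kind of check being discussed can be sketched as below. This is not the CirrusSearch patch (which is PHP); the function name clamp_result_window and the constant MAX_RESULT_WINDOW are hypothetical, and the sketch assumes the 10,000 default reported in the error message above.

```python
from typing import Tuple

# Assumed to match the Elasticsearch default reported in the log message
# (index.max_result_window = 10000). Hypothetical constant name.
MAX_RESULT_WINDOW = 10_000


def clamp_result_window(
    offset: int, limit: int, max_window: int = MAX_RESULT_WINDOW
) -> Tuple[int, int]:
    """Return (offset, limit) adjusted so that offset + limit <= max_window.

    A returned limit of 0 means the request lies entirely beyond the
    window, so the caller can skip the backend query altogether.
    """
    offset = max(offset, 0)
    limit = max(limit, 0)
    if offset >= max_window:
        # Nothing inside the window can be returned from this offset.
        return max_window, 0
    # Trim the page so its last hit does not cross the window boundary.
    return offset, min(limit, max_window - offset)


if __name__ == "__main__":
    # The failing request from the log: from + size = 90360 with limit 20.
    print(clamp_result_window(90340, 20))
    # A page that straddles the boundary gets trimmed.
    print(clamp_result_window(9990, 20))
```

With this approach a request like the one in the log never reaches the backend, while pages that only partly cross the boundary are shortened instead of rejected. Whether to trim or to return an error for such requests is a design choice this sketch does not settle.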

Change 295310 merged by jenkins-bot:
Adjust Searcher maximum result depth

https://gerrit.wikimedia.org/r/295310