Right after deploying a config change to activate ltr on enwiki we saw errors:
Search backend error during full_text search for 'query text' after 39: i_o_exception: Can't read unknown type [50]
This is certainly caused by the ltr plugin.
Status | Assigned | Task
---|---|---
Invalid | None | T174064 [FY 2017-18 Objective] Implement advanced search methodologies
Resolved | EBernhardson | T161632 [Epic] Improve search by researching and deploying machine learning to re-rank search results
Resolved | EBernhardson | T175772 Deploy MLR as default content search to enwiki
Resolved | dcausse | T175951 Search backend error during full_text search for 'QUERY_SRTING' after 39: i_o_exception: Can't read unknown type [50]
Mentioned in SAL (#wikimedia-operations) [2017-09-14T19:58:27Z] <dcausse> banning elastic1020 to see if T175951 is caused by mixed versions of the ltr plugin
This is definitely due to a mixed version of the ltr plugin being deployed on elastic1020.
The binary format of sltr changed between these versions, so sltr cannot be used while elastic1020 and the rest of the servers run different versions of the plugin.
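A minimal sketch of how a mixed deployment like this could be spotted, assuming the rows come from Elasticsearch's `GET /_cat/plugins?format=json` (which reports `name`, `component` and `version` per node). The component name `ltr`, the version strings, and the node names other than elastic1020 are placeholders for illustration, not values taken from the cluster.

```python
from collections import defaultdict


def plugin_versions(rows, component):
    """Group node names by the version of `component` each node reports."""
    by_version = defaultdict(set)
    for row in rows:
        if row["component"] == component:
            by_version[row["version"]].add(row["name"])
    return {version: sorted(nodes) for version, nodes in by_version.items()}


def is_mixed(rows, component):
    """True when more than one version of `component` is installed."""
    return len(plugin_versions(rows, component)) > 1


# Hypothetical sample shaped like the _cat/plugins JSON output.
# "ltr", "version-a" and "version-b" are assumed placeholder values.
sample = [
    {"name": "elastic1019", "component": "ltr", "version": "version-a"},
    {"name": "elastic1020", "component": "ltr", "version": "version-b"},
    {"name": "elastic1021", "component": "ltr", "version": "version-a"},
]

if is_mixed(sample, "ltr"):
    for version, nodes in sorted(plugin_versions(sample, "ltr").items()):
        print(version, ", ".join(nodes))
```

Running a check like this before enabling a feature that depends on the plugin's wire format would flag the odd node out.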
Restarting more nodes will likely exacerbate the problem, with the error rate peaking in the middle of the rolling restart.
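A rough illustration of why the peak would fall mid-restart, under a simplified model that is an assumption here: a request fails when the node that serializes the query and the node that reads it run different plugin versions, and both nodes are picked uniformly and independently. With a fraction f of nodes restarted, the chance of a mismatch is then 2·f·(1−f). The cluster size of 10 below is hypothetical.

```python
def mismatch_fraction(restarted, total):
    """Chance that two independently chosen nodes run different versions,
    given `restarted` of `total` nodes are on the new plugin version."""
    f = restarted / total
    return 2 * f * (1 - f)


# Hypothetical 10-node cluster: the mismatch rate rises, then falls again.
total_nodes = 10
for restarted in range(total_nodes + 1):
    print(restarted, round(mismatch_fraction(restarted, total_nodes), 2))
```

Under this model the mismatch rate is zero at both ends of the restart and highest when half the nodes have been restarted.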
I think we have a few options:
or simply undeploy the config change to activate ltr.
Mentioned in SAL (#wikimedia-operations) [2017-09-15T07:35:23Z] <gehel> depooling elastic1020 - T175951
Mentioned in SAL (#wikimedia-operations) [2017-09-15T07:38:59Z] <gehel> shutting down and masking elasticsearch on elastic1020 - T175951