Latest CirrusSearch is incompatible with ES7.10 and the corresponding WMF extra plugin
Closed, Resolved · Public · 2 Estimated Story Points · BUG REPORT

Description

Steps to replicate the issue (include links if applicable):

  • Set up ElasticSearch 7.10 with the Wikimedia extra plugin (7.10.2-wmf12).
  • Install CirrusSearch from the master or REL1_45 branch.
  • Enable the regex capabilities via wgCirrusSearchWikimediaExtraPlugin (probably not needed, but I am unable to check at the moment).
  • Initialise search indices using the UpdateSearchIndexConfig maintenance script.

What happens?:
An error occurs due to the trigram_anchored analyser requiring the add_regex_start_end_anchors char filter.
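The failure mode can be illustrated with a minimal Python sketch. It assumes the usual index analysis settings layout, where an analyzer lists char filters by name and each name must either be defined in the same settings or be provided by the server or one of its plugins. The analyzer bodies below (tokenizer and filter names) are hypothetical and are not CirrusSearch's actual configuration; only the names trigram_anchored and add_regex_start_end_anchors come from this report.

```python
from typing import Dict, List, Set


def missing_char_filters(analysis: dict, provided: Set[str]) -> Dict[str, List[str]]:
    """Return analyzer name -> char filters that nothing defines.

    `provided` stands in for the char filters the server and its
    installed plugins make available (an assumption of this sketch).
    """
    defined = set(analysis.get("char_filter", {}))
    missing: Dict[str, List[str]] = {}
    for name, analyzer in analysis.get("analyzer", {}).items():
        unknown = [
            cf
            for cf in analyzer.get("char_filter", [])
            if cf not in defined and cf not in provided
        ]
        if unknown:
            missing[name] = unknown
    return missing


# Hypothetical analysis settings; only the two names from the report are real.
ANALYSIS = {
    "analyzer": {
        "trigram": {"type": "custom", "tokenizer": "trigram", "filter": ["lowercase"]},
        "trigram_anchored": {
            "type": "custom",
            "tokenizer": "trigram",
            "char_filter": ["add_regex_start_end_anchors"],
            "filter": ["lowercase"],
        },
    }
}

# A server whose plugin lacks the anchoring char filter rejects the analyzer.
print(missing_char_filters(ANALYSIS, provided={"html_strip"}))
```

With a plugin that does provide add_regex_start_end_anchors, the same call returns an empty mapping, which corresponds to the index being created.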

What should have happened instead?:
The indices are created.

Software version (on Special:Version page; skip for WMF-hosted wikis like Wikipedia):
MediaWiki 1.45 / master, ElasticSearch 7.10.

Other information (browser name/version, screenshots, etc.):
The ES/OS version check in UpdateOneSearchIndexConfig still passes with ElasticSearch 7.10, but Ibe8b0d9962ff1d70bba1000e5b8f1152c271d017 broke compatibility by unconditionally adding a dependency on the current OpenSearch plugin.
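As a rough sketch of what an engine-aware check could look like (this is not how UpdateOneSearchIndexConfig is implemented), the Python below tells the two engines apart from the JSON returned by the server's root endpoint. It relies on OpenSearch reporting `"distribution": "opensearch"` inside the `version` object, which Elasticsearch 7.10 does not include. The function names and the rule that anchored regex needs OpenSearch are assumptions drawn from this report.

```python
from typing import Tuple


def detect_engine(info: dict) -> Tuple[str, Tuple[int, ...]]:
    """Return (engine, (major, minor)) from a root-endpoint response."""
    version = info.get("version", {})
    # OpenSearch adds a "distribution" field; Elasticsearch 7.10 has none.
    if version.get("distribution") == "opensearch":
        engine = "opensearch"
    else:
        engine = "elasticsearch"
    number = tuple(int(part) for part in version.get("number", "0.0").split(".")[:2])
    return engine, number


def supports_anchored_regex(info: dict) -> bool:
    """Assumption: only the OpenSearch plugin ships the anchoring char filter."""
    engine, _ = detect_engine(info)
    return engine == "opensearch"


# Trimmed, illustrative root-endpoint responses.
es_info = {"version": {"number": "7.10.2"}}
os_info = {"version": {"distribution": "opensearch", "number": "1.3.20"}}

print(detect_engine(es_info), supports_anchored_regex(es_info))
print(detect_engine(os_info), supports_anchored_regex(os_info))
```

A check that only confirms the server is one of the accepted versions says nothing about which plugin features are present, which is the gap described above.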

Event Timeline

Restricted Application added a subscriber: Aklapper.
pfischer subscribed.

We'll update the install instructions and docs (READMEs, mediawiki.org, wikitech, …) accordingly.

pfischer set the point value for this task to 2. · Nov 3 2025, 4:50 PM

Change #1201789 had a related patch set uploaded (by Ebernhardson; author: Ebernhardson):

[mediawiki/extensions/CirrusSearch@master] Clarify supported elastic/open search versions

https://gerrit.wikimedia.org/r/1201789

Change #1201790 had a related patch set uploaded (by Ebernhardson; author: Ebernhardson):

[mediawiki/extensions/CirrusSearch@master] regex: Disable anchored regex support on elasticsearch

https://gerrit.wikimedia.org/r/1201790

EBernhardson added subscribers: dcausse, EBernhardson.

Talked this over with @dcausse. We agreed we should continue to support Elasticsearch in REL1_45. We are adding a workaround for this regex-support bug, and will add warnings that are displayed whenever scripts that manage search indexes run against indexes hosted on an Elasticsearch instance. The intention is to support only OpenSearch in REL1_46 and beyond.

Change #1201789 merged by jenkins-bot:

[mediawiki/extensions/CirrusSearch@master] Clarify supported elastic/open search versions

https://gerrit.wikimedia.org/r/1201789

Change #1202739 had a related patch set uploaded (by DCausse; author: Ebernhardson):

[mediawiki/extensions/CirrusSearch@REL1_45] Clarify supported elastic/open search versions

https://gerrit.wikimedia.org/r/1202739

Change #1202741 had a related patch set uploaded (by DCausse; author: Ebernhardson):

[mediawiki/extensions/CirrusSearch@REL1_45] regex: Disable anchored regex support on elasticsearch

https://gerrit.wikimedia.org/r/1202741

Change #1202739 merged by jenkins-bot:

[mediawiki/extensions/CirrusSearch@REL1_45] Clarify supported elastic/open search versions

https://gerrit.wikimedia.org/r/1202739

Change #1202741 merged by jenkins-bot:

[mediawiki/extensions/CirrusSearch@REL1_45] regex: Disable anchored regex support on elasticsearch

https://gerrit.wikimedia.org/r/1202741

Change #1201790 merged by jenkins-bot:

[mediawiki/extensions/CirrusSearch@master] regex: Disable anchored regex support on elasticsearch

https://gerrit.wikimedia.org/r/1201790

There is still a remaining problem with the query-time highlighter. The highlighter on Elasticsearch doesn't support the lucene_anchored flavor, so we would need to always request lucene on Elasticsearch. We are really trying to avoid extra query-time round trips to the server to determine the version information, though, and are still pondering an appropriate solution. It might be that the only reasonable way is to remove anchored trigram support from REL1_45.
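To make the constraint concrete, here is a small hypothetical Python sketch of the flavor choice. The flavor names lucene and lucene_anchored come from the comment above; the function name, its parameters, and the idea that the engine is already known are assumptions. The sketch deliberately leaves open how the engine would be learned at query time, since that is the unresolved part.

```python
def regex_flavor(engine: str, anchored_requested: bool) -> str:
    """Pick the highlighter regex flavor for a known engine.

    Hypothetical helper: `engine` is "opensearch" or "elasticsearch".
    """
    if anchored_requested and engine == "opensearch":
        return "lucene_anchored"
    # Elasticsearch's highlighter does not know the anchored flavor,
    # so everything else falls back to plain lucene.
    return "lucene"


for engine in ("opensearch", "elasticsearch"):
    print(engine, regex_flavor(engine, anchored_requested=True))
```

The function itself is trivial; the cost is in supplying `engine` without an extra request to the server for every query.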

It might be the only reasonable way is to remove anchored trigram support from REL1_45

Yes, I think it's the easiest approach. This feature is very new, so it's not something we could possibly break on existing installs.

Change #1204651 had a related patch set uploaded (by Ebernhardson; author: Ebernhardson):

[mediawiki/extensions/CirrusSearch@REL1_45] Revert "regex: Disable anchored regex support on elasticsearch"

https://gerrit.wikimedia.org/r/1204651

Change #1204652 had a related patch set uploaded (by Ebernhardson; author: Ebernhardson):

[mediawiki/extensions/CirrusSearch@REL1_45] Revert "regex: Support extended syntax"

https://gerrit.wikimedia.org/r/1204652

Change #1204651 merged by jenkins-bot:

[mediawiki/extensions/CirrusSearch@REL1_45] Revert "regex: Disable anchored regex support on elasticsearch"

https://gerrit.wikimedia.org/r/1204651

Change #1204652 merged by jenkins-bot:

[mediawiki/extensions/CirrusSearch@REL1_45] Revert "regex: Support extended syntax"

https://gerrit.wikimedia.org/r/1204652

REL1_45 now appears to support ElasticSearch again.