
Bundle Extension:CirrusSearch with MediaWiki core
Closed, Declined | Public | Feature

Description

The default MediaWiki search functionality is pretty meh. WMF wikis use CirrusSearch to improve this.

https://www.mediawiki.org/wiki/Extension:CirrusSearch

The CirrusSearch extension implements searching for MediaWiki using Elasticsearch.

Checklist:

(Note: unticked items might still need to be checked; they haven't been because the OP doesn't really understand what they mean.)

  • Passed discussion or already Wikimedia deployed
  • Passed security review or already Wikimedia deployed
  • Voting CI structure tests
  • Runs MediaWiki-CodeSniffer
  • Runs phan
  • Supports MySQL, SQLite, and Postgres (if there are schema changes)
  • GPL v2 or later compatible license
  • Extension's default configuration provides optimal experience
  • Tested with web installer
  • Any relevant dependencies also bundled

(subtask of T333405)

Event Timeline

CirrusSearch cannot do anything without Elasticsearch. What's worse, this extension is bound to a specific (and EOLed) Elasticsearch version.

Reedy lowered the priority of this task from Medium to Low. Apr 16 2025, 1:40 PM

I can pretty confidently say that this is never going to happen, or at least not any time soon, as it's contrary to the entire concept of MediaWiki's release strategy as a tarball that can be used without non-PHP binaries. See also server-side Java or JS requirements, etc.

So what are the plans to fix the search issues?

The default MediaWiki setup is a bit rubbish, and "the improved WMF version" seems hackish.

Several extensions depend on (or are enhanced by) CirrusSearch, which depends on Elasticsearch. It's a bit of a mess.

In the real world, all the search engines are integrating AI into their search. How do you plan to keep up?

> So what are the plans to fix the search issues?

Those are filed against MediaWiki-Search. MediaWiki relies on the database's full-text search, which is a good basic first approach and good enough for small wikis. More importantly, it works out of the box, with no extra work or additional system to run.

> The default MediaWiki setup is a bit rubbish, and "the improved WMF version" seems hackish.
> Several extensions depend on (or are enhanced by) CirrusSearch, which depends on Elasticsearch. It's a bit of a mess.

The next tier, which is what the Wikimedia Foundation is doing, is to set up an Elasticsearch / OpenSearch cluster and use the Elastica and CirrusSearch extensions on MediaWiki. This needs more work than the default search provided by the database.

So even if CirrusSearch and Elastica were included in the tarball, that is not going to enhance the search out of the box. If one is able to set up and maintain an Elasticsearch / OpenSearch cluster, then I imagine they can surely maintain a MediaWiki installation with a couple more extensions and some additional configuration tuning.

Our setup is certainly complex: it is maintained by a dedicated team, and we operate at an entirely different scale than a personal or small-organization wiki. If you have issues setting it up, I guess you can file tasks against CirrusSearch.

> In the real world, all the search engines are integrating AI into their search. How do you plan to keep up?

In the real world, the whole industry is a decade behind the latest trendy news you can see on Twitter, Hacker News, or whatever cutting-edge source of information you might be used to. In the real world, people are still running production MediaWiki with PHP 5, and they still use vanilla JavaScript or jQuery rather than React or whatever framework happens to be fancy these days.

I don't know the exact plan the Foundation has for search, but the current annual plan mentions search as evolving, and it also mentions AI / machine-generated content. So yeah, for sure AI is known (and used).

Anyway, including the extensions by default is not going to make search nicer out of the box and thus it makes little sense to add CirrusSearch to the tarball.