
Document how to replace or update "text", "plain" and other analyzers
Open, Low, Public


User story: As a non-WMF user of MediaWiki full-text search, I want to be able to configure custom analysis chains that are more appropriate for my use case.

This issue came up in a discussion with @Svrl on the Cirrus help talk page.

For example, setting $wgLanguageCode = 'cs'; only enables the same analysis chain that is used on cswiki. If we change that analysis on our end, it also changes for external users when they upgrade MediaWiki. If you want something different (such as the Czech stemmer plus ICU folding), there is no easy way to do it; it may be possible with a lot of hacking and manual maintenance, but that's sub-optimal.

@dcausse & @TJones discussed this, and @dcausse found a way to inject config that updates or replaces a language-specific configuration. An example config for Czech is in P13907. This should be documented in our on-wiki docs.
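As a rough illustration of the kind of override described above, here is a minimal LocalSettings.php sketch. It is not the contents of P13907. It assumes that CirrusSearch runs a CirrusSearchAnalysisConfig hook with the analysis settings array passed by reference, and that the Elasticsearch analysis-icu plugin is installed so that icu_folding is available. The filter names custom_czech_stop and custom_czech_stemmer are hypothetical and chosen only for this example.

```php
<?php
// LocalSettings.php fragment: a sketch only, not the paste referenced in this task.
$wgLanguageCode = 'cs';

// Assumption: CirrusSearch calls the CirrusSearchAnalysisConfig hook with the
// analysis settings array passed by reference, so a handler can edit it in place.
$wgHooks['CirrusSearchAnalysisConfig'][] = static function ( array &$config ) {
	// Hypothetical filter names, defined here so the analyzer below can use them.
	$config['filter']['custom_czech_stop'] = [
		'type' => 'stop',
		'stopwords' => '_czech_',
	];
	$config['filter']['custom_czech_stemmer'] = [
		'type' => 'stemmer',
		'language' => 'czech',
	];

	// Replace the "text" analyzer: lowercase, drop Czech stop words,
	// apply the Czech stemmer, then ICU folding (needs analysis-icu).
	$config['analyzer']['text'] = [
		'type' => 'custom',
		'tokenizer' => 'standard',
		'filter' => [
			'lowercase',
			'custom_czech_stop',
			'custom_czech_stemmer',
			'icu_folding',
		],
	];

	return true;
};
```

Changes to analysis settings generally take effect only after the search index is rebuilt, so a reindex would be needed after adding or editing a handler like this.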

Acceptance Criteria:

  • Update the appropriate on-wiki documentation page(s) with the general method for doing this and at least one specific example. The update should be reviewed by another search developer.

Event Timeline

Gehel triaged this task as Low priority. Nov 23 2020, 4:22 PM
Gehel moved this task from needs triage to making others happy on the Discovery-Search board.
TJones renamed this task from "Investigate adding a simple hook to replace "text" and "plain" analyzers" to "Document how to replace or update "text", "plain" and other analyzers". Jan 25 2021, 5:45 PM
TJones updated the task description.

Since David found a way to hook into the system and change or replace analyzers, and that seems to meet the spirit of the original ticket, I've updated this to a documentation task.