
Move fine tuning of search configs to mediawiki-config
Closed, Resolved · Public

Description

Right now, every time we tune a search profile for Wikidata (and probably for main search too?), such as EntitySearchProfiles.php, we change Wikibase or CirrusSearch source code. This is not ideal, since the tuning may be specific to Wikimedia sites and not useful to other users of the MediaWiki/Wikibase code.

So I think we should split these configs in two parts:

  1. A basic "good enough" config that a generic Wikibase install could run
  2. A specifically tuned Wikidata config that we use only for Wikidata and that resides in the mediawiki-config repo.

This will allow us to fine-tune the parameters without having to change the main source (it also means these changes can be deployed as config changes, which they are, rather than as code changes).
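To make the proposed split concrete, here is a minimal, language-neutral sketch in Python (the real code is PHP). It only illustrates the layering idea: generic defaults ship with the extension, and a site-specific layer overrides or adds profiles. All profile names, weight keys, and the `resolve_profiles` helper are hypothetical and do not correspond to actual Wikibase or CirrusSearch settings.

```python
from copy import deepcopy

# Hypothetical generic defaults, standing in for what the extension would ship.
DEFAULT_PROFILES = {
    "default": {"boost_incoming_links": 1.0, "boost_sitelinks": 1.0},
}

# Hypothetical site tuning, standing in for what mediawiki-config would hold.
SITE_OVERRIDES = {
    "default": {"boost_sitelinks": 2.5},
    "wikidata_tuned": {"boost_incoming_links": 0.6, "boost_sitelinks": 3.0},
}


def resolve_profiles(defaults, site_overrides):
    """Return a new mapping with site-specific profiles layered over defaults.

    Weights are merged per profile, so a site only has to list the values
    it actually changes. The defaults themselves are never modified.
    """
    resolved = deepcopy(defaults)
    for name, weights in site_overrides.items():
        merged = dict(resolved.get(name, {}))
        merged.update(weights)
        resolved[name] = merged
    return resolved


profiles = resolve_profiles(DEFAULT_PROFILES, SITE_OVERRIDES)
```

With this shape, a generic install that supplies no overrides simply gets the shipped defaults back, while retuning a weight for one site touches only the override layer.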

This is of course based on the assumption that (1) and (2) are different. If it turns out that they are not, and our tuned config is always good for any Wikibase install, then please feel free to decline this task. But we should at least check whether that is the case :)

This may be part of the general profile refactoring @dcausse was planning.

Event Timeline

Restricted Application added a subscriber: Aklapper.

Finding a decent place for profiles has always been a pain, and I could not find anything sane. I'd like to address (improve) this problem by adding a ProfileManager in cirrus.

The goal would be:

  • have sane defaults (generic profiles) provided in cirrus and/or wikibase that do not pollute the config
  • allow easy customizations both by extension developers and wikiadmins (e.g. today extension developers have to munge the config in various places to inject their profiles)
  • tentative: try not to pollute wmf-config with verbose and unreadable configs but only set a reference to the profile you want to use (like we do for ltr)

At a glance I see 3 possible stores we could use to save the profile "blobs".

  • In the code itself for generic profiles: cirrus would provide sane defaults for a "standard wiki", and wikibase would provide sane defaults as well
  • In the cirrus config, using existing config keys provided by cirrus; this would be empty by default
  • In the extension config, the extension could expose config vars to tune (this ticket)
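The goals and stores listed above could be sketched roughly as follows, again in Python for illustration only. This is not the ProfileManager that was eventually written for cirrus: the class shape, the store labels, the profile names, and in particular the precedence rule (stores registered later win) are all assumptions made for the example.

```python
class ProfileManager:
    """Resolve a profile by name from several stores (hypothetical design).

    Stores are registered lowest precedence first, so a profile defined in
    a later store shadows one with the same name in an earlier store.
    """

    def __init__(self):
        self._stores = []  # list of (label, {profile_name: profile})

    def add_store(self, label, profiles):
        self._stores.append((label, dict(profiles)))

    def get(self, name):
        """Return the profile from the highest-precedence store that has it."""
        for _label, profiles in reversed(self._stores):
            if name in profiles:
                return profiles[name]
        raise KeyError(f"unknown profile: {name}")

    def source_of(self, name):
        """Return the label of the store that would supply the profile."""
        for label, profiles in reversed(self._stores):
            if name in profiles:
                return label
        raise KeyError(f"unknown profile: {name}")


manager = ProfileManager()
# 1. Generic profiles shipped in the code itself.
manager.add_store("code", {"default": {"incoming_links": 1.0}})
# 2. The cirrus config store, empty by default.
manager.add_store("cirrus-config", {})
# 3. Profiles exposed through the extension's own config.
manager.add_store("extension-config", {
    "default": {"incoming_links": 0.8},
    "wikibase_tuned": {"incoming_links": 0.6, "sitelinks": 3.0},
})

# The site config then only names the profile it wants to use.
selected_profile = "wikibase_tuned"
weights = manager.get(selected_profile)
```

Keeping the lookup by name is what lets the site config stay small: it carries a reference to a profile, while the verbose weights live in whichever store defines them.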

Then I'd like to explore the possibility of storing profile blobs in the cirrus metastore index. The reason is that these profiles (generally a set of weights) are verbose and tend to pollute the PHP code in wmf-config. Basically I'm tempted to consider them "data" rather than config.

I would also add "be compatible with extension.json" as a goal.

The reason is that these profiles (generally a set of weights) are verbose and tend to pollute the PHP code in wmf-config

I think we could have separate config files in the config repo if the profiles grow big.
OTOH, generally speaking, having too many parameters that need tuning makes things hard to manage. No solutions here yet, just thinking about it...

Change 419363 had a related patch set uploaded (by DCausse; owner: DCausse):
[mediawiki/extensions/Wikibase@master] Make cirrus function chains tunable by config

https://gerrit.wikimedia.org/r/419363

Change 419367 had a related patch set uploaded (by DCausse; owner: DCausse):
[operations/mediawiki-config@master] Add cirrussearch settings for wikibase

https://gerrit.wikimedia.org/r/419367

Change 419363 merged by jenkins-bot:
[mediawiki/extensions/Wikibase@master] Make cirrus function chains tunable by config

https://gerrit.wikimedia.org/r/419363

Change 424303 had a related patch set uploaded (by DCausse; owner: DCausse):
[mediawiki/extensions/Wikibase@master] [elastic] Properly test rescore settings in fulltext search

https://gerrit.wikimedia.org/r/424303

Change 424349 had a related patch set uploaded (by DCausse; owner: DCausse):
[mediawiki/extensions/Wikibase@master] [elastic] Drop rescore function hook

https://gerrit.wikimedia.org/r/424349

Change 424303 merged by jenkins-bot:
[mediawiki/extensions/Wikibase@master] [elastic] Properly test rescore settings in fulltext search

https://gerrit.wikimedia.org/r/424303

Change 424349 merged by jenkins-bot:
[mediawiki/extensions/Wikibase@master] [elastic] Drop rescore function hook

https://gerrit.wikimedia.org/r/424349

It looks like all the supporting code has shipped and been deployed, so we only need to deploy the config patch now? https://gerrit.wikimedia.org/r/419367

I understand we need to split that patch in two according to the new deployment rules?

Yes, I'll take care of that soon.

Change 441056 had a related patch set uploaded (by DCausse; owner: DCausse):
[operations/mediawiki-config@master] Add cirrussearch settings for wikibase (2/3)

https://gerrit.wikimedia.org/r/441056

Change 441057 had a related patch set uploaded (by DCausse; owner: DCausse):
[operations/mediawiki-config@master] Add cirrussearch settings for wikibase (3/3)

https://gerrit.wikimedia.org/r/441057

Change 419367 merged by jenkins-bot:
[operations/mediawiki-config@master] Add cirrussearch settings for wikibase (1/3)

https://gerrit.wikimedia.org/r/419367

Change 441056 merged by jenkins-bot:
[operations/mediawiki-config@master] Add cirrussearch settings for wikibase (2/3)

https://gerrit.wikimedia.org/r/441056

Moving back to in progress, as the second patch generated some warnings on the test servers:

[Wed Jun 27 13:49:51 2018] [hphp] [482:7f0a5afff700:37030:000001] [] \nWarning: Invalid argument supplied for foreach() in /srv/mediawiki/php-1.32.0-wmf.8/extensions/CirrusSearch/includes/Search/Rescore/TermBoostScoreBuilder.php on line 32

Change 442317 had a related patch set uploaded (by DCausse; owner: DCausse):
[operations/mediawiki-config@master] Add cirrussearch settings for wikibase (1.5/3)

https://gerrit.wikimedia.org/r/442317

Change 442318 had a related patch set uploaded (by DCausse; owner: DCausse):
[operations/mediawiki-config@master] Add cirrussearch settings for wikibase (2/3) (take 2)

https://gerrit.wikimedia.org/r/442318

Change 442317 merged by jenkins-bot:
[operations/mediawiki-config@master] Add cirrussearch settings for wikibase (1.5/3)

https://gerrit.wikimedia.org/r/442317

Change 442318 merged by jenkins-bot:
[operations/mediawiki-config@master] Add cirrussearch settings for wikibase (2/3) (take 2)

https://gerrit.wikimedia.org/r/442318

Change 441057 merged by jenkins-bot:
[operations/mediawiki-config@master] Add cirrussearch settings for wikibase (3/3)

https://gerrit.wikimedia.org/r/441057