Request for new search profile for Wikidata that boosts Items for languages
Open, Needs Triage | Public

Description

Problem:
The Wikidata Team is redoing the Special:NewLexeme page. One improvement we want to make is boosting languages in the selector where editors indicate the language of the Lexeme they are creating (see T298140). The language selector is currently a normal entity selector that searches through all Items in Wikidata. We'd like a new search profile that boosts Items representing languages, to make selecting a language easier.

Screenshots:
From the current Special:NewLexeme page:

image.png (578×781 px, 48 KB)

Acceptance criteria:

  • a search profile is available that the Wikidata Team can use in the new Special:NewLexeme page that boosts languages

Ideas for how to determine which Items to boost:

Notes:

  • We still want Items not representing languages to be included. They should just be ranked lower.
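
To make the "boost, don't filter" note concrete, here is a minimal sketch in Python (the production code is PHP plus Elasticsearch rescoring, so this only illustrates the idea). The `is_language` flag, the base scores, and the boost factor are all made up for illustration; they are not taken from the actual profile.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    label: str
    base_score: float   # made-up relevance score from the text match
    is_language: bool   # hypothetical flag: "this Item represents a language"


def rank_with_boost(candidates, boost=2.0):
    """Sort by score, multiplying the score of language Items by `boost`.

    Nothing is filtered out: non-language Items stay in the result list,
    they just rank lower when a language Item matches comparably well.
    """
    def boosted(candidate):
        factor = boost if candidate.is_language else 1.0
        return candidate.base_score * factor

    return sorted(candidates, key=boosted, reverse=True)


candidates = [
    Candidate("Engl", 1.0, False),
    Candidate("ENGL", 0.9, False),
    Candidate("English", 0.8, True),
    Candidate("England", 0.7, False),
    Candidate("English Wikipedia", 0.6, False),
]
ranked = rank_with_boost(candidates)
```

With these invented numbers, "English" moves to the top while the other four candidates remain in the list in their previous relative order.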

Details

Project | Branch | Lines +/-
mediawiki/extensions/CirrusSearch | master | +5 -4
operations/mediawiki-config | master | +4 -4
mediawiki/extensions/WikibaseCirrusSearch | wmf/1.39.0-wmf.17 | +1 -1
mediawiki/extensions/WikibaseCirrusSearch | master | +1 -1
mediawiki/extensions/WikibaseCirrusSearch | wmf/1.39.0-wmf.17 | +4 -6
operations/mediawiki-config | master | +1 -1
mediawiki/extensions/WikibaseCirrusSearch | master | +4 -6
operations/mediawiki-config | master | +80 -0
mediawiki/extensions/WikimediaMessages | master | +4 -0
mediawiki/extensions/WikibaseCirrusSearch | master | +0 -1
mediawiki/extensions/WikibaseCirrusSearch | wmf/1.39.0-wmf.17 | +1 -1
mediawiki/extensions/WikibaseCirrusSearch | master | +1 -1
operations/mediawiki-config | master | +17 -0
operations/mediawiki-config | master | +80 -0
mediawiki/extensions/Wikibase | master | +76 -21
mediawiki/extensions/Wikibase | master | +10 -18
mediawiki/extensions/WikibaseCirrusSearch | master | +3 -1
operations/mediawiki-config | master | +5 -0
mediawiki/extensions/Wikibase | master | +116 -40
mediawiki/extensions/PropertySuggester | master | +0 -1
mediawiki/extensions/WikibaseLexemeCirrusSearch | master | +7 -4
mediawiki/extensions/PropertySuggester | master | +3 -1
mediawiki/extensions/WikibaseCirrusSearch | master | +52 -7
mediawiki/extensions/WikibaseLexemeCirrusSearch | master | +47 -0

Event Timeline

There are a very large number of changes, so older changes are hidden.

Change 801791 had a related patch set uploaded (by DCausse; author: DCausse):

[mediawiki/extensions/WikibaseLexemeCirrusSearch@master] Add new config options to tune Special:NewLexeme...

https://gerrit.wikimedia.org/r/801791

Change 801793 had a related patch set uploaded (by DCausse; author: DCausse):

[operations/mediawiki-config@master] [cirrus] Add a custom profile for Special:NewLexeme

https://gerrit.wikimedia.org/r/801793

ItamarWMDE changed the task status from Open to Stalled. Jun 1 2022, 8:32 AM
ItamarWMDE added subscribers: dcausse, ItamarWMDE.

Will schedule a pairing session with @dcausse from the search platform team to proceed with this.

The patches above add a few placeholders to allow tuning a custom profile meant to be used by the language selector on Special:NewLexeme.

The open question is how to pass a new URI param added to wbsearchentities meant to switch between profiles back to EntitySearchElastic.

> The patches above add a few placeholders to allow tuning a custom profile meant to be used by the language selector on Special:NewLexeme.

Thanks! I would suggest keeping Special:NewLexeme out of the context name and options as much as possible, and only calling it something like “language” or “language selector”. My understanding is that, in theory, we would want to use the same profile when editing the language of an existing lexeme (though I don’t know if implementing that is on any roadmap at the moment).

> The open question is how to pass a new URI param added to wbsearchentities meant to switch between profiles back to EntitySearchElastic.

I would add it to the EntitySearchHelper::getRankedSearchResults() parameters (probably at the end, so it’s easier to do without breaking compatibility); non-CirrusSearch implementations of that interface would just ignore it.
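
As a rough illustration of that suggestion, the sketch below (in Python; the real interface is PHP) appends an optional trailing parameter, so existing callers and implementations that do not care about it keep working. The parameter order follows the `getRankedSearchResults(...)` calls visible in the stack traces in this task; the two helper classes and their return values are hypothetical stand-ins, not the real Wikibase classes.

```python
from abc import ABC, abstractmethod
from typing import List, Optional


class EntitySearchHelper(ABC):
    @abstractmethod
    def get_ranked_search_results(
        self,
        text: str,
        language_code: str,
        entity_type: str,
        limit: int,
        strict_language: bool,
        profile_context: Optional[str] = None,  # new, last, optional
    ) -> List[str]:
        ...


class PlainSearchHelper(EntitySearchHelper):
    """Stand-in for a non-CirrusSearch implementation: ignores the context."""

    def get_ranked_search_results(self, text, language_code, entity_type,
                                  limit, strict_language, profile_context=None):
        return [f"{entity_type}:{text}"][:limit]


class ProfileAwareSearchHelper(EntitySearchHelper):
    """Stand-in for an implementation that does look at the context."""

    def get_ranked_search_results(self, text, language_code, entity_type,
                                  limit, strict_language, profile_context=None):
        # None means "use the engine's usual prefix-search context".
        context = profile_context or "wikibase_prefix_search"
        return [f"{entity_type}:{text}@{context}"][:limit]
```

Callers that pass five arguments behave exactly as before, and only callers that know about profiles need to pass the sixth.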

> The patches above add a few placeholders to allow tuning a custom profile meant to be used by the language selector on Special:NewLexeme

Thanks for the patches, can't say I fully understand all of them, but they look like a good start. Otherwise, I second what Lucas mentioned re: passing the params from the URI, and also the config naming. Do you still want to organize some kind of call where you can walk us through these changes?

Change 801791 abandoned by DCausse:

[mediawiki/extensions/WikibaseLexemeCirrusSearch@master] Add new config options to tune Special:NewLexeme...

Reason:

moved to WikibaseCirrusSearch

https://gerrit.wikimedia.org/r/801791

Change 804344 had a related patch set uploaded (by DCausse; author: DCausse):

[mediawiki/extensions/Wikibase@master] EntitySearchHelper: pass a "profile context" to the engine

https://gerrit.wikimedia.org/r/804344

After a quick discussion w. @dcausse, I will finish the patches re: new search profile. After those are merged and deployed, we can then add a parameter to wbsearchentities to switch between profiles.

ItamarWMDE changed the task status from Stalled to In Progress. Fri, Jun 10, 9:30 AM
ItamarWMDE claimed this task.
ItamarWMDE moved this task from Incoming to Doing on the User-ItamarWMDE board.

Change 805844 had a related patch set uploaded (by Lucas Werkmeister (WMDE); author: Lucas Werkmeister (WMDE)):

[mediawiki/extensions/WikibaseLexemeCirrusSearch@master] Add $profileContext to getRankedSearchResults()

https://gerrit.wikimedia.org/r/805844

Change 806178 had a related patch set uploaded (by Lucas Werkmeister (WMDE); author: Lucas Werkmeister (WMDE)):

[mediawiki/extensions/PropertySuggester@master] Pass $profileContext into EntitySearchHelper

https://gerrit.wikimedia.org/r/806178

Change 806179 had a related patch set uploaded (by Lucas Werkmeister (WMDE); author: Lucas Werkmeister (WMDE)):

[mediawiki/extensions/PropertySuggester@master] Remove no-longer-used Phan suppression

https://gerrit.wikimedia.org/r/806179

For the API parameter, we’ll have a config setting defining the possible API values and mapping them to the underlying profile context value. This means it’s the administrator’s responsibility to make the API parameter useful (just like it’s already their responsibility to configure WikibaseCirrusSearch more generally), and to configure a value that the installed search extensions will understand. By default, the mapping would be:

$wgWBRepoSettings['searchProfiles'] = [ // name up for discussion
	'default' => null,
];

So a default Wikibase install would always pass null into the EntitySearchHelper, leaving it to the search implementation to interpret this profile however it wants. (Non-CirrusSearch implementations would ignore it at the moment.) In production, we would configure:

$wgWBRepoSettings['searchProfiles'] = [
	'default' => null,
	'language' => 'language_selector_prefix', // \Wikibase\Search\Elastic\Hooks::LANGUAGE_SELECTOR_PREFIX
];

We would not explicitly include the value that’s equivalent to null (EntitySearchElastic::CONTEXT_WIKIBASE_PREFIX = 'wikibase_prefix_search'). Just default and language.
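
A minimal sketch of how such a mapping could be consumed, written in Python rather than PHP: the public API value is looked up in the configured mapping, and anything not configured is rejected. The function name `resolve_profile_context` and the choice of `ValueError` are assumptions for illustration; the mapping itself mirrors the proposed production config.

```python
from typing import Dict, Optional

# Mirrors the production mapping proposed above
# (the setting name was still up for discussion).
SEARCH_PROFILES: Dict[str, Optional[str]] = {
    "default": None,
    "language": "language_selector_prefix",
}


def resolve_profile_context(
    api_value: str, profiles: Dict[str, Optional[str]]
) -> Optional[str]:
    """Translate the public API value into the internal profile context."""
    if api_value not in profiles:
        raise ValueError(f"unknown search profile: {api_value!r}")
    return profiles[api_value]
```

Under this sketch, `wikibase_prefix_search` is not a valid API value even though it is what `None` ends up meaning internally, which matches the "just default and language" intent.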

Change 806386 had a related patch set uploaded (by Lucas Werkmeister (WMDE); author: Lucas Werkmeister (WMDE)):

[mediawiki/extensions/Wikibase@master] Add profile parameter to entity search APIs

https://gerrit.wikimedia.org/r/806386

Change 806178 merged by jenkins-bot:

[mediawiki/extensions/PropertySuggester@master] Pass $profileContext into EntitySearchHelper

https://gerrit.wikimedia.org/r/806178

Change 805844 merged by jenkins-bot:

[mediawiki/extensions/WikibaseLexemeCirrusSearch@master] Add $profileContext to getRankedSearchResults()

https://gerrit.wikimedia.org/r/805844

Change 804344 merged by jenkins-bot:

[mediawiki/extensions/Wikibase@master] EntitySearchHelper: pass a "profile context" to the engine

https://gerrit.wikimedia.org/r/804344

Change 806179 merged by jenkins-bot:

[mediawiki/extensions/PropertySuggester@master] Remove no-longer-used Phan suppression

https://gerrit.wikimedia.org/r/806179

To roll out this new feature, we’ll temporarily make the wbsearchentities API only expose the new API parameter if it’s been configured in the production config (i.e. when there’s another possible value besides default). This way, we can let the wbsearchentities change roll out with the train, and then use config changes alone to control when, and on which wikis, the new parameter becomes available. So the current plan for deploying this feature is:

  • the Wikibase EntitySearchHelper interface changes are merged and roll out with the train – no change yet
  • the config change for the new profile is deployed – no change yet
  • wbsearchentities adds a new parameter iff more than one search profile is configured, to roll out with the train – no change yet
  • we configure two API search profiles – observable API change, deployment coordinated with significant change announcement, probably deployed to Test Wikidata two weeks before real Wikidata
  • later, wbsearchentities always adds the new parameter – no change in production (but at this point the parameter becomes available on other wikis too)

We’ll start working on the mentioned significant change announcement once the EntitySearchHelper changes and the config change for the new profile have been deployed to production. (The wbsearchentities change probably won’t make it into this week’s train, but that should be fine.)
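
The temporary "only expose the parameter if more than one profile is configured" step can be sketched as follows (Python for illustration; the actual API module is PHP). The shape of the returned parameter spec and the function name `api_parameters` are hypothetical; only the condition itself comes from the plan above.

```python
def api_parameters(search_profiles):
    """Return the parameter spec for a sketched search API module.

    The `profile` parameter is only advertised when the config offers a
    real choice, i.e. more than one search profile.
    """
    params = {"search": {"required": True}}
    if len(search_profiles) > 1:
        params["profile"] = {
            "allowed": sorted(search_profiles),
            "default": "default",
        }
    return params
```

With the default mapping (`{'default': None}`) the API surface is unchanged, so the code can ride the train first and the observable change happens only when the second profile is configured.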

Change 806927 had a related patch set uploaded (by Lucas Werkmeister (WMDE); author: Lucas Werkmeister (WMDE)):

[mediawiki/extensions/Wikibase@master] Make profile search parameter available unconditionally

https://gerrit.wikimedia.org/r/806927

Change 806929 had a related patch set uploaded (by Lucas Werkmeister (WMDE); author: Lucas Werkmeister (WMDE)):

[mediawiki/extensions/WikimediaMessages@master] Add messages for language entity search profile

https://gerrit.wikimedia.org/r/806929

Change 806930 had a related patch set uploaded (by Lucas Werkmeister (WMDE); author: Lucas Werkmeister (WMDE)):

[operations/mediawiki-config@master] Configure wbsearchentities profile parameter on Test Wikidata

https://gerrit.wikimedia.org/r/806930

Change 806931 had a related patch set uploaded (by Lucas Werkmeister (WMDE); author: Lucas Werkmeister (WMDE)):

[operations/mediawiki-config@master] Configure wbsearchentities profile parameter on Wikidata

https://gerrit.wikimedia.org/r/806931

Change 806932 had a related patch set uploaded (by Lucas Werkmeister (WMDE); author: Lucas Werkmeister (WMDE)):

[mediawiki/extensions/WikibaseCirrusSearch@master] Pass $searchProfiles into SearchEntities API

https://gerrit.wikimedia.org/r/806932

Change 806933 had a related patch set uploaded (by Lucas Werkmeister (WMDE); author: Lucas Werkmeister (WMDE)):

[mediawiki/extensions/WikibaseCirrusSearch@master] Remove no-longer-used Phan suppression

https://gerrit.wikimedia.org/r/806933

Change 801793 merged by jenkins-bot:

[operations/mediawiki-config@master] [cirrus] Add a custom profile for the wikibase language selector

https://gerrit.wikimedia.org/r/801793

I’m afraid this needs a bit more work:

lucaswerkmeister-wmde@mwdebug1001:~$ mwscript extensions/Wikibase/repo/maintenance/searchEntities.php wikidatawiki --entity-type item --language en --profile-context language_selector_prefix <<<Engl
Please input search terms...
CirrusSearch\Profile\SearchProfileException from line 303 of /srv/mediawiki/php-1.39.0-wmf.17/extensions/CirrusSearch/includes/Profile/SearchProfileService.php: A profile repository type rescore_function_chains named wikibase_config is already registered.
#0 /srv/mediawiki/php-1.39.0-wmf.17/extensions/WikibaseCirrusSearch/src/Hooks.php(281): CirrusSearch\Profile\SearchProfileService->registerRepository(Object(CirrusSearch\Profile\SearchProfileRepositoryTransformer))
#1 /srv/mediawiki/php-1.39.0-wmf.17/extensions/WikibaseCirrusSearch/src/Hooks.php(145): Wikibase\Search\Elastic\Hooks::registerSearchProfiles(Object(CirrusSearch\Profile\SearchProfileService), Object(Wikibase\Search\Elastic\WikibaseSearchConfig), Array)
#2 /srv/mediawiki/php-1.39.0-wmf.17/includes/HookContainer/HookContainer.php(338): Wikibase\Search\Elastic\Hooks::onCirrusSearchProfileService(Object(CirrusSearch\Profile\SearchProfileService))
#3 /srv/mediawiki/php-1.39.0-wmf.17/includes/HookContainer/HookContainer.php(137): MediaWiki\HookContainer\HookContainer->callLegacyHook('CirrusSearchPro...', Array, Array, Array)
#4 /srv/mediawiki/php-1.39.0-wmf.17/extensions/CirrusSearch/includes/CirrusSearchHookRunner.php(100): MediaWiki\HookContainer\HookContainer->run('CirrusSearchPro...', Array)
#5 /srv/mediawiki/php-1.39.0-wmf.17/extensions/CirrusSearch/includes/Profile/SearchProfileServiceFactory.php(160): CirrusSearch\CirrusSearchHookRunner->onCirrusSearchProfileService(Object(CirrusSearch\Profile\SearchProfileService))
#6 /srv/mediawiki/php-1.39.0-wmf.17/extensions/CirrusSearch/includes/SearchConfig.php(308): CirrusSearch\Profile\SearchProfileServiceFactory->loadService(Object(CirrusSearch\SearchConfig))
#7 /srv/mediawiki/php-1.39.0-wmf.17/extensions/WikibaseCirrusSearch/src/EntitySearchElastic.php(156): CirrusSearch\SearchConfig->getProfileService()
#8 /srv/mediawiki/php-1.39.0-wmf.17/extensions/WikibaseCirrusSearch/src/EntitySearchElastic.php(206): Wikibase\Search\Elastic\EntitySearchElastic->loadProfile(Object(CirrusSearch\Search\SearchContext), 'en')
#9 /srv/mediawiki/php-1.39.0-wmf.17/extensions/WikibaseCirrusSearch/src/EntitySearchElastic.php(301): Wikibase\Search\Elastic\EntitySearchElastic->getElasticSearchQuery('Engl', 'en', 'item', false, Object(CirrusSearch\Search\SearchContext))
#10 /srv/mediawiki/php-1.39.0-wmf.17/extensions/Wikibase/repo/includes/Api/CombinedEntitySearchHelper.php(49): Wikibase\Search\Elastic\EntitySearchElastic->getRankedSearchResults('Engl', 'en', 'item', 5, false, 'language_select...')
#11 /srv/mediawiki/php-1.39.0-wmf.17/extensions/Wikibase/repo/includes/Api/TypeDispatchingEntitySearchHelper.php(48): Wikibase\Repo\Api\CombinedEntitySearchHelper->getRankedSearchResults('Engl', 'en', 'item', 5, false, 'language_select...')
#12 /srv/mediawiki/php-1.39.0-wmf.17/extensions/Wikibase/repo/maintenance/searchEntities.php(106): Wikibase\Repo\Api\TypeDispatchingEntitySearchHelper->getRankedSearchResults('Engl', 'en', 'item', 5, false, 'language_select...')
#13 /srv/mediawiki/php-1.39.0-wmf.17/includes/OrderedStreamingForkController.php(142): Wikibase\Repo\Maintenance\SearchEntities->doSearch('Engl')
#14 /srv/mediawiki/php-1.39.0-wmf.17/includes/OrderedStreamingForkController.php(69): OrderedStreamingForkController->consumeNoFork()
#15 /srv/mediawiki/php-1.39.0-wmf.17/extensions/Wikibase/repo/maintenance/searchEntities.php(65): OrderedStreamingForkController->start()
#16 /srv/mediawiki/php-1.39.0-wmf.17/maintenance/includes/MaintenanceRunner.php(309): Wikibase\Repo\Maintenance\SearchEntities->execute()
#17 /srv/mediawiki/php-1.39.0-wmf.17/maintenance/doMaintenance.php(85): MediaWiki\Maintenance\MaintenanceRunner->run()
#18 /srv/mediawiki/php-1.39.0-wmf.17/extensions/Wikibase/repo/maintenance/searchEntities.php(160): require_once('/srv/mediawiki/...')
#19 /srv/mediawiki/multiversion/MWScript.php(120): require_once('/srv/mediawiki/...')
#20 {main}

I’ll revert it for now.

ItamarWMDE changed the task status from In Progress to Stalled. Wed, Jun 22, 2:55 PM

Change 807978 had a related patch set uploaded (by DCausse; author: DCausse):

[mediawiki/extensions/WikibaseCirrusSearch@master] Do not re-use "wikibase_config" for registering the language selector...

https://gerrit.wikimedia.org/r/807978

The above patch should fix the issue; I forgot that profile repositories must have unique names, sorry about that!
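
For readers unfamiliar with the constraint, here is a small Python sketch of a registry that rejects a second repository with the same type and name, which is the situation the exception above reports. The class is a simplified stand-in, not CirrusSearch's actual `SearchProfileService`, and the replacement name used at the end is hypothetical.

```python
class ProfileRepositoryRegistry:
    """Keeps profile repositories keyed by (type, name)."""

    def __init__(self):
        self._repos = {}

    def register(self, repo_type, name, profiles):
        key = (repo_type, name)
        if key in self._repos:
            # Same wording as the exception quoted in this task.
            raise ValueError(
                f"A profile repository type {repo_type} named {name} "
                "is already registered."
            )
        self._repos[key] = profiles


registry = ProfileRepositoryRegistry()
registry.register("rescore_function_chains", "wikibase_config", {})

# Registering the language selector profiles under a distinct
# (hypothetical) name avoids the clash:
registry.register(
    "rescore_function_chains", "wikibase_config_language_selector", {}
)
```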

Change 807978 merged by jenkins-bot:

[mediawiki/extensions/WikibaseCirrusSearch@master] Do not re-use "wikibase_config" for registering the language selector...

https://gerrit.wikimedia.org/r/807978

Change 807902 had a related patch set uploaded (by Lucas Werkmeister (WMDE); author: DCausse):

[mediawiki/extensions/WikibaseCirrusSearch@wmf/1.39.0-wmf.17] Do not re-use "wikibase_config" for registering the language selector...

https://gerrit.wikimedia.org/r/807902

Change 807902 merged by jenkins-bot:

[mediawiki/extensions/WikibaseCirrusSearch@wmf/1.39.0-wmf.17] Do not re-use "wikibase_config" for registering the language selector...

https://gerrit.wikimedia.org/r/807902

Mentioned in SAL (#wikimedia-operations) [2022-06-23T15:11:26Z] <lucaswerkmeister-wmde@deploy1002> Synchronized php-1.39.0-wmf.17/extensions/WikibaseCirrusSearch/src/Hooks.php: Backport: [[gerrit:807902|Do not re-use "wikibase_config" for registering the language selector... (T307869)]] (duration: 03m 22s)

Change 808011 had a related patch set uploaded (by Lucas Werkmeister (WMDE); author: Lucas Werkmeister (WMDE)):

[operations/mediawiki-config@master] [cirrus] Add a custom profile for the wikibase language selector

https://gerrit.wikimedia.org/r/808011

> The above patch should fix the issue; I forgot that profile repositories must have unique names, sorry about that!

Thanks! I backported it to wmf.17 and scheduled a repeat of the config change for Monday.

Change 808011 merged by jenkins-bot:

[operations/mediawiki-config@master] [cirrus] Add a custom profile for the wikibase language selector

https://gerrit.wikimedia.org/r/808011

Mentioned in SAL (#wikimedia-operations) [2022-06-27T13:12:17Z] <lucaswerkmeister-wmde@deploy1002> Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:808011|[cirrus] Add a custom profile for the wikibase language selector (T307869)]] (1/4) (duration: 03m 35s)

Mentioned in SAL (#wikimedia-operations) [2022-06-27T13:16:08Z] <lucaswerkmeister-wmde@deploy1002> Synchronized wmf-config/InitialiseSettings-labs.php: Config: [[gerrit:808011|[cirrus] Add a custom profile for the wikibase language selector (T307869)]] (2/4) (duration: 03m 33s)

Mentioned in SAL (#wikimedia-operations) [2022-06-27T13:20:16Z] <lucaswerkmeister-wmde@deploy1002> Synchronized wmf-config/SearchSettingsForWikibase.php: Config: [[gerrit:808011|[cirrus] Add a custom profile for the wikibase language selector (T307869)]] (3/4) (duration: 03m 32s)

Mentioned in SAL (#wikimedia-operations) [2022-06-27T13:24:04Z] <lucaswerkmeister-wmde@deploy1002> Synchronized wmf-config/SearchSettingsForWikidata.php: Config: [[gerrit:808011|[cirrus] Add a custom profile for the wikibase language selector (T307869)]] (4/4) (duration: 03m 29s)

It looks like the new profile isn’t fully working yet, at least when used via the maintenance script – I get the same results for “Engl” with or without the profile context:

lucaswerkmeister-wmde@mwdebug1001:~$ mwscript extensions/Wikibase/repo/maintenance/searchEntities.php wikidatawiki --entity-type item --language en --profile-context language_selector_prefix <<<Engl 2> /dev/null | jq '.rows | .[] | .snippets | { term, type, text }'
{
  "term": "Engl",
  "type": "label",
  "text": "family name"
}
{
  "term": "ENGL",
  "type": "alias",
  "text": "protein-coding gene in the species Homo sapiens"
}
{
  "term": "English",
  "type": "label",
  "text": "West Germanic language"
}
{
  "term": "England",
  "type": "label",
  "text": "country in north-west Europe, part of the United Kingdom"
}
{
  "term": "English Wikipedia",
  "type": "label",
  "text": "English-language edition of Wikipedia"
}
lucaswerkmeister-wmde@mwdebug1001:~$ mwscript extensions/Wikibase/repo/maintenance/searchEntities.php wikidatawiki --entity-type item --language en <<<Engl 2> /dev/null | jq '.rows | .[] | .snippets | { term, type, text }'
{
  "term": "Engl",
  "type": "label",
  "text": "family name"
}
{
  "term": "ENGL",
  "type": "alias",
  "text": "protein-coding gene in the species Homo sapiens"
}
{
  "term": "English",
  "type": "label",
  "text": "West Germanic language"
}
{
  "term": "England",
  "type": "label",
  "text": "country in north-west Europe, part of the United Kingdom"
}
{
  "term": "English Wikipedia",
  "type": "label",
  "text": "English-language edition of Wikipedia"
}

Same for “Deu”:

lucaswerkmeister-wmde@mwdebug1001:~$ mwscript extensions/Wikibase/repo/maintenance/searchEntities.php wikidatawiki --entity-type item --language en --profile-context language_selector_prefix <<<Deu 2> /dev/null | jq '.rows | .[] | .snippets | { term, type, text }'
{
  "term": "Deutsche Eislauf-Union",
  "type": "label",
  "text": "voluntary association"
}
{
  "term": "Deutschland",
  "type": "alias",
  "text": "country in Central Europe"
}
{
  "term": "Deutsch",
  "type": "alias",
  "text": "West Germanic language spoken mainly in Central Europe"
}
{
  "term": "Deutsche Demokratische Republik",
  "type": "alias",
  "text": "1949–1990 country in central Europe, unified into modern Germany"
}
{
  "term": "Deutsches Kaiserreich",
  "type": "alias",
  "text": "empire in Central Europe between 1871 and 1918"
}
lucaswerkmeister-wmde@mwdebug1001:~$ mwscript extensions/Wikibase/repo/maintenance/searchEntities.php wikidatawiki --entity-type item --language en <<<Deu 2> /dev/null | jq '.rows | .[] | .snippets | { term, type, text }'
{
  "term": "Deutsche Eislauf-Union",
  "type": "label",
  "text": "voluntary association"
}
{
  "term": "Deutschland",
  "type": "alias",
  "text": "country in Central Europe"
}
{
  "term": "Deutsch",
  "type": "alias",
  "text": "West Germanic language spoken mainly in Central Europe"
}
{
  "term": "Deutsche Demokratische Republik",
  "type": "alias",
  "text": "1949–1990 country in central Europe, unified into modern Germany"
}
{
  "term": "Deutsches Kaiserreich",
  "type": "alias",
  "text": "empire in Central Europe between 1871 and 1918"
}

Or “Frenc”:

lucaswerkmeister-wmde@mwdebug1001:~$ mwscript extensions/Wikibase/repo/maintenance/searchEntities.php wikidatawiki --entity-type item --language en --profile-context language_selector_prefix <<<Frenc 2> /dev/null | jq '.rows | .[] | .snippets | { term, type, text }'
{
  "term": "French Republic",
  "type": "alias",
  "text": "country in Western Europe"
}
{
  "term": "French",
  "type": "label",
  "text": "Romance language"
}
{
  "term": "French Wikipedia",
  "type": "label",
  "text": "French-language edition of Wikipedia"
}
{
  "term": "French Revolution",
  "type": "label",
  "text": "1789 to 1799 social and political revolution in France"
}
{
  "term": "French Guiana",
  "type": "label",
  "text": "Overseas department of France in South America"
}
lucaswerkmeister-wmde@mwdebug1001:~$ mwscript extensions/Wikibase/repo/maintenance/searchEntities.php wikidatawiki --entity-type item --language en <<<Frenc 2> /dev/null | jq '.rows | .[] | .snippets | { term, type, text }'
{
  "term": "French Republic",
  "type": "alias",
  "text": "country in Western Europe"
}
{
  "term": "French",
  "type": "label",
  "text": "Romance language"
}
{
  "term": "French Wikipedia",
  "type": "label",
  "text": "French-language edition of Wikipedia"
}
{
  "term": "French Revolution",
  "type": "label",
  "text": "1789 to 1799 social and political revolution in France"
}
{
  "term": "French Guiana",
  "type": "label",
  "text": "Overseas department of France in South America"
}

It looks like this isn’t a bug in the maintenance script, the profile context is at least making it to CirrusSearch – if I put in a wrong value, I get an error:

lucaswerkmeister-wmde@mwdebug1001:~$ mwscript extensions/Wikibase/repo/maintenance/searchEntities.php wikidatawiki --entity-type item --language en --profile-context unknown_profile_context <<<Engl
Please input search terms...
CirrusSearch\Profile\SearchProfileException from line 273 of /srv/mediawiki/php-1.39.0-wmf.17/extensions/CirrusSearch/includes/Profile/SearchProfileService.php: No default profile found for wikibase_prefix_querybuilder in context unknown_profile_context
#0 /srv/mediawiki/php-1.39.0-wmf.17/extensions/CirrusSearch/includes/Profile/SearchProfileService.php(258): CirrusSearch\Profile\SearchProfileService->getProfileName('wikibase_prefix...', 'unknown_profile...', Array)
#1 /srv/mediawiki/php-1.39.0-wmf.17/extensions/WikibaseCirrusSearch/src/EntitySearchElastic.php(158): CirrusSearch\Profile\SearchProfileService->loadProfile('wikibase_prefix...', 'unknown_profile...', NULL, Array)
#2 /srv/mediawiki/php-1.39.0-wmf.17/extensions/WikibaseCirrusSearch/src/EntitySearchElastic.php(206): Wikibase\Search\Elastic\EntitySearchElastic->loadProfile(Object(CirrusSearch\Search\SearchContext), 'en')
#3 /srv/mediawiki/php-1.39.0-wmf.17/extensions/WikibaseCirrusSearch/src/EntitySearchElastic.php(301): Wikibase\Search\Elastic\EntitySearchElastic->getElasticSearchQuery('Engl', 'en', 'item', false, Object(CirrusSearch\Search\SearchContext))
#4 /srv/mediawiki/php-1.39.0-wmf.17/extensions/Wikibase/repo/includes/Api/CombinedEntitySearchHelper.php(49): Wikibase\Search\Elastic\EntitySearchElastic->getRankedSearchResults('Engl', 'en', 'item', 5, false, 'unknown_profile...')
#5 /srv/mediawiki/php-1.39.0-wmf.17/extensions/Wikibase/repo/includes/Api/TypeDispatchingEntitySearchHelper.php(48): Wikibase\Repo\Api\CombinedEntitySearchHelper->getRankedSearchResults('Engl', 'en', 'item', 5, false, 'unknown_profile...')
#6 /srv/mediawiki/php-1.39.0-wmf.17/extensions/Wikibase/repo/maintenance/searchEntities.php(106): Wikibase\Repo\Api\TypeDispatchingEntitySearchHelper->getRankedSearchResults('Engl', 'en', 'item', 5, false, 'unknown_profile...')
#7 /srv/mediawiki/php-1.39.0-wmf.17/includes/OrderedStreamingForkController.php(142): Wikibase\Repo\Maintenance\SearchEntities->doSearch('Engl')
#8 /srv/mediawiki/php-1.39.0-wmf.17/includes/OrderedStreamingForkController.php(69): OrderedStreamingForkController->consumeNoFork()
#9 /srv/mediawiki/php-1.39.0-wmf.17/extensions/Wikibase/repo/maintenance/searchEntities.php(65): OrderedStreamingForkController->start()
#10 /srv/mediawiki/php-1.39.0-wmf.17/maintenance/includes/MaintenanceRunner.php(309): Wikibase\Repo\Maintenance\SearchEntities->execute()
#11 /srv/mediawiki/php-1.39.0-wmf.17/maintenance/doMaintenance.php(85): MediaWiki\Maintenance\MaintenanceRunner->run()
#12 /srv/mediawiki/php-1.39.0-wmf.17/extensions/Wikibase/repo/maintenance/searchEntities.php(160): require_once('/srv/mediawiki/...')
#13 /srv/mediawiki/multiversion/MWScript.php(120): require_once('/srv/mediawiki/...')
#14 {main}
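
The two observations fit together as in the Python sketch below: a per-context lookup fails loudly for an unregistered context, but a registered context that resolves to the same profile as the classic one produces identical results without any error. The table contents, including the profile name `default`, are illustrative assumptions and not the real CirrusSearch configuration.

```python
class NoDefaultProfile(LookupError):
    pass


# (profile type, context) -> default profile name; contents are illustrative.
DEFAULTS = {
    ("wikibase_prefix_querybuilder", "wikibase_prefix_search"): "default",
    ("wikibase_prefix_querybuilder", "language_selector_prefix"): "default",
}


def default_profile_name(profile_type, context, defaults=DEFAULTS):
    """Look up the default profile for a context, or fail loudly."""
    try:
        return defaults[(profile_type, context)]
    except KeyError:
        raise NoDefaultProfile(
            f"No default profile found for {profile_type} in context {context}"
        ) from None
```

So an error for `unknown_profile_context` shows that the context reaches the profile service, while saying nothing about whether `language_selector_prefix` is wired to different settings.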

Change 808903 had a related patch set uploaded (by DCausse; author: DCausse):

[operations/mediawiki-config@master] Do not set wgWBCSLanguageSelectorRescoreProfile twice

https://gerrit.wikimedia.org/r/808903

Change 808904 had a related patch set uploaded (by DCausse; author: DCausse):

[mediawiki/extensions/WikibaseCirrusSearch@master] Use WBCS config when registering language selector profile

https://gerrit.wikimedia.org/r/808904

Sorry about that, there was yet another issue in the WikibaseCirrusSearch hook that caused the config to be ignored, so the language selector profile context simply used exactly the same settings as the classic entity completion search.
There was also a typo in mw-config, fixed in one of the attached patches.

Change 808904 merged by jenkins-bot:

[mediawiki/extensions/WikibaseCirrusSearch@master] Use WBCS config when registering language selector profile

https://gerrit.wikimedia.org/r/808904

Change 808445 had a related patch set uploaded (by Lucas Werkmeister (WMDE); author: DCausse):

[mediawiki/extensions/WikibaseCirrusSearch@wmf/1.39.0-wmf.17] Use WBCS config when registering language selector profile

https://gerrit.wikimedia.org/r/808445

Change 808903 merged by jenkins-bot:

[operations/mediawiki-config@master] Do not set wgWBCSLanguageSelectorRescoreProfile twice

https://gerrit.wikimedia.org/r/808903

Mentioned in SAL (#wikimedia-operations) [2022-06-27T15:15:13Z] <lucaswerkmeister-wmde@deploy1002> Synchronized wmf-config/SearchSettingsForWikidata.php: Config: [[gerrit:808903|Do not set wgWBCSLanguageSelectorRescoreProfile twice (T307869)]] (duration: 03m 41s)

Change 808445 merged by jenkins-bot:

[mediawiki/extensions/WikibaseCirrusSearch@wmf/1.39.0-wmf.17] Use WBCS config when registering language selector profile

https://gerrit.wikimedia.org/r/808445

Mentioned in SAL (#wikimedia-operations) [2022-06-27T15:32:17Z] <lucaswerkmeister-wmde@deploy1002> Synchronized php-1.39.0-wmf.17/extensions/WikibaseCirrusSearch/src/Hooks.php: Backport: [[gerrit:808445|Use WBCS config when registering language selector profile (T307869)]] (duration: 03m 38s)

Change 808941 had a related patch set uploaded (by DCausse; author: DCausse):

[operations/mediawiki-config@master] Increase weights on the language selector statement boosts

https://gerrit.wikimedia.org/r/808941

Change 808942 had a related patch set uploaded (by DCausse; author: DCausse):

[mediawiki/extensions/WikibaseCirrusSearch@master] Use LanguageSelectorStatementBoost instead of its plurar form

https://gerrit.wikimedia.org/r/808942

Change 808942 merged by jenkins-bot:

[mediawiki/extensions/WikibaseCirrusSearch@master] Use LanguageSelectorStatementBoost instead of its plurar form

https://gerrit.wikimedia.org/r/808942

Change 809118 had a related patch set uploaded (by Lucas Werkmeister (WMDE); author: DCausse):

[mediawiki/extensions/WikibaseCirrusSearch@wmf/1.39.0-wmf.17] Use LanguageSelectorStatementBoost instead of its plurar form

https://gerrit.wikimedia.org/r/809118

Change 809118 merged by jenkins-bot:

[mediawiki/extensions/WikibaseCirrusSearch@wmf/1.39.0-wmf.17] Use LanguageSelectorStatementBoost instead of its plurar form

https://gerrit.wikimedia.org/r/809118

Mentioned in SAL (#wikimedia-operations) [2022-06-28T13:03:28Z] <lucaswerkmeister-wmde@deploy1002> Synchronized php-1.39.0-wmf.17/extensions/WikibaseCirrusSearch/src/Hooks.php: Backport: [[gerrit:809118|Use LanguageSelectorStatementBoost instead of its plurar form (T307869)]] (duration: 03m 35s)

Change 809209 had a related patch set uploaded (by DCausse; author: DCausse):

[mediawiki/extensions/CirrusSearch@master] Construct a match query from TermBoostScoreBuilder

https://gerrit.wikimedia.org/r/809209

There is yet another problem (see patch above that should fix it). I'm sorry that deploying this profile is such a pain, it demonstrates a clear problem in the way we (the search team) deploy such features/profiles and I filed T311528 to discuss and hopefully improve the situation.

Question from @Lea_WMDE and @Evelien_WMDE from today's LOD sync: will this have any influence on other Wikibase installations? Since they are having Elastic issues in wbcloud they want to make sure it's not getting worse for them accidentally due to us creating a new profile for all other Wikibase installations that use the new Lexeme creation page.

No new profiles should be created for other Wikibase installations, as most of the Wikidata-specific options are managed in WMF-specific config, not in Wikibase or CirrusSearch, so the new Lexeme creation page should behave exactly as before.
The fixes we had to make in CirrusSearch should not affect anything unless other Wikibase installations had tuned these broken settings (which I doubt, since they were totally broken and ineffective).

The bugfix that might affect Wikibase installations relying on CirrusSearch and Elastic is:

  • Fixed the handling of the configuration variable wgWBCSStatementBoost which was ignored.

@Lea_WMDE @Evelien_WMDE do you have a link to such problems with Elastic in wbcloud?

I think this would depend on which versions of the CirrusSearch and WikibaseCirrusSearch extensions are used in those Wikibase installations, but IIUC this change should be non-breaking. @dcausse, please correct me if I'm wrong.

Change 808941 merged by jenkins-bot:

[operations/mediawiki-config@master] Increase weights on the language selector statement boosts

https://gerrit.wikimedia.org/r/808941

Mentioned in SAL (#wikimedia-operations) [2022-06-29T15:51:40Z] <lucaswerkmeister-wmde@deploy1002> Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:808941|Increase weights on the language selector statement boosts (T307869)]] (expected to be a no-op) (duration: 03m 21s)

Change 809209 merged by jenkins-bot:

[mediawiki/extensions/CirrusSearch@master] Construct a match query from TermBoostScoreBuilder

https://gerrit.wikimedia.org/r/809209