Unexpected Behavior: Unable to find items by mul terms
Closed, Resolved · Public · 3 Estimated Story Points · BUG REPORT

Description

Currently, if an item has two labels—one in the mul language (which acts as the default for all languages) and another in Spanish (or any other specific language)—searching for the mul label using any language except Spanish will return an empty result, even though the item exists and has a label in mul.

Example Item

  • Item: Q777 on Beta Wikidata
  • Labels:
    • mul: mul-label
    • es: multietiqueta

Example query using the simple search endpoint:

Search query:

https://wikidata.beta.wmflabs.org/w/rest.php/wikibase/v0/search/items?language=en&q=mul-label

Search parameters:

  • Query: mul-label
  • Language: en

Result:

{
  "results": []
}

Same example using wbsearchentities:

Search query:

https://wikidata.beta.wmflabs.org/w/api.php?action=wbsearchentities&format=json&search=mul-label&language=en&formatversion=2

Search parameters:

  • search: mul-label
  • Language: en

Result:

{
	"searchinfo": {
		"search": "mul-label"
	},
	"search": [],
	"success": 1
}
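
The two requests above can be rebuilt programmatically, which makes it easier to repeat the check with other query strings or languages. The following is a minimal sketch, not part of the original report: the helper names (`rest_search_url`, `wbsearchentities_url`, `fetch_json`) and the User-Agent string are hypothetical, and only the URLs and parameters are taken from the queries shown above.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Base URL taken from the queries in this report.
BETA_BASE = "https://wikidata.beta.wmflabs.org/w"


def rest_search_url(query, language, base=BETA_BASE):
    """Build the simple search (REST) URL used in the first example."""
    params = urlencode({"language": language, "q": query})
    return f"{base}/rest.php/wikibase/v0/search/items?{params}"


def wbsearchentities_url(query, language, base=BETA_BASE):
    """Build the wbsearchentities URL used in the second example."""
    params = urlencode({
        "action": "wbsearchentities",
        "format": "json",
        "search": query,
        "language": language,
        "formatversion": 2,
    })
    return f"{base}/api.php?{params}"


def fetch_json(url):
    """Fetch a URL and decode the JSON body (performs a network request)."""
    # The User-Agent value is a placeholder chosen for this sketch.
    request = Request(url, headers={"User-Agent": "mul-search-repro/0.1 (example)"})
    with urlopen(request) as response:
        return json.load(response)


if __name__ == "__main__":
    for url in (rest_search_url("mul-label", "en"),
                wbsearchentities_url("mul-label", "en")):
        print(url)
        print(json.dumps(fetch_json(url), indent=2))
```

Running the script prints each URL followed by the live response, so the output depends on the current state of the Beta Wikidata search index.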

Example querying the Spanish label using the simple search endpoint:

Search query:

https://wikidata.beta.wmflabs.org/w/rest.php/wikibase/v0/search/items?language=en&q=multietiqueta

Search parameters:

  • Query: multietiqueta
  • Language: en

Result:

{
	"results": [
		{
			"id": "Q777",
			"display-label": {
				"language": "mul",
				"value": "mul-label"
			},
			"description": {
				"language": "en",
				"value": "XZwcurkjRsNhaBuFuwka"
			},
			"match": {
				"type": "label",
				"language": "es",
				"text": "multietiqueta"
			}
		}
	]
}

Summary

  • Labels in the mul language are not directly searchable using either the REST simple search endpoint or wbsearchentities.
  • However, a mul label can appear as the display label in the search results if the item is matched via another language label (e.g. es), and mul is selected as the best match for display (due to fallback rules).

Expected behavior: Labels in mul should be searchable regardless of the language.
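
The observed and expected behaviour can be illustrated with a toy model. This is only a sketch under stated assumptions and not the WikibaseCirrusSearch implementation: it assumes that matching only considers labels whose language has a field in the search index, that matching is a case-insensitive prefix match, and that the display label for `en` is picked from the chain `en`, then `mul`. The function name `search` and its parameters are hypothetical.

```python
# Toy model only: NOT how WikibaseCirrusSearch is implemented.
# Labels of the example item from this report.
ITEMS = {
    "Q777": {"mul": "mul-label", "es": "multietiqueta"},
}


def search(items, query, indexed_languages, display_chain):
    """Return items with a label matching `query` in an indexed language.

    indexed_languages: languages that have a field in the (assumed) index.
    display_chain: languages tried in order when picking the display label.
    """
    results = []
    for item_id, labels in items.items():
        # Matching step: only labels in indexed languages can match.
        match = next(
            ((lang, text) for lang, text in labels.items()
             if lang in indexed_languages
             and text.lower().startswith(query.lower())),
            None,
        )
        if match is None:
            continue
        # Display step: independent of matching, uses the fallback chain.
        display = next(
            ({"language": lang, "value": labels[lang]}
             for lang in display_chain if lang in labels),
            None,
        )
        results.append({
            "id": item_id,
            "display-label": display,
            "match": {"type": "label", "language": match[0], "text": match[1]},
        })
    return results


if __name__ == "__main__":
    chain = ["en", "mul"]
    # Index without a mul field: the mul label cannot be found.
    print(search(ITEMS, "mul-label", {"en", "es"}, chain))
    # The Spanish label matches, and mul is used for display.
    print(search(ITEMS, "multietiqueta", {"en", "es"}, chain))
    # Index with a mul field: the expected behaviour.
    print(search(ITEMS, "mul-label", {"en", "es", "mul"}, chain))
```

In this model the matching step and the display step are separate, which is why a mul label can show up in results without being searchable itself.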


A similar wbsearchentities request does find items by their mul labels on wikidata.org (production). The search index mapping there contains mul: https://www.wikidata.org/w/api.php?action=cirrus-mapping-dump&format=json&formatversion=2
whereas the mapping for Beta Wikidata doesn't: https://wikidata.beta.wmflabs.org/w/api.php?action=cirrus-mapping-dump&format=json&formatversion=2
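
The comparison of the two mapping dumps can be scripted. The sketch below makes no assumption about the exact nesting of the `cirrus-mapping-dump` output: it walks the decoded JSON and reports whether any object stored under a key named `labels` contains a key named `mul` somewhere beneath it. The helper names and the User-Agent string are hypothetical, and the assumption that the language fields sit under a `labels` key is mine, not something stated in the report.

```python
import json
from urllib.request import Request, urlopen


def contains_key(node, key):
    """Return True if `key` occurs as a dict key anywhere inside `node`."""
    if isinstance(node, dict):
        return key in node or any(contains_key(v, key) for v in node.values())
    if isinstance(node, list):
        return any(contains_key(v, key) for v in node)
    return False


def has_language_field(mapping, language="mul", field="labels"):
    """Return True if some `field` entry in the mapping contains `language`."""
    if isinstance(mapping, dict):
        for name, value in mapping.items():
            if name == field and contains_key(value, language):
                return True
            if has_language_field(value, language, field):
                return True
    elif isinstance(mapping, list):
        return any(has_language_field(v, language, field) for v in mapping)
    return False


def fetch_mapping(api_url):
    """Download and decode a mapping dump (performs a network request)."""
    # The User-Agent value is a placeholder chosen for this sketch.
    request = Request(api_url, headers={"User-Agent": "mul-mapping-check/0.1 (example)"})
    with urlopen(request) as response:
        return json.load(response)


if __name__ == "__main__":
    query = "?action=cirrus-mapping-dump&format=json&formatversion=2"
    for host in ("https://www.wikidata.org", "https://wikidata.beta.wmflabs.org"):
        mapping = fetch_mapping(f"{host}/w/api.php{query}")
        print(host, "labels.mul present:", has_language_field(mapping))
```

Because the check is purely structural, it only indicates whether the field is declared in the mapping, not whether existing documents have been reindexed with it.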

Event Timeline

Dima_Koushha_WMDE renamed this task from Unexpected Behavior: Unable to search mul terms with InLabelSearch (Elasticsearch) to Unexpected Behavior: Unable to search mul terms with InLabelSearch (Elasticsearch) or wbsearchentities. Apr 16 2025, 12:13 PM
Dima_Koushha_WMDE renamed this task from Unexpected Behavior: Unable to search mul terms with InLabelSearch (Elasticsearch) or wbsearchentities to Unexpected Behavior: Unable to search mul terms with InLabelSearch (Elasticsearch).
Dima_Koushha_WMDE updated the task description.
Dima_Koushha_WMDE renamed this task from Unexpected Behavior: Unable to search mul terms with InLabelSearch (Elasticsearch) to Unexpected Behavior: Unable to search mul terms with `WikibaseEntitySearcher`. Apr 16 2025, 12:17 PM
Jakob_WMDE renamed this task from Unexpected Behavior: Unable to search mul terms with `WikibaseEntitySearcher` to Unexpected Behavior: Unable to find items by mul terms. Apr 17 2025, 10:32 AM
Jakob_WMDE updated the task description.
WMDE-leszek set the point value for this task to 2. Apr 24 2025, 10:08 AM
WMDE-leszek moved this task from Polished to Ready for planning on the Wikibase Reuse Team board.
WMDE-leszek changed the point value for this task from 2 to 3.

Change #1143056 had a related patch set uploaded (by Jakob; author: Jakob):

[mediawiki/extensions/EntitySchema@master] Remove labels/descriptions search fields config

https://gerrit.wikimedia.org/r/1143056

Change #1143780 had a related patch set uploaded (by Lucas Werkmeister (WMDE); author: Lucas Werkmeister (WMDE)):

[mediawiki/extensions/WikibaseCirrusSearch@master] Refine LabelsField+DescriptionsField::merge()

https://gerrit.wikimedia.org/r/1143780

Change #1143780 merged by jenkins-bot:

[mediawiki/extensions/WikibaseCirrusSearch@master] Refine LabelsField+DescriptionsField::merge()

https://gerrit.wikimedia.org/r/1143780

Mentioned in SAL (#wikimedia-releng) [2025-05-13T12:35:11Z] <dcausse> deployment-prep: reindexing wikidata to pickup the "mul" language field (T392058)

Change #1143056 abandoned by Silvan Heintze:

[mediawiki/extensions/EntitySchema@master] Remove labels/descriptions search fields config

Reason:

in favour of I045fba0ed63338f63be0c6c4be124e313bc0b0d5

https://gerrit.wikimedia.org/r/1143056