Search for L7 shows incomplete drop-down box
Closed, Resolved (Public)

Description

On Wikidata.org, search for a Lexeme by its LID, e.g. L7.

Problem:
The drop-down box shows two results: literally "L7" (instead of "cat, English noun (L7)") and L733385 (brusić).
Why does it not show the lemma for L7? What about L73?

Example:
L7

Screenshots/mockups:

Screenshot 2025-09-24 at 16.33.42.png (257×523 px, 29 KB)

Acceptance criteria:
* Show the lemma etc. for L7

Open questions:
Not sure what to do about the other Lexemes whose IDs start with L7. Do we really want to show any of those? If yes, we probably want a better selection than just L733385.

Event Timeline

Additional info: when testing this we also saw issues for Items, not just Lexemes.

This is supported by the "classic" search APIs, e.g. https://www.wikidata.org/w/api.php?action=opensearch&search=L7&namespace=146, but I understand that we also want this behavior in wbsearchentities and other Wikibase-specific prefix search APIs.
This should not be hard to do, but I wonder whether there was a specific reason not to include those in the first place.
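
For comparison, the two request shapes mentioned above can be sketched as follows. This only builds the URLs and sends nothing. The parameter names are those of the MediaWiki `opensearch` module and the Wikibase `wbsearchentities` module; the helper names `opensearch_url` and `wbsearchentities_url` are made up for illustration, and the namespace number is simply copied from the URL above.

```python
from urllib.parse import urlencode

API = "https://www.wikidata.org/w/api.php"


def opensearch_url(term, namespace=146):
    """Build a "classic" prefix-search URL (hypothetical helper).

    namespace=146 is taken from the example URL in the comment above.
    """
    params = {
        "action": "opensearch",
        "search": term,
        "namespace": namespace,
        "format": "json",
    }
    return API + "?" + urlencode(params)


def wbsearchentities_url(term, language="en", entity_type="lexeme"):
    """Build a Wikibase entity-search URL (hypothetical helper)."""
    params = {
        "action": "wbsearchentities",
        "search": term,
        "language": language,
        "type": entity_type,
        "format": "json",
    }
    return API + "?" + urlencode(params)


print(opensearch_url("L7"))
print(wbsearchentities_url("L7"))
```

Fetching both URLs for the same term is one way to compare what each API returns for an ID-like query.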

The fact that L733385 is shown is, I think, a data issue: the text of the L733385-F1 form is the literal string L733385.
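
To illustrate why such a form would surface for the query "L7", here is a toy prefix match over lemma and form texts. It is a sketch under assumptions, not the search engine's actual matching logic: `prefix_matches` and the data layout are hypothetical, and the forms listed for L7 are invented sample data.

```python
def prefix_matches(term, lexemes):
    """Return IDs of lexemes that have a lemma or form text starting with `term`.

    Hypothetical helper: a plain case-insensitive prefix test over text,
    ignoring entity IDs entirely.
    """
    needle = term.lower()
    hits = []
    for lexeme in lexemes:
        texts = lexeme["lemmas"] + lexeme["forms"]
        if any(text.lower().startswith(needle) for text in texts):
            hits.append(lexeme["id"])
    return hits


# Toy data. The form text "L733385" mirrors the data issue described above;
# the forms of L7 are assumptions made for this example.
lexemes = [
    {"id": "L7", "lemmas": ["cat"], "forms": ["cat", "cats"]},
    {"id": "L733385", "lemmas": ["brusić"], "forms": ["L733385"]},
]

print(prefix_matches("L7", lexemes))  # ['L733385']
```

In this sketch L7 itself is not a text match at all; it only shows up in the real results through the separate entity-ID match.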

Sorry, I think I misunderstood the bug report.

  • When matching the entity/lexeme ID
    • Searching items: e.g. Q2 -> Earth is displayed, @Lydia_Pintscher have you seen cases where only the QID was displayed?
    • Searching lexemes: e.g. L7 -> only L7 is shown but we want cat (L7) - English noun to be shown
  • Searching for L7 showing L733385 is a data issue
  • Open question: do we want to augment the search results with item/lexemes whose ID starts with the searched term?

  • Searching items: e.g. Q2 -> Earth is displayed, @Lydia_Pintscher have you seen cases where only the QID was displayed?

Yes. I saw it when Denny was sharing his screen. It also later worked fine for the same search. I was thinking there is some timing issue, getting responses from different servers or something like that.
But maybe those are unrelated issues?

Possibly. In the last couple of days (right after the switch-over) we experienced serious performance issues that caused many queries to fail. It is possible that in those cases the fallback, which uses a simple database lookup, did not augment its results with the item's label.

Gehel subscribed.

Tagging Wikidata-Query-Service to give visibility to the Wikidata Platform team.

Needs some investigation by search team

BTracy-WMF triaged this task as Medium priority. (Oct 6 2025, 9:17 PM)
BTracy-WMF moved this task from Incoming to Operations/SRE on the Wikidata-Query-Service board.
TJones renamed this task from Search for L7 has shows incomplete drop-down box to Search for L7 shows incomplete drop-down box. (Oct 20 2025, 3:31 PM)

Change #1199235 had a related patch set uploaded (by DCausse; author: DCausse):

[mediawiki/extensions/WikibaseLexemeCirrusSearch@master] Prefer search engine results format when matching lexeme ids

https://gerrit.wikimedia.org/r/1199235

I believe the issue is that we generally prefer EntityIdSearchHelper when matching lexeme IDs, and this search helper does not have any customization for lexemes. A simple approach is to prefer the Cirrus version of the hits, which contains the expected set of metadata. This might not fully solve the issue when a Lexeme ID is searched right after being created (that is, before the search engine is updated), but this is perhaps good enough for now?
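
The idea of preferring the search-engine version of a hit can be sketched roughly as below. This is not the code from the patch (which lives in the PHP extension); `merge_hits` and the shape of the hit dictionaries are hypothetical and only illustrate the precedence rule.

```python
def merge_hits(id_lookup_hits, cirrus_hits):
    """Merge two result lists by entity ID, preferring the Cirrus hit on conflict.

    Hypothetical helper: ID-lookup hits are kept only for entities the
    search engine did not return (e.g. not indexed yet).
    """
    merged = {}
    for hit in id_lookup_hits:
        merged[hit["id"]] = hit
    for hit in cirrus_hits:
        # Same ID seen again: the richer search-engine hit replaces the bare one.
        merged[hit["id"]] = hit
    return list(merged.values())


# Illustrative data only.
id_lookup_hits = [{"id": "L7", "label": "L7"}]
cirrus_hits = [
    {"id": "L7", "label": "cat", "description": "English, noun"},
    {"id": "L733385", "label": "brusić"},
]

for hit in merge_hits(id_lookup_hits, cirrus_hits):
    print(hit["id"], hit["label"])
```

When the search engine has no hit for the ID yet, the bare ID-lookup hit is all that remains, which is the freshly-created-lexeme case mentioned above.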

but this is perhaps good enough for now

Yeah I think so. And maybe we can have a ticket for adapting the EntityIdSearchHelper for Lexemes? (I'd do it but I don't understand enough of the details to create a meaningful ticket.)

Sure, I'll file one as soon as this fix goes live.

Change #1199235 merged by jenkins-bot:

[mediawiki/extensions/WikibaseLexemeCirrusSearch@master] Prefer search engine results format when matching lexeme ids

https://gerrit.wikimedia.org/r/1199235

dcausse closed this task as Resolved. (Nov 6 2025, 8:13 AM)

I think this is now fixed, the behavior of items and lexemes should be the same.
The API response looks like this now (on L7 when searching for L7):

{
    "id": "L7",
    "title": "Lexeme:L7",
    "pageid": 54387119,
    "concepturi": "http://www.wikidata.org/entity/L7",
    "repository": "wikidata",
    "url": "//www.wikidata.org/wiki/Lexeme:L7",
    "display": {
        "label": {
            "value": "cat",
            "language": "en"
        },
        "description": {
            "value": "English, noun",
            "language": "en"
        }
    },
    "label": "cat",
    "description": "English, noun",
    "match": {
        "type": "entityId",
        "text": "L7"
    },
    "aliases": [
        "L7"
    ]
}

Note that when searching for L7 the data issue is still present: L733385 is still returned because its form with the literal text L733385 still exists.
I filed T409397 as a follow-up to fix the same issue in EntityIdSearchHelper (which is only used when the search index is not up to date).
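
As a rough illustration of how a client could turn such a result into drop-down text, here is a small sketch. The function name `dropdown_text` and the exact output format are assumptions; the real UI rendering may differ.

```python
def dropdown_text(hit):
    """Build a one-line display string from a search hit (hypothetical helper).

    Falls back to the bare ID when no display label is present, which is
    what the original bug looked like for L7.
    """
    display = hit.get("display", {})
    label = display.get("label", {}).get("value")
    description = display.get("description", {}).get("value")
    if label is None:
        return hit["id"]
    if description:
        return f"{label}, {description} ({hit['id']})"
    return f"{label} ({hit['id']})"


# Trimmed copy of the response entry shown above.
hit = {
    "id": "L7",
    "display": {
        "label": {"value": "cat", "language": "en"},
        "description": {"value": "English, noun", "language": "en"},
    },
}

print(dropdown_text(hit))           # cat, English, noun (L7)
print(dropdown_text({"id": "L7"}))  # L7
```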