
Implement match for any-language label (haslabel:*)
Closed, Resolved · Public


From the user request:

It is currently not possible to find all items that have no labels. With haslabel, for example haslabel:en, one can only find items that have a label in one specific language.

Use case

As a user, I'd like to only add labels to items that don't have any at all.

As a user, I'd like to see the total number of items that do have labels.

As a user, for tools like Extension:WikibaseMediaInfo, I'd like to filter only items within a category without labels.

It seems like a natural extension of the current functionality.
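As a rough sketch of how the use cases above could map to searches, the snippet below builds Special:Search URLs for "has a label in any language" and its negation. The base URL and the `search_url` helper are assumptions made for illustration; the `title`, `ns0` and `cirrusDumpQuery` parameters are the ones that appear later in this task, and the leading `-` is the usual way to negate a search keyword.

```python
from urllib.parse import urlencode

# Assumed endpoint for illustration; any wiki running WikibaseCirrusSearch would do.
BASE_URL = "https://www.wikidata.org/w/index.php"


def search_url(query: str, dump_query: bool = False) -> str:
    """Build a Special:Search URL for the given search string (hypothetical helper)."""
    params = {"search": query, "title": "Special:Search", "ns0": "1"}
    if dump_query:
        # Ask CirrusSearch to show the generated query instead of results.
        params["cirrusDumpQuery"] = "yes"
    return f"{BASE_URL}?{urlencode(params)}"


# Items that have a label in at least one language.
items_with_any_label = search_url("haslabel:*")

# Items that have no label at all (negated keyword).
items_without_labels = search_url("-haslabel:*")

print(items_with_any_label)
print(items_without_labels)
```

Combining `-haslabel:*` with other filters (for example a category filter) would cover the third use case, assuming those filters are available on the wiki in question.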

Event Timeline

Restricted Application added a subscriber: Aklapper. · May 29 2019, 6:07 PM
EBernhardson triaged this task as Medium priority. · May 30 2019, 3:59 PM
EBernhardson moved this task from needs triage to Wikidata Search on the Discovery-Search board.
EBernhardson added a subscriber: EBernhardson.

Seems reasonable enough; this should be relatively easy to implement.

Matching via individual languages might create a pretty big query, though... We could match against labels_all, but that would not work for descriptions.

Certainly matching an _all field is the only thing reasonably performant here. We could create a descriptions_all if that's needed.
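To illustrate the size difference discussed above, here is a minimal sketch that builds both query shapes as Elasticsearch query DSL dictionaries. It assumes per-language label fields are named `labels.<lang>` and that an `exists` query is an acceptable way to test for presence; neither is confirmed in this task, and the function names are hypothetical. Only `labels_all` comes from the discussion itself.

```python
def per_language_query(languages):
    """One `exists` clause per language: the query grows with the language list.

    Assumes per-language fields are named `labels.<lang>` (not confirmed here).
    """
    return {
        "bool": {
            "should": [{"exists": {"field": f"labels.{lang}"}} for lang in languages],
            "minimum_should_match": 1,
        }
    }


def all_field_query(field="labels_all"):
    """A single `exists` clause against the combined field, regardless of language count."""
    return {"exists": {"field": field}}


big = per_language_query(["en", "de", "fr"])
small = all_field_query()

print(len(big["bool"]["should"]))  # one clause per language
print(small)
```

With several hundred label languages the first form would carry several hundred clauses, while the second stays at one. The same helper could be pointed at a different combined field by passing its name as `field`.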

Change 514415 had a related patch set uploaded (by Smalyshev; owner: Smalyshev):
[mediawiki/extensions/WikibaseCirrusSearch@master] Implement haslabel:*

Smalyshev claimed this task. · Jun 5 2019, 1:33 AM
Smalyshev moved this task from Incoming to Needs review on the Discovery-Search (Current work) board.
Smalyshev added a project: User-Smalyshev.
Smalyshev moved this task from Backlog to In review on the User-Smalyshev board.

Change 514415 merged by jenkins-bot:
[mediawiki/extensions/WikibaseCirrusSearch@master] Implement haslabel:*

Smalyshev moved this task from In review to Done on the User-Smalyshev board. · Jun 7 2019, 1:08 AM
debt closed this task as Resolved. · Jun 21 2019, 2:19 PM
Smalyshev reopened this task as Open. · Jun 26 2019, 6:12 AM

Somehow doesn't seem to work...

The query:*&title=Special%3ASearch&go=Go&ns0=1&ns6=1&ns12=1&ns14=1&ns100=1&ns106=1&cirrusDumpQuery=yes

seems to be OK but returns no results. Negative search works, though. I wonder what the reason could be?

Interestingly enough, it seems to work on Wikidata but not Commons. I wonder why?

dcausse added a comment. · Edited · Jun 26 2019, 6:56 AM

I have no clue yet why it's not working, but I think we should have indexed the captions to description fields, not label fields. Label fields are optimized for exact matches, which is not suited to captions.

Right now from what I understand we're indexing them as label fields since they are recorded as labels. Should this change? Data model lists MediaInfo as having both labels and descriptions, though I am not sure what this means.

@Cparle could you shed some light on this?

Yeah, I think they're being indexed as label fields, and also written into opening_text. See,_England.jpg?action=cirrusDump

Created T226722 to fix this.

Smalyshev renamed this task from Consider adding haslabel:all to Implement match for any-language label (haslabel:*). · Jun 27 2019, 8:42 PM

Change 519514 had a related patch set uploaded (by Smalyshev; owner: Smalyshev):
[mediawiki/extensions/WikibaseCirrusSearch@master] Fix haslabel:* to use labels_all.plain field

Change 519514 merged by jenkins-bot:
[mediawiki/extensions/WikibaseCirrusSearch@master] Fix haslabel:* to use labels_all.plain field

debt closed this task as Resolved. · Jul 3 2019, 4:23 PM

Works on test.wikidata, needs to be verified on main production after T227136: Reindexing search index wikidatawiki for eqiad fails is fixed.