Store wikibase statement qualifiers in cirrus search index
Closed, Resolved, Public

Description

Write statements with their qualifiers into the cirrus search index.

Each statement should be stored once without qualifiers, and then once more for each of its qualifiers.

For example, on Wikidata the Mona Lisa has the statement 'depicts woman' (P180=Q467) and the statement 'depicts sky' (P180=Q527), the latter with the qualifiers 'applies to part background' (P518=Q13217555) and 'color green' (P462=Q3133).

We propose to store these in statement_keywords as follows:

'statement_keywords' : [
    'P180=Q467',
    'P180=Q527',
    'P180=Q527[P518=Q13217555]',
    'P180=Q527[P462=Q3133]',
]

In this way the user will be able to search using the haswbstatement keyword. If they want to find items with 'depicts sky' they can search for haswbstatement:P180=Q527, whereas if they want to find items that depict a green sky they can search for haswbstatement:P180=Q527[P462=Q3133].
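The storage scheme above can be sketched roughly as follows. This is only an illustration in Python, not the code from the Wikibase patch: the `statement_keywords` function, the dictionary layout of a statement, and the `has_statement` helper are all hypothetical, and the exact-match lookup is a simplified stand-in for whatever haswbstatement does against the index.

```python
def statement_keywords(statements):
    """Build one keyword per statement, plus one per (statement, qualifier) pair."""
    keywords = []
    for stmt in statements:
        # The statement on its own, e.g. 'P180=Q527'.
        base = f"{stmt['property']}={stmt['value']}"
        keywords.append(base)
        # The statement again, once for each qualifier, e.g. 'P180=Q527[P462=Q3133]'.
        for q_prop, q_value in stmt.get("qualifiers", []):
            keywords.append(f"{base}[{q_prop}={q_value}]")
    return keywords


def has_statement(keywords, query):
    """Simplified stand-in for a haswbstatement lookup: exact match on a keyword."""
    return query in keywords


# Hypothetical input layout for the Mona Lisa example from the description.
mona_lisa = [
    {"property": "P180", "value": "Q467"},
    {
        "property": "P180",
        "value": "Q527",
        "qualifiers": [("P518", "Q13217555"), ("P462", "Q3133")],
    },
]

keywords = statement_keywords(mona_lisa)
```

Because the unqualified form is always stored, a search for P180=Q527 still matches an item whose 'depicts sky' statement carries qualifiers.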

Event Timeline

Cparle triaged this task as Normal priority. Apr 30 2018, 3:45 PM
Cparle created this task.
Cparle updated the task description. (Show Details)

Change 430078 had a related patch set uploaded (by Cparle; owner: Cparle):
[mediawiki/extensions/Wikibase@master] Write wikibase statement qualifiers to the search index

https://gerrit.wikimedia.org/r/430078

The scheme looks fine, but I have a question about which qualifiers we index. There are options:

  1. Use the same config as the statement option does. The downside is that there may be qualifiers that we want to index that make no sense to index as statements, and vice versa.
  2. Index all qualifiers. After all, most statements only have one or two, so indexing all of them probably won't hurt. Of course, that is conditional on the value type actually having an indexing callback.
  3. Have a separate config for qualifiers.

I slightly prefer option 2, but I am not set on it.
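To make option 2 concrete, here is a rough sketch of "index every qualifier whose value type has an indexing callback". It is an assumption-laden illustration in Python, not Wikibase code: the `INDEXERS` table, the datatype names used as keys, and the `indexable_qualifiers` function are all hypothetical.

```python
# Hypothetical table of per-datatype indexing callbacks. Each callback turns
# a qualifier value into the string that would be written to the index.
INDEXERS = {
    "wikibase-item": lambda value: value,
    "string": lambda value: value,
}


def indexable_qualifiers(qualifiers):
    """Keep every qualifier whose datatype has a callback; skip the rest."""
    result = []
    for prop, datatype, value in qualifiers:
        indexer = INDEXERS.get(datatype)
        if indexer is None:
            # No indexing callback for this value type, so it is not indexed.
            continue
        result.append((prop, indexer(value)))
    return result


# Illustrative input: one item-typed qualifier and one made-up
# quantity-typed qualifier that has no callback in the table above.
example = [
    ("P518", "wikibase-item", "Q13217555"),
    ("P999", "quantity", "+77"),
]
indexed = indexable_qualifiers(example)
```

With this shape no per-property configuration is needed; coverage is decided entirely by which value types have a callback.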

Cparle added a comment. May 3 2018, 2:34 PM

Option 2 sounds reasonable. I have uploaded a new version of the patch.

debt edited projects, added Discovery-Search (Current work); removed Discovery-Search.
debt moved this task from Current work to watching / waiting on the Discovery-Search board.
debt edited projects, added Discovery-Search; removed Discovery-Search (Current work).
Ramsey-WMF moved this task from Untriaged to Next up on the Multimedia board. May 7 2018, 5:00 PM

Change 430078 merged by jenkins-bot:
[mediawiki/extensions/Wikibase@master] Write wikibase statement qualifiers to the search index

https://gerrit.wikimedia.org/r/430078

Ladsgroup moved this task from incoming to monitoring on the Wikidata board. Jun 28 2018, 3:47 PM
Vvjjkkii renamed this task from Store wikibase statement qualifiers in cirrus search index to wydaaaaaaa. Jul 1 2018, 1:13 AM
Vvjjkkii removed Cparle as the assignee of this task.
Vvjjkkii raised the priority of this task from Normal to High.
Vvjjkkii updated the task description. (Show Details)
Vvjjkkii removed subscribers: gerritbot, Aklapper.
Yann renamed this task from wydaaaaaaa to Store wikibase statement qualifiers in cirrus search index. Jul 1 2018, 12:46 PM
Yann assigned this task to Cparle.
Yann lowered the priority of this task from High to Normal.
Yann updated the task description. (Show Details)
Yann added subscribers: gerritbot, Aklapper.

@EBernhardson I see you merged the patch here, but this task has remained open since late June. Is there further work to do? This is still in the code review column on our working board, so we should either close this task or move it into "next up" or "doing" as appropriate.

I think the next step now that it's deployed is to reindex.

OK, is the search team going to do that or should the Multimedia team keep this around as a to-do item?

Smalyshev added a comment. Edited Aug 3 2018, 6:19 PM

Yes, the Search Platform team will be responsible for the reindex. Note that the qualifier data should already be available on recently edited items, but not on items that haven't been edited since the patch was merged.

Smalyshev closed this task as Resolved. Sep 27 2018, 7:50 PM

Reindex is done.