Index all statements (without value) for all datatypes for haswbstatement
Closed, Resolved · Public · BUG REPORT

Description

Users should be able to use haswbstatement:Pxxx to find all entities with a Pxxx statement, regardless of which datatype Pxxx has. Currently only some datatypes are indexed.

Original:

https://commons.wikimedia.org/w/index.php?search=haswbstatement:P1259

There were no results matching the query.

What should have happened instead?:

Finds files with P1259 = any value.

see https://commons.wikimedia.org/wiki/Commons_talk:Structured_data#c-Bjh21-20240806173700-RZuo-20240806115800 for reference.
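For reference, the two query shapes involved (property only, and property with a value) can be sketched as a small URL builder. This is only an illustration: the helper name `build_haswbstatement_url` is made up here, and the `P180=Q146` pair in the usage note is an arbitrary example, not something taken from this report.

```python
from urllib.parse import urlencode

# Search endpoint taken from the URL in the report above.
COMMONS_SEARCH = "https://commons.wikimedia.org/w/index.php"


def build_haswbstatement_url(property_id, value=None, base=COMMONS_SEARCH):
    """Build a search URL for haswbstatement (hypothetical helper).

    property_id: a property ID such as "P1259".
    value: optional value; when omitted, the query asks for
           "any statement with this property", which is what this task is about.
    """
    term = f"haswbstatement:{property_id}"
    if value is not None:
        term += f"={value}"
    # Keep ':' and '=' readable instead of percent-encoding them.
    return f"{base}?{urlencode({'search': term}, safe=':=')}"


if __name__ == "__main__":
    print(build_haswbstatement_url("P1259"))
```

Calling `build_haswbstatement_url("P1259")` reproduces the URL from the report; passing a value, e.g. `build_haswbstatement_url("P180", "Q146")`, gives the property-with-value form.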

Event Timeline

Restricted Application added a subscriber: Aklapper.

This is because P1259 has coord datatype which is not indexed in CirrusSearch.

Bugreporter renamed this task from "no results matching the query haswbstatement:P1259" to "Index all statements (without value) for all datatypes for haswbstatement". Aug 6 2024, 11:58 PM
Gehel added subscribers: AUgolnikova-WMF, Gehel.

Tagging @AUgolnikova-WMF as product manager on SDoC so that she can prioritize this.

The implementation would need some analysis to ensure that we don't overload the search indices with high cardinality. This might be something to raise through the community wishlist.

For the cardinality bits, I did some analysis and found that P<n> values would only be ~20k items, which is <1% of the current cardinality. Once deployed, this will still take some time to be fully active: it has to go through a full rerender of Wikidata, which happens on a 16-week cycle. Deploying this week means that the feature will be fully available in the first week of January, but it will start displaying partial results immediately.
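The arithmetic behind those two estimates can be sketched as follows. The deploy date used in the usage note is a hypothetical placeholder chosen for illustration (the comment only says "this week"), and the function names are likewise made up.

```python
from datetime import date, timedelta

RERENDER_CYCLE_WEEKS = 16   # full rerender cycle, from the comment above
NEW_TERMS = 20_000          # ~20k P<n> values, from the comment above


def full_coverage_date(deploy_date):
    """Date by which a full rerender cycle has passed since deployment."""
    return deploy_date + timedelta(weeks=RERENDER_CYCLE_WEEKS)


def min_current_cardinality(new_terms=NEW_TERMS, max_percent=1):
    """Lower bound on the existing cardinality if new_terms is under max_percent of it."""
    return new_terms * 100 // max_percent


if __name__ == "__main__":
    # Hypothetical deploy date, NOT stated in the task.
    print(full_coverage_date(date(2024, 9, 11)))
    print(min_current_cardinality())
```

With a hypothetical deploy date of 11 September 2024, sixteen weeks later lands on 1 January 2025, which is consistent with "first week of January". Likewise, ~20k terms being under 1% implies an existing cardinality above roughly 2 million.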