Search of wikidata string property values using haswbstatement is case sensitive
Open, NormalPublic

Event Timeline

Mvolz created this task. Oct 10 2018, 10:40 AM
Mvolz triaged this task as Normal priority.
Restricted Application edited projects, added Discovery-Search; removed Discovery-Search (Current work). Oct 10 2018, 10:40 AM
EBjune moved this task from needs triage to Up Next on the Discovery-Search board. Oct 11 2018, 5:04 PM

@Smalyshev what do you think? I haven't run into this myself. My feeling is that case insensitive is probably better, but would that require a lot of work?

I don't think removing case sensitivity would take much manual work, but changing the index will require a reindex. I'm not sure why we decided to make it case-sensitive; I'll try to find out, and if there's no reason, we can change it. Note that the change would apply to all fields, so for properties where case does matter it may get things wrong.

As long as it's possible to get the original case from the API, you can remove false positives from a case-insensitive search by doing another call for each result and then comparing for equality. Whereas if a case-sensitive search returns no results, then it's very hard to get a result, as you have to try every case permutation: all lower, all upper, camel case, sentence case, title case or completely random :). That said, string values are now available from the general search, which does work, so maybe there isn't a need? E.g. https://www.wikidata.org/w/api.php?action=query&list=search&srsearch=10.1371/journal.PCBI.1002947 works.
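
To make the two workarounds above concrete, here is a rough sketch in Python. It only builds query URLs and filters data already fetched; it makes no network calls. The helper names (`case_variants`, `haswbstatement_url`, `exact_matches`) are made up for illustration, and the use of P356 (DOI) as the property is an assumption based on the value in the example URL.

```python
from urllib.parse import urlencode

API = "https://www.wikidata.org/w/api.php"


def case_variants(value):
    """Return a few common case variants of value, original first.

    This is only a small subset of all possible permutations; a value
    stored with unusual casing would still be missed.
    """
    candidates = [value, value.lower(), value.upper(), value.title()]
    variants = []
    for candidate in candidates:
        if candidate not in variants:  # keep order, drop duplicates
            variants.append(candidate)
    return variants


def haswbstatement_url(prop, value):
    """Build a search URL for one property/value pair (hypothetical helper)."""
    params = {
        "action": "query",
        "list": "search",
        "format": "json",
        "srsearch": f"haswbstatement:{prop}={value}",
    }
    return f"{API}?{urlencode(params)}"


def exact_matches(stored_values, wanted):
    """Keep only items whose stored value equals wanted exactly.

    stored_values maps an item ID to the string values fetched for it in
    a follow-up call; this is the false-positive filter that would be
    needed if the search itself ignored case.
    """
    return [qid for qid, values in stored_values.items() if wanted in values]


if __name__ == "__main__":
    doi = "10.1371/journal.PCBI.1002947"
    # Workaround for a case-sensitive search: one query per variant.
    for variant in case_variants(doi):
        print(haswbstatement_url("P356", variant))  # P356 assumed to be DOI
```

The asymmetry shows up in the amount of work: `exact_matches` is a single pass over results that already exist, while `case_variants` can only ever guess at a handful of spellings.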