
Querying the URL datatype with haswbstatement
Closed, Resolved · Public

Description

haswbstatement currently supports datatypes such as external identifier and string but not the URL datatype.

It would be very useful if one could search for URLs in the following manner:
https://www.wikidata.org/w/api.php?action=query&list=search&srsearch=haswbstatement:P856=http://www.nationalmuseum.se/
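A minimal sketch of building the same request programmatically, assuming Python's standard library. The helper name `build_haswbstatement_url` is hypothetical, and the `format=json` parameter is an addition not present in the example URL above. The point of the sketch is that the URL value contains characters such as `:` and `/` that should be percent-encoded when they are placed inside the `srsearch` parameter.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

API_ENDPOINT = "https://www.wikidata.org/w/api.php"


def build_haswbstatement_url(property_id, value):
    """Build a search API URL for a haswbstatement:<property>=<value> query.

    The helper name is hypothetical; format=json is an assumption added
    so the response is machine-readable.
    """
    params = {
        "action": "query",
        "list": "search",
        "srsearch": f"haswbstatement:{property_id}={value}",
        "format": "json",
    }
    # urlencode percent-encodes the ':' and '/' characters inside the value.
    return f"{API_ENDPOINT}?{urlencode(params)}"


url = build_haswbstatement_url("P856", "http://www.nationalmuseum.se/")
print(url)

# Decoding the query string recovers the original search expression.
decoded = parse_qs(urlsplit(url).query)
print(decoded["srsearch"][0])
```

The request itself is not sent here; any HTTP client could fetch the printed URL.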

Event Timeline

Abbe98 created this task. Jan 25 2020, 7:06 PM
Restricted Application added a subscriber: Aklapper. Jan 25 2020, 7:06 PM

Wikidata property types have to be explicitly whitelisted before their statements are indexed. I poked through our data, and the field that currently holds this data has ~241M unique values. Adding the URLs should only increase that by 1-2M, so it is probably fine. It is not entirely free, since these 241M unique values have to stay resident in memory multiple times, but adding about 1% isn't going to change anything.
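As a rough check of the estimate above, the relative growth can be worked out directly from the quoted figures. This is only a sketch; the function name `relative_growth` is hypothetical, and the inputs are the approximate counts from the comment, not measured values.

```python
def relative_growth(added, existing):
    """Return the percentage increase from adding `added` values to `existing`."""
    return added / existing * 100


EXISTING_VALUES = 241_000_000  # approximate unique values quoted above

for added in (1_000_000, 2_000_000):
    pct = relative_growth(added, EXISTING_VALUES)
    print(f"{added:,} new values -> {pct:.2f}% growth")
```

Both ends of the 1-2M range come out below 1%, which is consistent with treating the increase as roughly 1% at most.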

As for actually rolling this out, it will take between 2 and 3 months after the config change ships for the Wikidata index to be fully populated with URLs.

Change 577312 had a related patch set uploaded (by EBernhardson; owner: EBernhardson):
[operations/mediawiki-config@master] Whitelist urls for inclusion in wikidata statements indexed to search

https://gerrit.wikimedia.org/r/577312

Change 577312 merged by jenkins-bot:
[operations/mediawiki-config@master] Whitelist urls for inclusion in wikidata statements indexed to search

https://gerrit.wikimedia.org/r/577312

Forced a reindex on Q842858; it is now returned by the example query. As mentioned, it will be two to three months before the URLs are indexed for all Wikidata items.

Abbe98 closed this task as Resolved. Mar 6 2020, 4:18 PM
Abbe98 claimed this task.
Abbe98 removed Abbe98 as the assignee of this task.