Allow alternative pagination options for LDF endpoint
Closed, ResolvedPublic

Description

Currently, if you query the Linked Data Fragments endpoint, e.g. https://query.wikidata.org/bigdata/ldf?subject=&predicate=http%3A%2F%2Fwww.wikidata.org%2Fprop%2Fdirect%2FP527&object= you are limited to 100 results per page. This is a problem for particularly large result sets. For example, if you are trying to get a list of all uses of DOI (P356), there are about 13 million results, so fetching the entire result set would require around 130,000 GET requests.

Ideally it would be possible to get the entire result set in one request, but failing that, allowing even 1000 or 10000 results per page would be a big improvement.
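
To make the numbers above concrete, here is a minimal sketch of how the request count changes with page size. The total of 13 million is the approximate figure quoted above; the function name `requests_needed` is made up for this illustration and is not part of any LDF client.

```python
def requests_needed(total_results: int, page_size: int) -> int:
    """Number of GET requests needed to page through a result set (ceiling division)."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    return -(-total_results // page_size)


# Approximate number of DOI (P356) statements quoted in the description.
TOTAL_RESULTS = 13_000_000

for page_size in (100, 1_000, 10_000):
    print(page_size, requests_needed(TOTAL_RESULTS, page_size))
```

With the current limit of 100 this gives 130,000 requests; a limit of 1000 or 10000 would bring that down to 13,000 or 1,300.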

Event Timeline

Restricted Application added a subscriber: Aklapper.

I don’t think getting the entire result will be possible (I think I remember problems in the caching layer with requests that keep pumping out data for a long time?), but since LDF queries are cheap, allowing a larger limit should be feasible, I assume.

Vvjjkkii renamed this task from Allow alternative pagination options for LDF endpoint to fhcaaaaaaa. Jul 1 2018, 1:08 AM
Vvjjkkii triaged this task as High priority.
Vvjjkkii updated the task description.
Vvjjkkii removed a subscriber: Aklapper.
CommunityTechBot renamed this task from fhcaaaaaaa to Allow alternative pagination options for LDF endpoint. Jul 2 2018, 3:57 PM
CommunityTechBot raised the priority of this task from High to Needs Triage.
CommunityTechBot updated the task description.
CommunityTechBot added a subscriber: Aklapper.
Smalyshev triaged this task as Medium priority. Oct 12 2018, 8:55 PM

I don't think an LDF request is a proper way to get the full result. Currently one page requires 27 seconds to load, so we would need 53 days (if the requests are not run in parallel) to get all results, and the result set would change significantly in the interval.
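
The estimate above is a plain multiplication of page count by per-page latency. A rough sketch, assuming the roughly 130,000 pages from the task description (the comment does not say which page count it used) and strictly sequential requests; `sequential_days` is a name made up for this illustration:

```python
SECONDS_PER_DAY = 86_400


def sequential_days(pages: int, seconds_per_page: float) -> float:
    """Wall-clock days needed to fetch all pages one after another."""
    return pages * seconds_per_page / SECONDS_PER_DAY


# Assumed inputs: ~130,000 pages (from the description), 27 s per page (from this comment).
print(f"{sequential_days(130_000, 27):.1f} days")
```

With these inputs the result is about 41 days, the same order of magnitude as the 53 days quoted, so the conclusion stands either way: the data would change substantially before a sequential crawl finished.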

Probably the only way to get the result set is T46581: Partial Wikidata dumps.

Currently one page requires 27 seconds to load

That's the problem that needs to be fixed.

BTracy-WMF claimed this task.
BTracy-WMF subscribed.

Closing following the deprecation of the LDF endpoint (T415696)