
[Bug] SPARQL sub queries avoid timeouts
Open, Medium, Public

Description

This query works:

SELECT DISTINCT ?id ?label ?description WHERE {
  {
    SELECT ?id WHERE { ?i wdt:P31 ?id. }
  }
  ?id rdfs:label ?label.
  ?id schema:description ?description.
  FILTER((LANG(?label)) = "en")
  FILTER((LANG(?description)) = "en")
}
LIMIT 20

but this query times out:

SELECT DISTINCT ?id ?label ?description WHERE {
  ?i wdt:P31 ?id. 
  ?id rdfs:label ?label.
  ?id schema:description ?description.
  FILTER((LANG(?label)) = "en")
  FILTER((LANG(?description)) = "en")
}
LIMIT 20

As far as I understand, these two queries are semantically the same, so it may be a bug that a subquery is needed to avoid the timeout.

Event Timeline

Restricted Application added a subscriber: Aklapper.

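They are not exactly equivalent: the subselect forces the subquery to be evaluated first, because SPARQL evaluates nested queries innermost first. Its results are then joined with the outer patterns, so the optimizer cannot reorder the joins across the subquery boundary. That can change the execution plan the optimizer picks.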
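
If the endpoint is Blazegraph-backed (an assumption here: the task does not name the engine, although the Wikidata Query Service ran on Blazegraph when this was filed), the same join order can be requested without a subquery by turning off the join-order optimizer with a query hint. A minimal sketch, not verified against the timeout reported above:

```sparql
# Assumes a Blazegraph endpoint where the hint: prefix is predefined,
# as on the Wikidata Query Service. Elsewhere, declare it explicitly:
# PREFIX hint: <http://www.bigdata.com/queryHints#>
SELECT DISTINCT ?id ?label ?description WHERE {
  # Disable join reordering so the patterns below run in the order written.
  hint:Query hint:optimizer "None".
  ?i wdt:P31 ?id.
  ?id rdfs:label ?label.
  ?id schema:description ?description.
  FILTER((LANG(?label)) = "en")
  FILTER((LANG(?description)) = "en")
}
LIMIT 20
```

Whether this avoids the timeout depends on the data and the engine version. It only removes the optimizer's freedom to reorder the joins, which is what the subselect achieves as a side effect.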

Smalyshev triaged this task as Medium priority. Feb 15 2017, 1:55 AM