
SPARQL queries using VALUES time out
Open, Medium, Public

Description

I notice that queries using VALUES, like

SELECT ?item ?item2 ?inv WHERE {
  VALUES ?collection {wd:Q28045665 wd:Q28045660 wd:Q28045674 wd:Q2066737 wd:Q18600731 } . #  The different collections
  ?item p:P217 ?inv1statement .
  ?inv1statement ps:P217 ?inv .
  ?inv1statement pq:P195 ?collection . 
  ?item2 p:P217 ?inv2statement .
  ?inv2statement ps:P217 ?inv .
  ?inv2statement pq:P195 ?collection . 
  FILTER(?item!=?item2)

  } LIMIT 500

often time out.

If you unpack it into a bunch of UNIONs, it completes in less than a second:

SELECT ?item ?item2 ?inv WHERE {
{ ?inv1statement pq:P195 wd:Q28045665 . } union
{ ?inv1statement pq:P195 wd:Q28045660 . } union
{ ?inv1statement pq:P195 wd:Q28045674 . } union
{ ?inv1statement pq:P195 wd:Q2066737 . } union
{ ?inv1statement pq:P195 wd:Q18600731 . }

  ?item p:P217 ?inv1statement .
  ?inv1statement ps:P217 ?inv .

{ ?inv2statement pq:P195 wd:Q28045665 . } union
{ ?inv2statement pq:P195 wd:Q28045660 . } union
{ ?inv2statement pq:P195 wd:Q28045674 . } union
{ ?inv2statement pq:P195 wd:Q2066737 . } union
{ ?inv2statement pq:P195 wd:Q18600731 . }

  ?item2 p:P217 ?inv2statement .
  ?inv2statement ps:P217 ?inv .
  FILTER(?item!=?item2)

  } LIMIT 500
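
Writing the UNION branches out by hand is error-prone, so here is a minimal sketch of generating the unpacked query from the list of collections. The helper names (`union_block`, `build_union_query`) are hypothetical and not part of WDQS or Blazegraph; the script only builds the query text and does not send it anywhere.

```python
# Hypothetical helper for illustration only; not part of WDQS or Blazegraph.
# It rebuilds the unpacked (UNION) form of the query from a list of collections.

COLLECTIONS = [
    "wd:Q28045665", "wd:Q28045660", "wd:Q28045674",
    "wd:Q2066737", "wd:Q18600731",
]


def union_block(subject_var, predicate, values):
    """Return one '{ s p v . }' branch per value, joined with 'union'."""
    branches = ["{ %s %s %s . }" % (subject_var, predicate, v) for v in values]
    return " union\n".join(branches)


def build_union_query(collections, limit=500):
    """Assemble the unpacked form of the inventory-number query."""
    parts = [
        "SELECT ?item ?item2 ?inv WHERE {",
        union_block("?inv1statement", "pq:P195", collections),
        "",
        "  ?item p:P217 ?inv1statement .",
        "  ?inv1statement ps:P217 ?inv .",
        "",
        union_block("?inv2statement", "pq:P195", collections),
        "",
        "  ?item2 p:P217 ?inv2statement .",
        "  ?inv2statement ps:P217 ?inv .",
        "  FILTER(?item!=?item2)",
        "",
        "} LIMIT %d" % limit,
    ]
    return "\n".join(parts)


if __name__ == "__main__":
    print(build_union_query(COLLECTIONS))
```

Note that the unpacked form is slightly looser than the VALUES form: the VALUES query binds a single ?collection shared by both statements, while the two UNION blocks are independent, so ?item and ?item2 may come from different collections in the list.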

This is something the optimizer should handle. Optimizer output at https://query.wikidata.org/bigdata/namespace/wdq/sparql?explain&query=SELECT%20%3Fitem%20%3Fitem2%20%3Finv%20WHERE%20{%0A%20%20VALUES%20%3Fcollection%20{wd%3AQ28045665%20wd%3AQ28045660%20wd%3AQ28045674%20wd%3AQ2066737%20wd%3AQ18600731%20}%20.%20%23%20%20The%20different%20collections%0A%20%20%3Fitem%20p%3AP217%20%3Finv1statement%20.%0A%20%20%3Finv1statement%20ps%3AP217%20%3Finv%20.%0A%20%20%3Finv1statement%20pq%3AP195%20%3Fcollection%20.%20%0A%20%20%3Fitem2%20p%3AP217%20%3Finv2statement%20.%0A%20%20%3Finv2statement%20ps%3AP217%20%3Finv%20.%0A%20%20%3Finv2statement%20pq%3AP195%20%3Fcollection%20.%20%0A%20%20FILTER(%3Fitem!%3D%3Fitem2)%0A%0A%20%20}%20LIMIT%20500%0A

Event Timeline


Yeah, I notice that too. Workaround: use UNION ;)

Smalyshev triaged this task as Medium priority. Feb 15 2017, 1:52 AM

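Recommended way to rewrite it (disable the optimizer, order the triple patterns by hand, and move VALUES to the end of the query):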

SELECT ?item ?item2 ?inv WHERE {
  
  hint:Query hint:optimizer "None" .
  
  ?inv1statement pq:P195 ?collection . 
  ?inv1statement ps:P217 ?inv .
  ?inv2statement ps:P217 ?inv .
  ?item p:P217 ?inv1statement .
  ?item2 p:P217 ?inv2statement .
  FILTER(?item!=?item2)
  ?inv2statement pq:P195 ?collection . 

} LIMIT 500
VALUES ?collection {wd:Q28045665 wd:Q28045660 wd:Q28045674 wd:Q2066737 wd:Q18600731 }

Another query that is broken:

SELECT ?item ?sitelink WHERE {
#  hint:Query hint:optimizer "None".
  VALUES ?item { wd:Q1 wd:Q2 }
  ?sitelink schema:about ?item .
  ?sitelink schema:isPartOf/wikibase:wikiGroup "wikipedia" .
}