
query causes Stack Overflow
Closed, Duplicate | Public | Bug Report

Description

The following query (https://w.wiki/7Tep) counts the prevalence of location properties for rivers:

PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
select
  (count(*) as ?rivers)
  (sum(?country          ) as ?country          )
  (sum(?continent        ) as ?continent        )
  (sum(?adminEntity      ) as ?adminEntity      )
  (sum(?location         ) as ?location         )
  (sum(?physFeature      ) as ?physFeature      )
  (sum(?significantPlace ) as ?significantPlace )
  (sum(?statisticalEntity) as ?statisticalEntity)
{
  {select distinct ?river {?river wdt:P31/wdt:P279* wd:Q4022}} # instance of any subclass of: river
  bind(if(exists{?river wdt:P17   []},1,0) as ?country          )
  bind(if(exists{?river wdt:P30   []},1,0) as ?continent        )
  bind(if(exists{?river wdt:P131  []},1,0) as ?adminEntity      )
  bind(if(exists{?river wdt:P276  []},1,0) as ?location         )
  bind(if(exists{?river wdt:P706  []},1,0) as ?physFeature      )
  bind(if(exists{?river wdt:P7153 []},1,0) as ?significantPlace )
  bind(if(exists{?river wdt:P8138 []},1,0) as ?statisticalEntity)
}

It causes a stack overflow in Blazegraph. The stack trace is too long to cite (duh), but the culprit seems to be:

com.bigdata.rdf.sparql.ast.StaticAnalysis.collectVarsFromExpressions(StaticAnalysis.java:2101)

The same query runs OK on Ontotext GraphDB at http://162.55.95.184:7400/sparql (this is an up-to-date copy, but only of wdtruthy):

[Attached screenshot: image.png]

Event Timeline

You need to change your query so it doesn’t use the same variable name before and after grouping. (sum(?country) as ?country) is not allowed according to the spec; I suggest renaming the variables with underscores, like this:

PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
select
  (count(*) as ?rivers)
  (sum(?country_          ) as ?country          )
  (sum(?continent_        ) as ?continent        )
  (sum(?adminEntity_      ) as ?adminEntity      )
  (sum(?location_         ) as ?location         )
  (sum(?physFeature_      ) as ?physFeature      )
  (sum(?significantPlace_ ) as ?significantPlace )
  (sum(?statisticalEntity_) as ?statisticalEntity)
{
  {select distinct ?river {?river wdt:P31/wdt:P279* wd:Q4022}} # instance of any subclass of: river
  bind(if(exists{?river wdt:P17   []},1,0) as ?country_          )
  bind(if(exists{?river wdt:P30   []},1,0) as ?continent_        )
  bind(if(exists{?river wdt:P131  []},1,0) as ?adminEntity_      )
  bind(if(exists{?river wdt:P276  []},1,0) as ?location_         )
  bind(if(exists{?river wdt:P706  []},1,0) as ?physFeature_      )
  bind(if(exists{?river wdt:P7153 []},1,0) as ?significantPlace_ )
  bind(if(exists{?river wdt:P8138 []},1,0) as ?statisticalEntity_)
}

Reading through the spec:

  • https://www.w3.org/TR/sparql11-query/#bind: "The variable introduced by the BIND clause must not have been used in the group graph pattern up to the point of use in BIND."
  • https://www.w3.org/TR/sparql11-query/#selectExpressions: "The rules of assignment in SELECT expression are the same as for assignment in BIND. The expression combines variable bindings already in the query solution, or defined earlier in the SELECT clause. The variable may be used in an expression later in the same SELECT clause and may not be be assigned again in the same SELECT clause."
    • This says you can't "assign" the same var twice in SELECT, and that vars are brought forward from BIND, but it doesn't explicitly say that you can't "reassign" a BIND var in SELECT.

So I'd say it's 99% clear that this is forbidden.
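
To make the point concrete, here is a minimal sketch of the two forms, reduced so that it doesn't touch any Wikidata data. The variables ?x, ?n and ?n_ and the VALUES rows are made up for illustration and are not taken from the query above. Whether a given engine rejects the first form depends on how strictly it implements the scoping rule quoted above, so treat the comments as my reading of the spec rather than a guarantee. The block contains two separate queries, to be run one at a time.

```sparql
# Query 1: the form under discussion.
# ?n is bound by BIND inside the pattern and then used again as the
# alias of the aggregate, i.e. the same shape as (sum(?country) as ?country).
SELECT (SUM(?n) AS ?n) {
  VALUES ?x { 1 2 3 }            # three made-up input rows
  BIND(IF(?x > 1, 1, 0) AS ?n)   # 0 for x=1, 1 for x=2 and x=3
}

# Query 2: the renamed form.
# The inner variable gets a trailing underscore, so the alias ?n is fresh.
SELECT (SUM(?n_) AS ?n) {
  VALUES ?x { 1 2 3 }
  BIND(IF(?x > 1, 1, 0) AS ?n_)
}
```

The second query should return a single row with ?n = 2 (0 + 1 + 1). A reduced case like this may also be handy for checking whether the first form alone is enough to trigger the overflow, without the property path or the EXISTS filters.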

But in any case, Blazegraph shouldn't hit a stack overflow on this, should it?