
WDQS "IndexOutOfBoundsException" in path query with MINUS
Closed, Resolved · Public


I have just seen this error for the first time. Have we recently changed the version of Blazegraph that we're running?

The following query fails, apparently instantly, throwing a "java.lang.IndexOutOfBoundsException: Index: 2, Size: 2":

PREFIX wikibase: <>
PREFIX wd: <> 
PREFIX wdt: <>
PREFIX rdfs: <>
PREFIX schema: <>

    SELECT (COUNT(DISTINCT(?item)) AS ?count) WHERE {
       ?item wdt:P31/wdt:P279* wd:Q56061.
       MINUS {?item wdt:P373 ?commonscat} .
    }


The issue appears to be specifically triggered by the path search statement.

If ?item wdt:P31/wdt:P279* wd:Q56061 is replaced with ?item wdt:P31 wd:Q5, the query runs fine ... or at least it does until it times out.

So the problem appears to be an interaction between MINUS and the path search.

Other variants of negation, e.g. FILTER NOT EXISTS { ... } or OPTIONAL { ... } FILTER (!bound(...)), don't seem to have the problem, so it is probably a bug in the implementation of MINUS.
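
For reference, the two alternative negation forms mentioned above could be written roughly as follows. This is only a sketch: the prefix IRIs are the standard Wikidata ones and are an assumption here, since the declarations in the report lost their IRIs, and the variable names are simply carried over from the failing query.

```sparql
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>

# Variant 1: FILTER NOT EXISTS instead of MINUS
SELECT (COUNT(DISTINCT ?item) AS ?count) WHERE {
  ?item wdt:P31/wdt:P279* wd:Q56061 .
  FILTER NOT EXISTS { ?item wdt:P373 ?commonscat }
}
```

```sparql
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>

# Variant 2: OPTIONAL plus a !bound() filter instead of MINUS
SELECT (COUNT(DISTINCT ?item) AS ?count) WHERE {
  ?item wdt:P31/wdt:P279* wd:Q56061 .
  OPTIONAL { ?item wdt:P373 ?commonscat }
  FILTER (!bound(?commonscat))
}
```

Both forms keep only items that have no P373 value. Because ?item is shared between the main pattern and the negated pattern, they should select the same items as the MINUS version is meant to.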

Event Timeline

Jheald added a project: Wikidata-Query-Service.
Jheald renamed this task from WDQS "IndexOutOfBoundsException" to WDQS "IndexOutOfBoundsException" in path query with MINUS. Dec 7 2015, 10:45 AM

Looks like a bug, though in this case I'm not sure MINUS is a good fit: MINUS probably builds two solution sets to calculate the difference, and the second one potentially has 1346580 items in it. I'm not sure whether the optimizer takes care of that.

Also, the backtrace looks a bit suspicious:

Caused by: java.lang.IndexOutOfBoundsException: Index: 2, Size: 2
	at java.util.ArrayList.rangeCheck(
	at java.util.ArrayList.get(
	at com.bigdata.bop.ModifiableBOpBase.get(
	at com.bigdata.rdf.sparql.ast.optimizers.ASTBottomUpOptimizer.handleMinusWithoutSharedVariables(

Why is it "without shared variables"? I'll look into it more.
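
As background on why that method name looks out of place: in SPARQL 1.1, a MINUS whose pattern shares no variables with the left-hand side removes no solutions. A minimal sketch of such a query follows. It is not taken from the report; the variable ?other is hypothetical, and the prefix IRIs are the standard Wikidata ones, assumed here.

```sparql
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>

# The outer pattern binds ?item; the MINUS pattern binds ?other and ?commonscat.
# No variable is shared, so per SPARQL 1.1 this MINUS eliminates nothing.
SELECT ?item WHERE {
  ?item wdt:P31 wd:Q56061 .
  MINUS { ?other wdt:P373 ?commonscat }
}
```

The failing query, by contrast, uses ?item on both sides of the MINUS, so one would not expect it to be handled as a "no shared variables" case.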

Smalyshev triaged this task as Medium priority. Dec 7 2015, 9:27 PM

It also fails with wdt:P31/wdt:P279 (without the asterisk), so the problem shows up even with simple paths.

Looks like it is fixed in 2.0; I cannot reproduce it anymore.