Query optimizer pessimises queries
Closed, Resolved, Public

Description

Blazegraph's query optimizer orders operations in a way that leads to a timeout in some cases where disabling the optimizer returns a result within seconds.

The query below currently works, though it previously timed out:

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX wikibase: <http://wikiba.se/ontology#>

SELECT DISTINCT ?result WHERE {
	{ 
		{ ?subject0 rdfs:label "United States"@en . } UNION { ?subject0 skos:altLabel "United States"@en . }
	}
	{
		{ ?predicate1 rdfs:label "president"@en . } UNION { ?predicate1 skos:altLabel "president"@en . }
	}
	?predicate1 a wikibase:Property .
	?predicate1 wikibase:directClaim ?directPredicate2 .
	?subject0 ?directPredicate2 ?result .
}

The Blazegraph explain feature helped us identify what caused the timeout.

The query works with the optimizer disabled:

PREFIX hint: <http://www.bigdata.com/queryHints#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX wikibase: <http://wikiba.se/ontology#>

SELECT DISTINCT ?result WHERE {
	hint:Query hint:optimizer "None" .

	{ 
		{ ?subject0 rdfs:label "United States"@en . } UNION { ?subject0 skos:altLabel "United States"@en . }
	}
	{
		{ ?predicate1 rdfs:label "president"@en . } UNION { ?predicate1 skos:altLabel "president"@en . }
	}
	?predicate1 a wikibase:Property .
	?predicate1 wikibase:directClaim ?directPredicate2 .
	?subject0 ?directPredicate2 ?result .
}
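
A narrower variant may be worth trying when disabling the optimizer for the whole query is too blunt. The sketch below assumes that the Blazegraph deployment accepts the `hint:Group` scope for `hint:optimizer`, so that only the enclosing block keeps its written join order. The split into a hinted block and an unhinted final join is an illustrative choice, not something verified against this task's timeout.

```sparql
PREFIX hint: <http://www.bigdata.com/queryHints#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX wikibase: <http://wikiba.se/ontology#>

SELECT DISTINCT ?result WHERE {
	{
		# Assumption: Group scope limits the hint to this block,
		# so the joins inside it run in the order written.
		hint:Group hint:optimizer "None" .

		# Resolve the two labels first; both are selective lookups.
		{ ?subject0 rdfs:label "United States"@en . } UNION { ?subject0 skos:altLabel "United States"@en . }
		{ ?predicate1 rdfs:label "president"@en . } UNION { ?predicate1 skos:altLabel "president"@en . }
		?predicate1 a wikibase:Property .
		?predicate1 wikibase:directClaim ?directPredicate2 .
	}
	# Outside the hinted block: the optimizer remains free to place this join.
	?subject0 ?directPredicate2 ?result .
}
```

If the scoped hint has no effect on a given installation, the query-wide `hint:Query hint:optimizer "None"` form shown above remains the fallback.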

Event Timeline

Bene raised the priority of this task to Needs Triage.
Bene updated the task description.
Bene added subscribers: Bene, JanZerebecki, Lucie and 3 others.
Bene triaged this task as Medium priority. Jul 29 2015, 1:15 PM

Some more Blazegraph query optimizer issues are presented at

https://www.wikidata.org/wiki/Wikidata:SPARQL_query_service/query_optimization and

https://www.wikidata.org/wiki/Wikidata:SPARQL_query_service/suggestions#Query_optimizer_issues

Interestingly, the original case with the two UNION blocks now appears to work. However, there are other conditions under which the optimizer may not succeed, including a variant of the above posted by Smalyshev, and UNION blocks that contain BIND statements.

Smalyshev claimed this task.

Seems to be working fine now. If there are other problematic queries, please submit a new report.