
SPARQL query causes StackOverflowError and fails to execute
Open, Needs Triage, Public

Description

I have a large-ish query that I've been regularly using to find information about a person based on their Wikipedia URL:

SELECT DISTINCT
  ?item
  ?personLabel
  ?personDescription
  ?image
  ?birthLabel
  ?deathLabel
  ?website
  (GROUP_CONCAT(DISTINCT ?alias; SEPARATOR = ",") AS ?aliases)
  (GROUP_CONCAT(DISTINCT ?countryCode; SEPARATOR = ",") AS ?citizenships)
  (GROUP_CONCAT(DISTINCT ?twitterId; SEPARATOR = ",") AS ?twitterId)
WHERE {
  <https://en.wikipedia.org/wiki/Hans_Zimmer> schema:about ?person.
  OPTIONAL { ?person wdt:P18 ?image. }
  OPTIONAL { ?person wdt:P569 ?birth. }
  OPTIONAL { ?person wdt:P570 ?death. }
  OPTIONAL { ?person wdt:P856 ?website. }
  ?person wdt:P27 ?citizenship.
  ?citizenship wdt:P17 ?country.
  ?country wdt:P297 ?countryCode.
  OPTIONAL { ?person skos:altLabel ?alias. FILTER (lang(?alias) = "en") }
  OPTIONAL { ?person wdt:P2002 ?twitterId. }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
} GROUP BY
  ?item
  ?image
  ?birthLabel
  ?personLabel
  ?personDescription
  ?deathLabel
  ?website


This query was working fine until sometime yesterday, Monday 14th October.

The query now throws an exception, visible in the query service's web frontend, whose root cause is a StackOverflowError:

java.util.concurrent.ExecutionException: java.util.concurrent.ExecutionException: java.lang.StackOverflowError
	at java.util.concurrent.FutureTask.report(FutureTask.java:122)
	at java.util.concurrent.FutureTask.get(FutureTask.java:206)
	at com.bigdata.rdf.sail.webapp.BigdataServlet.submitApiTask(BigdataServlet.java:292)
	at com.bigdata.rdf.sail.webapp.QueryServlet.doSparqlQuery(QueryServlet.java:678)
	at com.bigdata.rdf.sail.webapp.QueryServlet.doGet(QueryServlet.java:290)
	at com.bigdata.rdf.sail.webapp.RESTServlet.doGet(RESTServlet.java:240)
	at com.bigdata.rdf.sail.webapp.MultiTenancyServlet.doGet(MultiTenancyServlet.java:273)
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:687)
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
	at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:865)
	at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1655)
	at org.wikidata.query.rdf.blazegraph.throttling.ThrottlingFilter.doFilter(ThrottlingFilter.java:354)
	at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1642)
	at ch.qos.logback.classic.helpers.MDCInsertingServletFilter.doFilter(MDCInsertingServletFilter.java:49)
	at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1642)
	at org.wikidata.query.rdf.blazegraph.filters.ClientIPFilter.doFilter(ClientIPFilter.java:43)
	at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1642)
	at org.wikidata.query.rdf.blazegraph.filters.RealAgentFilter.doFilter(RealAgentFilter.java:33)
	at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1634)
	at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:533)
	at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:146)
	at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
	at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
	at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:257)
	at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1595)
	at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:255)
	at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1340)
	at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:203)
	at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:473)
	at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1564)
	at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:201)
	at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1242)
	at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:144)
	at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:220)
	at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:126)
	at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
	at org.eclipse.jetty.server.Server.handle(Server.java:503)
	at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:364)
	at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:260)
	at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:305)
	at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:103)
	at org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:118)
	at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:333)
	at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:310)
	at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:168)
	at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:126)
	at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:366)
	at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:765)
	at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:683)
	at java.lang.Thread.run(Thread.java:748)
Caused by: java.util.concurrent.ExecutionException: java.lang.StackOverflowError
	at java.util.concurrent.FutureTask.report(FutureTask.java:122)
	at java.util.concurrent.FutureTask.get(FutureTask.java:192)
	at com.bigdata.rdf.sail.webapp.QueryServlet$SparqlQueryTask.call(QueryServlet.java:889)
	at com.bigdata.rdf.sail.webapp.QueryServlet$SparqlQueryTask.call(QueryServlet.java:695)
	at com.bigdata.rdf.task.ApiTaskForIndexManager.call(ApiTaskForIndexManager.java:68)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	... 1 more
Caused by: java.lang.StackOverflowError
	at com.bigdata.rdf.sparql.ast.StaticAnalysis.getVarsFromArguments(StaticAnalysis.java:2113)
	at com.bigdata.rdf.sparql.ast.StaticAnalysis.collectVarsFromExpressions(StaticAnalysis.java:2099)
	at com.bigdata.rdf.sparql.ast.StaticAnalysis.collectVarsFromExpressions(StaticAnalysis.java:2101)
	at com.bigdata.rdf.sparql.ast.StaticAnalysis.collectVarsFromExpressions(StaticAnalysis.java:2101)
	at com.bigdata.rdf.sparql.ast.StaticAnalysis.collectVarsFromExpressions(StaticAnalysis.java:2101)
        [...the above line repeated another 1019 times...]

(I've shortened the end of the stacktrace to save effort when scrolling)

It looks like a recent change to the SPARQL parser may have introduced a bug?

Interestingly, deleting the line (GROUP_CONCAT(DISTINCT ?twitterId; SEPARATOR = ",") AS ?twitterId) stops the exception from occurring and the query runs without issue, but I'm not sure what is different about that line compared to the two GROUP_CONCATs above it that would cause this...

Edit: As @Lucas_Werkmeister_WMDE pointed out in a comment below, this is caused by the line (GROUP_CONCAT(DISTINCT ?twitterId; SEPARATOR = ",") AS ?twitterId) reusing the variable ?twitterId (for both the input and output of the GROUP_CONCAT, which the other two GROUP_CONCATs don't do). A workaround for now is to rename the output variable to something like ?twitterId_, which avoids the problem.
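For illustration, here is a minimal sketch of the workaround (a reduced query, not the full one from the description): the projected variable is renamed so it no longer shadows the ?twitterId bound in the WHERE clause.

```sparql
# Workaround sketch: the aggregate output is ?twitterId_, distinct from
# the input variable ?twitterId bound by the triple pattern below.
SELECT ?person (GROUP_CONCAT(DISTINCT ?twitterId; SEPARATOR = ",") AS ?twitterId_)
WHERE {
  ?person wdt:P2002 ?twitterId.
}
GROUP BY ?person
```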

Let me know if there's any further information I can provide - I'd like to help get this fixed as soon as possible :)

Event Timeline

Restricted Application added a project: Wikidata. Oct 15 2019, 5:28 PM
Restricted Application added a subscriber: Aklapper.

Possibly related to @Igorkim78’s work in T168876: MWAPI service throws “could not find binding for parameter” if optimizer is not disabled? I306907e175 mentions that class, at least. (The upstream version of StaticAnalysis doesn’t have a collectVarsFromExpressions method, by the way, so I’m not sure what version of the class we’re using and what repository it’s in.)

@Evilricepuddin workaround: in

(GROUP_CONCAT(DISTINCT ?twitterId; SEPARATOR = ",") AS ?twitterId)

change the final ?twitterId to ?twitterId_.

But reusing the same variable name after aggregation is something I’ve also often done in the past, so ideally we should get this fixed.

Thanks for the workaround, can confirm that the query is working again for me now :)

Evilricepuddin updated the task description.

Mentioned in SAL (#wikimedia-operations) [2019-10-16T11:56:26Z] <onimisionipe@deploy1001> Started deploy [wdqs/wdqs@c90503b]: Revert to fix T235540

This comment was removed by Mathew.onipe.

Mentioned in SAL (#wikimedia-operations) [2019-10-16T12:15:36Z] <onimisionipe@deploy1001> Finished deploy [wdqs/wdqs@c90503b]: Revert to fix T235540 (duration: 19m 09s)

Mentioned in SAL (#wikimedia-operations) [2019-10-16T12:26:56Z] <onimisionipe@deploy1001> Started deploy [wdqs/wdqs@217cac5]: redeploy 0.3.4-SNAPSHOT - T235540

Mentioned in SAL (#wikimedia-operations) [2019-10-16T12:30:28Z] <onimisionipe@deploy1001> Finished deploy [wdqs/wdqs@217cac5]: redeploy 0.3.4-SNAPSHOT - T235540 (duration: 03m 40s)

The LabelService optimizer was fixed this August (so it no longer throws NPEs) by reusing the Blazegraph core utility com.bigdata.rdf.sparql.ast.StaticAnalysis.getVarsFromArguments(BOp) to introspect the variables used in filters and other clauses, so that LabelService call placement could be adjusted properly. That introspection now seems to enter an infinite loop over the AST tree. Reusing a variable name for the aggregation of the original variable is common practice, so yes, this should be fixed. I'm looking for a way to extract the referenced variables without falling into the infinite loop.

Envlh added a subscriber: Envlh. Sat, Oct 19, 9:22 AM

Fix by all means, but I note "It is an error for aggregates to project variables with a name already used in other aggregate projections, or in the WHERE clause." https://www.w3.org/TR/sparql11-query/#aggregateExample

That’s true, but Virtuoso also allows this, at least in the DBpedia endpoint.

As a user, I'd be happy with the fix being that the variable re-use is detected and I'm given an appropriate error message (this was, after all, the fix that I deployed to work around the problem). I imagine that the StackOverflowError is not the intended "you've sent an invalid query" error. ;)

TomT0m added a subscriber: TomT0m. Wed, Oct 23, 9:22 AM

I found this bug only after reducing my broken query until only this was left:

SELECT (MIN(?name) AS ?name) WHERE {
}

Also, I noticed that COUNT and SAMPLE work fine, while the other aggregate functions do not support this aliasing. If it is an error, a better error message would definitely come in handy.
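Based on that observation, the reported difference can be shown with a reduced query in the same minimal form as the MIN example above (a sketch of the reported behavior, not independently verified here):

```sparql
# Reportedly runs fine: COUNT (and SAMPLE) tolerate projecting the
# aggregate under the same name as its input variable.
SELECT (COUNT(?name) AS ?name) WHERE {
}
```

whereas the MIN variant quoted above triggers the StackOverflowError.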

Ainali added a subscriber: Ainali. Mon, Nov 4, 9:28 PM
TuukkaH added a subscriber: TuukkaH. Fri, Nov 8, 7:05 AM