
Wikidata query times out
Closed, Invalid · Public

Description

When I run this query, the Wikidata query service crashes, but after changing the country it works, like this. In my opinion it is caused by the huge number of results.

Event Timeline

Yamaha5 created this task. Mar 17 2018, 5:02 PM
Restricted Application added a subscriber: Aklapper. Mar 17 2018, 5:02 PM

It shows a timeout:

java.util.concurrent.TimeoutException
	at java.util.concurrent.FutureTask.get(FutureTask.java:205)
	at com.bigdata.rdf.sail.webapp.BigdataServlet.submitApiTask(BigdataServlet.java:293)
	at com.bigdata.rdf.sail.webapp.QueryServlet.doSparqlQuery(QueryServlet.java:654)
	at com.bigdata.rdf.sail.webapp.QueryServlet.doGet(QueryServlet.java:288)
	at com.bigdata.rdf.sail.webapp.RESTServlet.doGet(RESTServlet.java:240)
	at com.bigdata.rdf.sail.webapp.MultiTenancyServlet.doGet(MultiTenancyServlet.java:271)
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:687)
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
	at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:769)
	at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1667)
	at org.wikidata.query.rdf.blazegraph.throttling.ThrottlingFilter.doFilter(ThrottlingFilter.java:304)
	at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1650)
	at ch.qos.logback.classic.helpers.MDCInsertingServletFilter.doFilter(MDCInsertingServletFilter.java:49)
	at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1650)
	at org.wikidata.query.rdf.blazegraph.filters.ClientIPFilter.doFilter(ClientIPFilter.java:43)
	at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1650)
	at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:583)
	at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
	at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:577)
	at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:223)
	at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1125)
	at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515)
	at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
	at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1059)
	at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
	at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:215)
	at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:110)
	at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)
	at org.eclipse.jetty.server.Server.handle(Server.java:497)
	at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:311)
	at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:248)
	at org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:540)
	at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:610)
	at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:539)
	at java.lang.Thread.run(Thread.java:748)
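
A client can tell this kind of failure apart from other errors by looking for the exception name at the top of the trace. The following is a minimal sketch, assuming the response body contains the trace text as shown above; the function name `looks_like_query_timeout` and the constant `TIMEOUT_MARKER` are hypothetical and not part of any Wikidata tooling.

```python
# Exception name taken from the first line of the trace above.
TIMEOUT_MARKER = "java.util.concurrent.TimeoutException"


def looks_like_query_timeout(response_body: str) -> bool:
    """Return True if the response text carries the timeout exception name.

    Assumption: the service reports the failure by including the Java
    exception name in the body, as in the trace quoted in this task.
    """
    return TIMEOUT_MARKER in response_body


# Sample built from the first two lines of the trace above.
sample = (
    "java.util.concurrent.TimeoutException\n"
    "\tat java.util.concurrent.FutureTask.get(FutureTask.java:205)\n"
)
timed_out = looks_like_query_timeout(sample)
not_timed_out = looks_like_query_timeout('{"results": {"bindings": []}}')
print(timed_out, not_timed_out)
```
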
Yamaha5 renamed this task from query wikidata crash to query wikidata crash and shows time out. Mar 17 2018, 5:08 PM
Restricted Application added projects: Wikidata, Discovery. Mar 17 2018, 6:14 PM

The query returns too much data for the query service to handle within the time limit.
A small optimization for your query:

PREFIX schema: <http://schema.org/>
SELECT ?item ?enarticle ?linkcount ?itemLabel WHERE {
  {
    # Inner query: find the matching items and count their sitelinks first.
    SELECT ?item ?enarticle (COUNT(DISTINCT ?sitelink) AS ?linkcount) WHERE {
      ?item (wdt:P31/wdt:P279) wd:Q486972.
      ?sitelink schema:about ?item.
      ?item wdt:P17 wd:Q668.
      OPTIONAL {
        ?enarticle schema:about ?item.
        ?enarticle schema:isPartOf <https://en.wikipedia.org/>.
      }
      # Exclude items that already have a Persian Wikipedia article.
      MINUS {
        ?article schema:about ?item.
        ?article schema:inLanguage "fa".
        ?article schema:isPartOf <https://fa.wikipedia.org/>.
      }
    }
    GROUP BY ?item ?enarticle
    ORDER BY DESC(?linkcount)
  }
  # Labels are fetched only for the rows the inner query returns.
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en,fa". }
}

Moving the label service outside the main query makes the service match the main query first and fetch labels only for the results. Otherwise it fetches labels for all items before filtering them.
This isn't an error, just a limitation of the query service: the time limit prevents one query from running so long that other queries have to wait in the queue.
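
The same restructuring can be applied mechanically to other queries. The following is a minimal sketch of a helper that wraps an inner SELECT so the label service runs last; the function name `wrap_with_label_service` and its parameters are hypothetical, and the inner query is a shortened illustration rather than the query from this task.

```python
from typing import Sequence


def wrap_with_label_service(
    inner_select: str, outer_vars: Sequence[str], languages: str = "en"
) -> str:
    """Wrap an inner SELECT in an outer query that applies the label service last."""
    projection = " ".join(outer_vars)
    # Indent the inner query so the nesting is visible in the generated text.
    indented = "\n".join(
        "    " + line for line in inner_select.strip().splitlines()
    )
    return (
        f"SELECT {projection} WHERE {{\n"
        f"  {{\n{indented}\n  }}\n"
        f'  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "{languages}". }}\n'
        f"}}"
    )


# Shortened inner query, for illustration only.
inner = """
SELECT ?item (COUNT(DISTINCT ?sitelink) AS ?linkcount) WHERE {
  ?item wdt:P17 wd:Q668.
  ?sitelink schema:about ?item.
}
GROUP BY ?item
"""

query = wrap_with_label_service(
    inner, ["?item", "?itemLabel", "?linkcount"], "en,fa"
)
print(query)
```

The helper only rearranges text. It does not make the inner query cheaper, so a query whose inner part is already too large will still hit the time limit.
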

Ladsgroup closed this task as Invalid. Mar 17 2018, 11:56 PM
Ladsgroup added a subscriber: Ladsgroup.

I talked to the user. The set of cities in India is too big for such queries, so they end up timing out. This is more a symptom of bad indexing in Blazegraph than a problem on its own.

Smalyshev renamed this task from query wikidata crash and shows time out to Wikidata query times out. Mar 18 2018, 2:36 AM