Odd behavior when querying Wikidata SPARQL endpoint using python requests
Closed, Invalid · Public

Description

When I make a GET request to the Wikidata SPARQL endpoint using Python requests, the response takes far longer to complete than the same query run through the browser. I traced this to the requests library being unable to determine the encoding of the response, so it tries to "guess" it.
Please see my example below (example 2):
https://gist.github.com/stuppie/e523bf617416e1490c25464d5a485396

If I explicitly tell requests that the encoding is UTF-8, the response gets parsed 100x faster and uses much less RAM (example 1). I'm not sure what exactly requests looks for in the headers, or how it should be formatted, but I wanted to point this out because there may be a simple fix, such as adding something to the response headers (or maybe I should be specifying something different in "Accept"?).
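For reference, here is a minimal sketch of the workaround, not the exact code from the gist. As far as I understand, requests falls back to charset detection when `.text` is accessed and `response.encoding` is unset, so setting it first skips the guessing. The helper names (`parse_sparql_json`, `run_query`), the `format=json` parameter, the `Accept` value, and the sample query are my own assumptions for illustration.

```python
import json

# Public Wikidata Query Service endpoint.
WDQS_URL = "https://query.wikidata.org/sparql"


def parse_sparql_json(response):
    """Parse a SPARQL JSON response, treating the body as UTF-8.

    Hypothetical helper: `response` is expected to behave like a
    requests.Response (settable `encoding`, readable `text`).
    """
    # Set the encoding before touching .text so that requests does not
    # have to guess the charset from the body.
    response.encoding = "utf-8"
    return json.loads(response.text)


def run_query(query):
    """Send a SPARQL query to WDQS and return the decoded JSON result."""
    # Imported here so parse_sparql_json can be used without requests installed.
    import requests

    response = requests.get(
        WDQS_URL,
        params={"query": query, "format": "json"},  # assumed parameters
        headers={"Accept": "application/sparql-results+json"},
        timeout=60,
    )
    response.raise_for_status()
    return parse_sparql_json(response)


if __name__ == "__main__":
    # Illustrative query only; any SELECT query would do.
    result = run_query("SELECT ?item WHERE { ?item wdt:P31 wd:Q146 } LIMIT 5")
    for row in result["results"]["bindings"]:
        print(row["item"]["value"])
```

Checking `response.encoding` and `response.headers["Content-Type"]` right after the request should show whether a charset was declared at all.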

Event Timeline

Gstupp created this task. · May 4 2017, 10:28 PM
Restricted Application added a subscriber: Aklapper. · May 4 2017, 10:28 PM
Restricted Application added a project: Discovery. · May 4 2017, 10:29 PM

Accept looks fine, but I'm not sure what requests has trouble with... Using utf-8 with the response sounds right.

Lydia_Pintscher moved this task from incoming to monitoring on the Wikidata board. · May 5 2017, 11:58 AM
Smalyshev closed this task as Invalid. · Jul 21 2017, 11:59 PM

I don't see anything that needs to be done by Query Service; setting the correct encoding seems to fix it. Please reopen if there's anything actionable.

Restricted Application added a subscriber: PokestarFan. · Jul 21 2017, 11:59 PM