
Setup sparqly service at https://sparqly.wmflabs.org/ (like Quarry but for SPARQL)
Open, LowPublic

Description

Quarry (https://quarry.wmflabs.org/) is a web service where people can run SQL queries and share both the queries and their results. It's a really nice service for getting to know SQL.

Since ~2015 we have the Wikidata Query Service (https://query.wikidata.org/), which uses SPARQL. Not a lot of people know SPARQL well, so having some sort of service like Quarry, but for SPARQL, would make it a lot easier to use.
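For anyone following along who hasn't used WDQS: it's a plain HTTP endpoint that takes a SPARQL query as a request parameter. A minimal sketch in Python (the endpoint and the classic "house cats" example query are the standard public ones; actually sending the request is left to any HTTP client):

```python
from urllib.parse import urlencode

# A classic introductory WDQS query: five items that are instances of
# "house cat" (Q146), with English labels.
sparql = """
SELECT ?item ?itemLabel WHERE {
  ?item wdt:P31 wd:Q146 .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
LIMIT 5
"""

endpoint = "https://query.wikidata.org/sparql"
url = endpoint + "?" + urlencode({"query": sparql, "format": "json"})

# Fetching this URL with any HTTP client returns the result bindings as JSON.
print(url)
```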

Proposed name is sparqly, but we can always bikeshed over a better name.

Event Timeline

Multichill raised the priority of this task from to Needs Triage.
Multichill updated the task description.
Multichill added subscribers: Multichill, Nikki.
Restricted Application added a subscriber: Aklapper.

I think it'll be simpler to just have Quarry handle SPARQL as well.

Fine with me too :-)

A stupid question by a person with bad SQL (progressed from almost zero mostly thanks to Quarry) and almost no SPARQL knowledge: would it be possible to have part of a request made in SQL and part in SPARQL, and have it all output as one table? (Like select a list of articles on some wiki by some Wikidata-based criteria, then fetch their sizes, creators and stuff like this via SQL, and just have it all in one table.)

Quite off topic for this bug; another forum, like the Wikidata mailing list, is probably more suitable. Have a look at https://petscan.wmflabs.org/. With that tool you can combine queries from different sources.

More simply I guess that means the answer is no. Thanks, that's what I wanted to hear :)

With the current SPARQL setup it's easy to share queries either by full url or by short url. I think we can close this one.

Do I get it right that a query currently cannot be longer than the URL length limit? What exactly is that limit? I wonder if there have been cases of people needing to run longer queries. Is this investigable somehow?
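For a rough sense of the numbers: there isn't a single URL length limit. Roughly 2,000 characters is a commonly cited safe ceiling for older browsers and roughly 8 KB for many servers (both approximate figures), and WDQS, like any SPARQL 1.1 Protocol endpoint, also accepts POST requests, which sidesteps the URL entirely. A small sketch for measuring how long a query becomes once percent-encoded into a share URL (the limits in the comments are the approximations above, not hard WDQS numbers):

```python
from urllib.parse import quote

def encoded_query_length(sparql: str) -> int:
    """Length of the full share URL once the query is percent-encoded."""
    return len("https://query.wikidata.org/#" + quote(sparql))

short = "SELECT * WHERE { ?s ?p ?o } LIMIT 1"
print(encoded_query_length(short))  # comfortably under any common limit

# Repeating even a tiny query quickly blows past the ~2,000-character
# ceiling some older browsers enforce (servers often allow ~8 KB).
long_query = short * 100
print(encoded_query_length(long_query) > 2000)
```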

@Base, your questions are very interesting and you seem to have really nice suggestions, but I would suggest a mailing list or a wiki talk page (or, if it is a bug/feature request, a separate ticket) as the preferred way to communicate.

This ticket is probably going to be closed soon, and when that happens your questions will go unanswered and have little visibility here.

> With the current SPARQL setup it's easy to share queries either by full url or by short url. I think we can close this one.

I disagree: one important part of this task, saving results, isn’t served at all by this. We want to be able to save query results and share them, and unlike on Quarry, it shouldn’t be possible to change those results later, even for the query author (who, on Quarry, can re-run the query, changing the results without assigning a new ID). Other than when privacy or legal concerns require the results to be deleted, the pages should be immutable.

This should be an optional component, not the main interface for querying (as Quarry is for the SQL databases) – WDQS sees millions of queries every day (the exact number varies with each new Wikidata presentation), we can’t afford to save all those results.
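One way to get the immutability described here is content addressing: derive the saved page's ID from a hash of the query plus its results, so a re-run that produces different results necessarily gets a new ID and the old page stays untouched. A hypothetical sketch (none of these names exist in any current tool):

```python
import hashlib
import json

def result_id(query: str, bindings: list) -> str:
    """Derive an immutable ID from a query plus its results.

    Re-running the query with different results produces a different ID,
    so a saved result page can never be changed in place.
    """
    payload = json.dumps({"query": query, "bindings": bindings},
                         sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:16]

a = result_id("SELECT ...", [{"item": "Q42"}])
b = result_id("SELECT ...", [{"item": "Q42"}, {"item": "Q5"}])
print(a != b)  # changed results -> new ID, old page untouched
```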

> We want to be able to save query results and share them

Wouldn't tabular data on Commons be a good place for it? Do we need yet another place/way to store tabular data?

That seems like a good option indeed. In that case, we'd need a way to pull the data back into the WDQS for visualization.

IIRC https://www.mediawiki.org/wiki/Extension:Graph can work with tabular data. WDQS GUI can't export into graphs though, except for Graph Builder, so there's some improvement possible there.

I don’t think that’s a good fit. Query results aren’t necessarily notable for Commons, nor are they necessarily pure data (e. g. labels and descriptions, image links, or constructed result columns – the most extreme example would be the “cocktail recipes” query). Commons’ Tabular Data also imposes some restrictions which not all query results fulfill (e. g. strings cannot be longer than 400 characters), and unless we store tiny JSON blobs like {"type": "literal", "value": "foo"} inside the string values in the tabular data (storing objects directly is not allowed), we also lose some information about the data (the distinction between literals and IRIs).
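The JSON-blob workaround mentioned above would look roughly like this; a sketch assuming the standard SPARQL 1.1 JSON results binding format and the 400-character string limit cited for Commons tabular data (the helper names are invented for illustration):

```python
import json

MAX_CELL = 400  # Commons tabular data caps string cells at 400 characters

def binding_to_cell(binding: dict) -> str:
    """Serialize one SPARQL result binding (e.g. {"type": "literal",
    "value": "foo"}) into a string cell, preserving the literal-vs-IRI
    distinction that a bare string value would lose."""
    cell = json.dumps(binding, sort_keys=True)
    if len(cell) > MAX_CELL:
        raise ValueError("binding does not fit in a tabular-data cell")
    return cell

def cell_to_binding(cell: str) -> dict:
    return json.loads(cell)

b = {"type": "uri", "value": "http://www.wikidata.org/entity/Q42"}
# Round-trips without losing the type information.
assert cell_to_binding(binding_to_cell(b)) == b
```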

Not exactly Quarry, but see https://commons.wikimedia.org/wiki/User:TabulistBot - this should be similar to Listeria and generate persistent reusable tabular data.

@Lucas_Werkmeister_WMDE I agree there are some downsides to this model, but I think it's the easiest and most natural option for now despite the limitations, so I'd like to see if we can make it work.

>> With the current SPARQL setup it's easy to share queries either by full url or by short url. I think we can close this one.
>
> I disagree: one important part of this task, saving results, isn’t served at all by this. We want to be able to save query results and share them, and unlike on Quarry, it shouldn’t be possible to change those results later, even for the query author (who, on Quarry, can re-run the query, changing the results without assigning a new ID). Other than when privacy or legal concerns require the results to be deleted, the pages should be immutable.

+1

valerio.bozzolan renamed this task from Setup sparqly service at https://sparqly.wmflabs.org/ (like Quarry) to Setup sparqly service at https://sparqly.wmflabs.org/ (like Quarry but for SPARQL). Aug 30 2022, 8:02 AM
valerio.bozzolan updated the task description.

The killer feature of this tool would be: fewer timeouts.

Lots of users have very interesting queries that sometimes just cannot be optimized any further and simply require more resources. Having said that, I understand we cannot just increase resources for every anonymous execution in the world (lots of people could abuse it), but I think this task could target a fix for that issue, since this tool would have a queue and no parallel execution.
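The queue idea above can be sketched as a single worker draining a FIFO queue: queries never run in parallel, so each one could be granted the full resources and a longer timeout than the anonymous web endpoint allows. A hypothetical sketch (all names invented for illustration; `run_query` stands in for actually calling the SPARQL endpoint):

```python
import queue
import threading

# Submitted queries wait in a FIFO queue; a single worker runs them
# one at a time, so there is no parallel execution.
jobs: "queue.Queue" = queue.Queue()
results = {}

def run_query(sparql: str) -> str:
    # Stand-in for sending the query to the SPARQL endpoint with a
    # generous timeout.
    return f"results of: {sparql}"

def worker():
    while True:
        sparql = jobs.get()
        if sparql is None:          # sentinel: shut down the worker
            break
        results[sparql] = run_query(sparql)
        jobs.task_done()

t = threading.Thread(target=worker)
t.start()
for q in ["query A", "query B"]:
    jobs.put(q)
jobs.put(None)                      # no more work
t.join()
print(results)
```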