
Normalize SPARQL queries
Closed, Resolved · Public

Description

As a user, I want an option in the Wikidata Query UI that automatically transforms my SPARQL queries into a standard, consistent format (upper/lowercase, indentation, cosmetic spaces, line breaks, etc.) so that I can share them. As a Wikidata contributor, I also want that option so that I can apply the same format to all the queries on https://www.wikidata.org/wiki/Wikidata:SPARQL_query_service/queries/examples.

Problem: There are many ways of writing the same SPARQL query, so users end up following different conventions and pasting in fragments copied from several sources and examples. This can leave a single query internally inconsistent. In addition, the Wikidata Query UI offers no way to reformat SPARQL queries automatically, and doing it manually is tedious.

Example:

Screenshots/mockups:

BDD
GIVEN a SPARQL query
AND an option to reformat queries from the Wikidata Query UI
WHEN I click that option
THEN the SPARQL query is reformatted consistently

Suggestions:

  • All parentheses, ( and ), when they aren't part of a literal or a comment, are preceded/followed by the same combination of spaces/tabs/line feeds/carriage returns. Opening and closing parentheses can follow different rules.
  • All brackets, [ and ], when they aren't part of a literal or a comment, are preceded/followed by the same combination of spaces/tabs/line feeds/carriage returns. Opening and closing brackets can follow different rules.
  • All braces, { and }, when they aren't part of a literal or a comment, are preceded/followed by the same combination of spaces/tabs/line feeds/carriage returns. Opening and closing braces can follow different rules.
  • All semicolons (;), when they link triples with a common subject, are preceded/followed by the same combination of spaces/tabs/line feeds/carriage returns.
  • All SPARQL clauses, when used as such, are uppercase.
  • There is no more than one space between SPARQL clauses.
  • There are no blank lines (except, perhaps, the last one).
  • Each level of nesting adds an indentation of the same number of spaces.
  • The dot symbol (.), when it terminates a triple, is always followed by a line feed/carriage return.
  • Other possible rules to get a standard, consistent format without modifying the semantics.
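
To make the "when they aren't part of a literal or a comment" condition concrete, here is a minimal Python sketch of two of the rules above (uppercase clauses, at most one space between them). It is only an illustration, not a proposed implementation: the KEYWORDS tuple is a deliberately incomplete assumption, the name normalize is hypothetical, and a real formatter would work from the full SPARQL grammar rather than from regular expressions.

```python
import re

# Illustrative subset of SPARQL keywords (assumption); a real tool needs the full list.
KEYWORDS = ("select", "distinct", "where", "optional", "filter",
            "union", "order", "by", "limit", "prefix")

TOKEN = re.compile(
    r'''
      (?P<protected>
          """[\s\S]*?""" | \'\'\'[\s\S]*?\'\'\'      # long string literals
        | "(?:[^"\\\n]|\\.)*" | '(?:[^'\\\n]|\\.)*'  # short string literals
        | <[^<>\s]*>                                 # IRIs such as <http://example.org/#x>
        | \#[^\n]*                                   # comments up to the end of the line
      )
    | (?P<keyword>(?<![\w:?$@.-])(?:%s)(?![\w:-]))   # bare keywords, not names or variables
    | (?P<spaces>(?<=\S)[ \t]{2,})                   # runs of spaces/tabs inside a line
    ''' % "|".join(KEYWORDS),
    re.VERBOSE | re.IGNORECASE,
)


def normalize(query: str) -> str:
    """Uppercase keywords and collapse inner spacing, leaving literals,
    IRIs and comments untouched. Indentation at line start is preserved."""
    def fix(match: re.Match) -> str:
        if match.lastgroup == "keyword":
            return match.group().upper()
        if match.lastgroup == "spaces":
            return " "
        return match.group()  # protected text is copied unchanged

    return TOKEN.sub(fix, query)
```

Because literals, IRIs and comments are matched first and copied through unchanged, a string such as "select  where"@en or a variable such as ?select is left alone, while a bare select or limit is uppercased. The sketch does not touch indentation, line breaks, brackets or semicolons; those rules need knowledge of nesting that a flat pattern like this one does not have.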

Open questions:

Event Timeline

Restricted Application added a subscriber: Aklapper.

If you relax the acceptance criteria (I haven’t checked your list in detail), there is already a way to automatically reformat SPARQL queries: edit them using the query helper. Anything you do in the query helper (e.g. adding a limit and then removing it again) parses and then stringifies the query, effectively normalizing it. There have also been recent efforts to improve SPARQL.js’s stringification (see #58), so we should probably update our version.

Great! Then we can simply add a new button to the left toolbar that parses and stringifies the query with no semantic changes. As far as I can tell, the helper's output satisfies everything on the list except the point "There are no blank lines (except, perhaps, the last one)", because it adds a blank line between the prefixes and the rest of the query; that is still completely consistent and reasonable. It also replaces the ; shorthand notation, which is fantastic for getting a standard query.
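
To illustrate why a parse-then-stringify round trip normalizes formatting and removes the ; shorthand, here is a toy Python sketch. It is not SPARQL.js and does not reproduce its output style: parse_triples and stringify are hypothetical names, and the grammar is an assumption that only covers plain triples whose terms contain no whitespace (so literals with spaces, nested groups and property paths are out of scope).

```python
import re

# A token is either a ';' or '.' terminator followed by whitespace/end of input,
# or the shortest run of non-space characters that ends just before one.
TOKEN = re.compile(r"[;.](?=\s|$)|\S+?(?=[;.]?(?:\s|$))")


def parse_triples(block: str) -> list[tuple[str, str, str]]:
    """Parse a toy triples block into (subject, predicate, object) tuples."""
    triples, subject, terms = [], None, []
    # The extra "." is a sentinel that flushes a final triple with no terminator.
    for token in TOKEN.findall(block) + ["."]:
        if token not in (";", "."):
            terms.append(token)
            continue
        if terms:
            if subject is None:
                subject, *terms = terms  # first term of a new statement
            predicate, obj = terms       # ValueError outside the toy grammar
            triples.append((subject, predicate, obj))
            terms = []
        if token == ".":
            subject = None               # ';' keeps the subject, '.' resets it
    return triples


def stringify(triples: list[tuple[str, str, str]], indent: str = "  ") -> str:
    """Write one fully spelled-out triple per line, in a single fixed layout."""
    return "\n".join(f"{indent}{s} {p} {o}." for s, p, o in triples)
```

Once the text has been parsed, the original spacing, tabs and ; shorthand are gone from the data structure, so stringify(parse_triples(text)) can only ever produce one layout. That is the same reason any edit made through the query helper yields a consistently formatted query.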

Change 462158 had a related patch set uploaded (by Abián; owner: Abián):
[wikidata/query/gui@master] Option to standardize format of SPARQL query

https://gerrit.wikimedia.org/r/462158

Change 462158 merged by jenkins-bot:
[wikidata/query/gui@master] Option to standardize format of SPARQL query

https://gerrit.wikimedia.org/r/462158

Change 462930 had a related patch set uploaded (by WDQSGuiBuilder; owner: WDQSGuiBuilder):
[wikidata/query/gui-deploy@production] Merging from 31f3f5f3478526b1eea0ae3a4e3cd47506302757:

https://gerrit.wikimedia.org/r/462930

Change 462930 merged by Smalyshev:
[wikidata/query/gui-deploy@production] Merging from 31f3f5f3478526b1eea0ae3a4e3cd47506302757:

https://gerrit.wikimedia.org/r/462930