[QueryService]Format query clears out Prefixes necessary for wikibase.cloud
Closed, Resolved | Public | BUG REPORT

Description

Steps to replicate the issue (include links if applicable):

  • Enter a query with prefixes (such as this one)
  • Click the Format query button

What happens?:

  • The necessary two PREFIX lines are deleted

What should have happened instead?:
PREFIX lines should have been kept unless matching pre-defined prefixes (i.e. ones already assumed by the query service)

Software version (skip for WMF-hosted wikis like Wikipedia):
Wikibase.cloud

Other information (browser name/version, screenshots, etc.):
Note that PREFIX declarations are necessary on wikibase.cloud instances, unlike on the Wikidata Query Service.

Event Timeline

Fring removed Fring as the assignee of this task. Jan 3 2024, 3:37 PM
Fring moved this task from Doing to In Review on the Wikibase Cloud (Kanban board Q4 2023) board.
Fring subscribed.

I think this is heading towards the essence of the desired behaviour but is still a little distance from it. I believe the desired behaviour is as follows, based partly on what I subjectively think a clean query should look like and partly on reading the non-functional code and interpreting the parts of it that are there:

  1. Remove all unused prefix definitions
  2. Remove all prefix definitions that are part of the "standard prefixes"
  3. For any specified used prefix definition ensure that there are no full (i.e. not prefixed) URIs referenced in the query by replacing full URIs with the prefixed form
  4. Neatly place all prefixes at the top of the query, one per line
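
As a rough illustration of steps 1, 2 and 4 above, here is a minimal Python sketch. It is not the queryservice-ui code (which is JavaScript): the function name tidy_prefixes, the STANDARD_PREFIXES table and the regex-based parsing are all assumptions made for this example, and a real implementation would want a proper SPARQL parser. Step 3 (rewriting full URIs) is left out of this sketch.

```python
import re

# Matches one declaration such as: PREFIX wbt: <https://example.org/prop/direct/>
# Naive on purpose: it would also match "PREFIX ..." inside a string or comment.
PREFIX_RE = re.compile(r"PREFIX\s+([A-Za-z][\w-]*):\s*<([^>]*)>\s*", re.IGNORECASE)

# Hypothetical stand-in for the prefixes a given query service already assumes.
STANDARD_PREFIXES = {
    "wikibase": "http://wikiba.se/ontology#",
    "bd": "http://www.bigdata.com/rdf#",
}


def tidy_prefixes(query, standard=STANDARD_PREFIXES):
    """Drop unused and standard PREFIX lines; put the rest on top, one per line."""
    declared = dict(PREFIX_RE.findall(query))
    # Strip every declaration, wherever it sits, to get the bare query body.
    body = PREFIX_RE.sub("", query).strip()

    kept = []
    for name, iri in declared.items():
        # Step 1: "used" means the body contains "name:" as a whole token start.
        used = re.search(r"(?<![\w-])" + re.escape(name) + ":", body) is not None
        # Step 2: identical to a prefix the service already provides.
        is_standard = standard.get(name) == iri
        if used and not is_standard:
            kept.append(f"PREFIX {name}: <{iri}>")

    # Step 4: remaining declarations first, one per line, then the body.
    return "\n".join(kept + [body])
```

Applied to the example query from this task, the sketch would drop the unused wb declaration and keep wbt, because wbt:P1 appears in the body. Which prefixes count as "standard" is passed in as a parameter because that set may differ between wikibases.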

Note that the previous logic was designed to take into account that prefixes (and SPARQL in general) can contain, but do not need, new lines. What I believe is a somewhat contrived example of the "old" logic behaving sort of as intended (but still mysteriously failing to keep the used prefix) is this:

< PREFIX wdt: <http://wikidata.org/entity/>
< PREFIX examplea: <https://www.example.com/a/>
< 
< SELECT ?baz WHERE {
<    <https://www.example.com/a/foo> <https://www.example.com/b/bar> ?baz
<  }
< 
---
> SELECT ?baz WHERE { examplea:foo <https://www.example.com/b/bar> ?baz. }

To quickly show how https://github.com/wbstack/queryservice-ui/pull/237 currently adjusts the example query from the user, the result looks like this:

-PREFIX wb: <https://metabase.wikibase.cloud/entity/>
PREFIX wbt: <https://metabase.wikibase.cloud/prop/direct/>
SELECT DISTINCT ?item1 ?item1Label ?item2 ?item2Label ?value WHERE {
  ?item1 wbt:P1 ?value.
  ?item2 wbt:P1 ?value.
  FILTER((?item1 != ?item2) && ((STR(?item1)) < (STR(?item2))))
  SERVICE wikibase:label { bd:serviceParam wikibase:language "sv". }
}

Whereas pre-change to the wbstack fork (and also on wikidata) it behaves as follows:

-PREFIX wb: <https://metabase.wikibase.cloud/entity/>
-PREFIX wbt: <https://metabase.wikibase.cloud/prop/direct/>
SELECT DISTINCT ?item1 ?item1Label ?item2 ?item2Label ?value WHERE {
  ?item1 wbt:P1 ?value.
  ?item2 wbt:P1 ?value.
  FILTER((?item1 != ?item2) && ((STR(?item1)) < (STR(?item2))))
  SERVICE wikibase:label { bd:serviceParam wikibase:language "sv". }
}

Which I think is a good way towards doing what was actually intended by the original code. Of course I think the spec for this may be lost to the sands of time. I notice @Lydia_Pintscher is looking at this and you could maybe say if this looks like what the intended behaviour is.

We intend to get this working for our "fork" but of course it would make sense to upstream this since the behaviour should be common to other wikibases like wikidata. (although with the caveat that "standard prefixes" may differ between wikibases)

Tarrow subscribed.

"For any specified used prefix definition ensure that there are no full (i.e. not prefixed) URIs referenced in the query by replacing full URIs with the prefixed form"

I don't understand what this is saying. Could you maybe provide an example for this behavior?

Certainly! I actually tried to do this with the first example from the status quo.

In this case there was a PREFIX defined at the top of the file for examplea (PREFIX examplea: <https://www.example.com/a/>).

In the body of the query there was also a URI that was prefixed by this examplea prefix; it was the subject of the first statement. (<https://www.example.com/a/foo>).

After running the clean algorithm this full URI was replaced by the prefixed form examplea:foo
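
The replacement described here might be sketched roughly as follows in Python. compact_iris is a hypothetical helper written for this comment, not the function in queryservice-ui, and matching on <...> with a regex is an assumption that would misfire on angle brackets inside string literals.

```python
import re


def compact_iris(body, prefixes):
    """Rewrite <full IRI> as name:local when the IRI lies in a declared namespace."""

    def shorten(match):
        iri = match.group(1)
        # Try longer namespaces first so the most specific prefix wins.
        for name, ns in sorted(prefixes.items(), key=lambda kv: -len(kv[1])):
            local = iri[len(ns):]
            # Only shorten when the remainder is a simple local name.
            if iri.startswith(ns) and re.fullmatch(r"[A-Za-z_][\w-]*", local):
                return f"{name}:{local}"
        return match.group(0)  # No matching namespace: leave the IRI untouched.

    # "<" followed by whitespace (e.g. a FILTER comparison) is not treated as an IRI.
    return re.sub(r"<([^<>\s]+)>", shorten, body)


body = ("SELECT ?baz WHERE { <https://www.example.com/a/foo> "
        "<https://www.example.com/b/bar> ?baz }")
print(compact_iris(body, {"examplea": "https://www.example.com/a/"}))
# prints: SELECT ?baz WHERE { examplea:foo <https://www.example.com/b/bar> ?baz }
```

The second URI stays in full form because no declared prefix covers https://www.example.com/b/.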

@Ifrahkhanyaree_WMDE Hey! In the LOD chat it was claimed that this also might really be a thing in your court so I thought I'd ping you here for some feedback too.

We are (I am) slightly concerned that by reaching outside the cloud team, this apparent "quick win" might stretch into a huge mission that never gets completed. So if you or @Lydia_Pintscher know that you won't be able to think about this in the next day or so, it would be cool to know, and we'll just act unilaterally on our fork. Thanks!

Fring changed the task status from Open to Stalled. Jan 11 2024, 11:11 AM

We're not dealing with anything related to the queryservice-ui right now, and since it's being worked on only by cloud, for cloud, I don't see it as an issue. If there is a larger need to do this for all of LOD at some point in the future, we can jump in and do it. But that's it.
TL;DR: I have nothing to add and no feedback here. Have fun building it!

Tarrow changed the task status from Stalled to Open. Jan 29 2024, 9:36 AM

We decided to just remove all "extra cleaning logic" for wikibase.cloud and use the output from sparql.js directly. This is now reflected here: https://github.com/wbstack/queryservice-ui/pull/237

Fring removed Fring as the assignee of this task. Jan 29 2024, 9:55 AM
Fring moved this task from Doing to In Review on the Wikibase Cloud (Kanban board Q4 2023) board.

For comparison, this new version outputs the linked query like this:

PREFIX wbt: <https://metabase.wikibase.cloud/prop/direct/>
PREFIX wikibase: <http://wikiba.se/ontology#>
PREFIX bd: <http://www.bigdata.com/rdf#>
SELECT DISTINCT ?item1 ?item1Label ?item2 ?item2Label ?value WHERE {
  ?item1 wbt:P1 ?value.
  ?item2 wbt:P1 ?value.
  FILTER((?item1 != ?item2) && ((STR(?item1)) < (STR(?item2))))
  SERVICE wikibase:label { bd:serviceParam wikibase:language "sv". }
}
Tarrow added a subscriber: Charlie_WMDE.

I think we should tell @Charlie_WMDE about this change before shipping it to production.

We've told her! It's all good to go.

Tarrow claimed this task.