Change entity prefix
Closed, ResolvedPublic

Description

Change the prefix entity: http://www.wikidata.org/entity/ to wd: http://www.wikidata.org/entity/ so that the tool uses the same prefix as suggested by query.wikidata.org

This is a copy from https://github.com/smalyshev/wdq2sparql/issues/3 because nothing seems to be happening on that side and this is wasting time. I'm not sure if it's just a matter of changing https://github.com/smalyshev/wdq2sparql/blob/master/Sparql/Syntax/Wikidata.php

Event Timeline

Multichill raised the priority of this task from to High.
Multichill updated the task description.
Restricted Application added a subscriber: Aklapper.

Sorry, I did not see the issue on github. I'll take a look.

@Multichill the name of the prefix doesn't actually matter much. wd: is chosen for standard RDF dump format for brevity, but any prefix could be used, they don't have to be the same between queries.

I'm perfectly aware of that, but the default namespaces are these by convention:

PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX wikibase: <http://wikiba.se/ontology#>
PREFIX p: <http://www.wikidata.org/prop/>
PREFIX v: <http://www.wikidata.org/prop/statement/>
PREFIX q: <http://www.wikidata.org/prop/qualifier/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

This tool should follow this convention. See also https://www.wikidata.org/wiki/Topic:St6jlfk7zjv00yu9

Smalyshev lowered the priority of this task from High to Medium. (Nov 23 2015, 6:39 PM)

Multichill doesn't say exactly where the change needs to be made, but I presume he's talking about the "exploring linked data" page, on Github.

The entity: prefix is also used by the WDQ-to-SPARQL translator.

Of course it's true that the queries still run fine no matter what labels one chooses to give to the prefixes. But I think Multichill is right that inconsistency is very confusing for newcomers, who are probably encountering these namespaces for the first time. That holds particularly for this namespace, the most fundamental of all, if we sometimes use wd: and sometimes entity: for no good reason.

I agree that it would be good if all usages were standardised to wd:.
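To make the request concrete, here is a minimal sketch of what standardising on wd: amounts to at the text level. It is not the translator's actual code (wdq2sparql is a PHP tool); the function name rename_prefix and the sample query are made up for illustration, and the regex is deliberately naive (it does not skip string literals or comments).

```python
import re


def rename_prefix(query: str, old: str, new: str) -> str:
    """Rename a SPARQL prefix label, e.g. 'entity' -> 'wd'.

    Hypothetical helper, a naive sketch: string literals and
    comments inside the query are not treated specially.
    """
    # Only match the label where it starts a token, so that longer
    # labels ending in the same letters and IRI contents are untouched.
    pattern = re.compile(r"(?<![\w:/#-])" + re.escape(old) + r":")
    return pattern.sub(new + ":", query)


# Made-up example of translator output using the entity: label.
query = (
    "PREFIX entity: <http://www.wikidata.org/entity/>\n"
    "PREFIX wdt: <http://www.wikidata.org/prop/direct/>\n"
    "SELECT ?item WHERE { ?item wdt:P31 entity:Q146 }"
)

print(rename_prefix(query, "entity", "wd"))
```

Only the label changes; the IRI inside the angle brackets stays exactly the same, which is why the query results are unaffected.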

The SPARQL Query Examples page does use wd:

But it is noticeable that it uses v: and q: as the prefixes for http://www.wikidata.org/prop/statement/ and http://www.wikidata.org/prop/qualifier/, whereas the full list of prefixes uses ps: and pq:.

It would be good to standardise on one or the other.

I have to say that I do find many of the prefixes in the full list of prefixes completely unmemorable -- I have to check the list every time, even after two months. But they are consistent, and do match the actual URLs used in the data dump, so even if they are not intuitive (at least, I don't find them so), we probably should be consistent and use them.

So

  • ...q: for /qualifier, ...r: for /reference, and ...s: for /statement are forms of a property p... connecting to a value for a qualifier, a value for a reference, and a value for a statement, respectively
  • ...v: for /value indicates that the property connects not to a simple value, but to a complicated node object

There is a logic there, but repetition and consistency are needed for it to have a hope of becoming familiar.

Though they might in themselves be simpler, the use of alternative forms like v: and q:, which break the pattern, makes it harder to assimilate the whole scheme, and makes it harder for users to find the prefix they want when they later turn to the full list of prefixes for reference.
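The pattern described above can be spelled out mechanically. The following is only a sketch: the constant and function names are made up, and it covers just the p... family under http://www.wikidata.org/prop/, not the whole prefix list.

```python
PROP = "http://www.wikidata.org/prop/"

# Role letter -> path segment: s for statement, q for qualifier, r for reference.
ROLES = {"s": "statement", "q": "qualifier", "r": "reference"}


def build_prefixes() -> dict:
    """Compose the p... prefix labels from the role letters (illustrative only)."""
    prefixes = {"p": PROP}
    for letter, segment in ROLES.items():
        # p + role letter: the property connects to a simple value.
        prefixes["p" + letter] = f"{PROP}{segment}/"
        # Trailing v: the property connects to a full value node instead.
        prefixes["p" + letter + "v"] = f"{PROP}{segment}/value/"
    return prefixes


for label, iri in build_prefixes().items():
    print(f"PREFIX {label}: <{iri}>")
```

Seen this way, ps:, pq: and pr: differ only in the role letter, and psv:, pqv: and prv: add the same v suffix, which is the regularity that v: and q: on their own hide.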

Multichill raised the priority of this task from Medium to High. (Dec 6 2015, 12:09 AM)
Multichill added a subscriber: Tfinc.

I feel a bit ignored here. The only thing that happened is that the priority got lowered. This is not some random toy tool, this is the tool where we direct users from https://query.wikidata.org/ .

"High" priority are the issues that must be fixed before all else, usually because important functionality is missing or broken. As far as I can see, for this particular issue it is not the case. Example of high priority issue: data corruption in WDQS updates, which I was working on lately. Example of an issue which I consider regular priority issue: an exact value of a prefix in a conversion tool. That does not mean it is ignored - that does mean that this issue is in line with others and does not require exceptional urgent treatment. If you do think it is exceptionally urgent please explain the source of the urgency.

@Jheald: being intuitively clear was not the primary concern when the dump prefixes were chosen. They were meant to be consistent and short, so as not to inflate the dump size. The real data always has full IRIs, so you are free to use whatever prefix looks convenient to you. There is absolutely no obligation to use the same prefix scheme as the dump does; it is only a suggestion and a guide for understanding the examples and the dump itself. The full IRI should match, but the prefix scheme can be of your own choosing, and can use more descriptive names than p: if so desired.
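The point that only the full IRI matters can be checked with a small sketch. The expand function below is hypothetical and far from a real SPARQL parser (it assumes PREFIX declarations with IRIs in angle brackets and no colons inside string literals), but it shows two queries with different labels reducing to the same text once prefixed names are replaced by full IRIs.

```python
import re

PREFIX_RE = re.compile(r"PREFIX\s+(\w*):\s*<([^>]*)>", re.IGNORECASE)
PNAME_RE = re.compile(r"(?<![\w:/#<-])(\w*):(\w+)")


def expand(query: str) -> str:
    """Replace prefixed names with full IRIs and drop the PREFIX lines.

    Naive sketch, not a SPARQL parser.
    """
    prefixes = dict(PREFIX_RE.findall(query))
    body = PREFIX_RE.sub("", query)

    def to_iri(match):
        label, local = match.group(1), match.group(2)
        if label not in prefixes:
            return match.group(0)  # unknown label: leave the token alone
        return f"<{prefixes[label]}{local}>"

    # Collapse whitespace so the two results compare cleanly.
    return " ".join(PNAME_RE.sub(to_iri, body).split())


query_a = (
    "PREFIX wd: <http://www.wikidata.org/entity/>\n"
    "PREFIX wdt: <http://www.wikidata.org/prop/direct/>\n"
    "SELECT ?item WHERE { ?item wdt:P31 wd:Q146 }"
)
query_b = (
    "PREFIX entity: <http://www.wikidata.org/entity/>\n"
    "PREFIX direct: <http://www.wikidata.org/prop/direct/>\n"
    "SELECT ?item WHERE { ?item direct:P31 entity:Q146 }"
)

print(expand(query_a) == expand(query_b))
```

This is why the choice of label cannot change query results; the argument for wd: is about consistency for readers and for tools that inspect the query text, not about what the query returns.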

Instead of arguing about priorities you could just fix it. You claimed this task two weeks ago. The current output breaks Listeriabot and that one is used quite a lot.

Instead of arguing about priorities you could just fix it.

Since you enquired about priority, @Smalyshev explained why the priority was lowered. This response from you is combative and not helpful.

Let's stick to the substance of the request, please.

I think what @Multichill meant is that if you've assigned the task to yourself it is expected that you are working on it. By unassigning yourself you allow someone else to try and come up with a patch.

That is separate from setting the priority.

This is a minor request, probably less than 30 minutes to alter and deploy. It's in a private repository, not in Gerrit, so I can't easily push a change myself. Tomorrow this bug will have been open for 3 months, so yes, I'm quite annoyed by this. Is this really the way WMF wants to treat volunteers?

Goodbye

Multichill lowered the priority of this task from High to Lowest. (Dec 7 2015, 4:25 PM)
Multichill unsubscribed.
Qgil raised the priority of this task from Lowest to Medium. (Dec 11 2015, 4:23 PM)
Qgil added subscribers: Multichill, Qgil.

I'll set the priority back to Normal, as @Smalyshev had defined it. In case of doubt, this is how prioritizing tasks works.

@Smalyshev claimed the task on Nov 23. It is OK to have a task assigned for a couple of weeks and even more as long as you have an intention to fix it.

@Multichill, even if you may have reasons to be annoyed, there is no reason to be confrontational. Explaining upfront the reasons why you are annoyed is more effective. A week ago we had a task assigned to a maintainer with Normal priority. Now we have a task up for grabs with Lowest priority, and several people upset for several reasons. Nobody deserves this. Can we go back some squares, please?

I am not a specialist in this topic, but this is what I take from this discussion:

  • WDQS is not following the conventions. This has a potential for added confusion and it is breaking Listeriabot (and I guess therefore might break other bots following conventions?)
  • It seems that the maintainers have an explanation for the current reasoning, but are OK to change it (this is how I interpret the fact that the task was assigned + Normal)
  • The code is not in Gerrit (why?) and therefore the usual contribution path is not applicable.
  • A report was made in September in GitHub but wasn't addressed there.
  • The fix is apparently simple. Perhaps so simple that it could even be a candidate for Google-Code-In-2015? Or am I complicating things too much here?

Anyway, it is easy to see that everybody involved in this discussion is a very good contributor in their areas. Please try to understand each other. Hopefully this task will be solved soon.

@Qgil, thank you for your kind and reasonable words. The tool in question is (was?) not really a part of the official WDQS package; it's just something I slapped together in my free time to enable people to move to SPARQL more easily. At least that's what it was at the start. Now maybe more people use it, so it may be given more status, along with a Gerrit repo, etc. But for now it's on GitHub, where patches are gladly (even if a bit tardily) accepted: https://github.com/smalyshev/wdq2sparql . I am still going to look into it, but I was advised that since I was distracted by higher priority things, it is better to remove the assignment and give others a chance to contribute if they want to, and I think that advice made sense. If that does not happen, I will come back to it - probably really soon now, since I'm done with most of the higher priority things I had on my plate. The fix should be easy; it just takes a bit of time to implement it, test that everything is OK, change the unit tests, etc.

Smalyshev claimed this task.