
Provide more useful redirects for property predicates (wdt:…, p:…, etc.)
Open, Needs Triage, Public

Description

As a generic RDF user, I want to be able to get RDF data from URIs used in Wikidata’s RDF export, without requiring any Wikidata-specific knowledge. (This was reported on Project Chat by User:Jmvanel, see What is the URL of the RDF representation a property?.)

Problem:
Our property predicate URIs (/prop/P279, /prop/direct/P279, etc.) all redirect to /wiki/Property:P279, which is the wiki page for the property and not the content negotiation endpoint. It would be more useful for generic RDF users if they redirected to the /entity/ URI.

BDD
GIVEN I am a generic RDF user or tool
WHEN I follow the URI for a predicate used by Wikidata
AND I send an HTTP header like Accept: text/turtle
THEN I am eventually redirected to the Turtle representation of the corresponding property’s data
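
The scenario above could be checked with a small script along these lines. This is only a sketch using the Python standard library: the helper names (`turtle_request`, `final_response_info`, `looks_like_turtle`) and the User-Agent string are made up for illustration, and `final_response_info` needs network access.

```python
import urllib.request
from typing import Tuple


def turtle_request(uri: str) -> urllib.request.Request:
    """Build a GET request that asks for Turtle via content negotiation."""
    headers = {
        "Accept": "text/turtle",
        # Hypothetical client name; any descriptive User-Agent would do.
        "User-Agent": "rdf-redirect-check/0.1 (example)",
    }
    return urllib.request.Request(uri, headers=headers)


def final_response_info(uri: str) -> Tuple[str, str]:
    """Follow redirects and return (final URL, Content-Type). Needs network."""
    with urllib.request.urlopen(turtle_request(uri), timeout=30) as resp:
        return resp.geturl(), resp.headers.get("Content-Type", "")


def looks_like_turtle(content_type: str) -> bool:
    """True if a Content-Type value denotes Turtle (parameters are ignored)."""
    return content_type.split(";")[0].strip().lower() == "text/turtle"
```

Calling `final_response_info("http://www.wikidata.org/prop/direct/P279")` and passing the returned content type to `looks_like_turtle` would be one way to express the THEN step; according to the problem statement above, the final URL currently lands on the Property: wiki page instead.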

Acceptance criteria:

  • There are no more redirects directly into the Property: namespace.

Open questions:

  • Redirect to /entity/ (wd:) or to /wiki/Special:EntityData (wdata:)? I guess /entity/ makes more sense.

Event Timeline

Is there a test instance where the latest developments are applied?
Is the fix expected to take long?

NOTE: I'll be testing that the properties' labels appear in my generic RDF tool: http://semantic-forms.cc:9112/search?q=http%3A%2F%2Fwww.wikidata.org%2Fentity%2FQ10675206&clas=

I want to report a similar problem; may I use "Create task (simple)" (above)?

The proposed change is a useful workaround,
but in principle, the RDF data retrieved by content negotiation from an entity URI should carry the reference property URIs, not secondary ones.
By the way, no other RDF dataset I know of has several URIs for the same property.

Three months later, nothing has happened.
For RDF users, this is really blocking.

Not only does it cause generic tools to display Wikidata RDF badly (for example Q625994, the "conference" entity):
http://semantic-forms.cc:1952/display?displayuri=http%3A%2F%2Fwww.wikidata.org%2Fentity%2FQ625994

but it also shows some misunderstanding of what RDF and LOD are about.
It's all about stable URIs, and having several URIs for a single property is bad.

Concretely, the RDF triples returned by Wikidata for this "conference" entity:
http://semantic-forms.cc:1952/download?url=http%3A%2F%2Fwww.wikidata.org%2Fentity%2FQ625994&syntax=Turtle
use property URIs such as
http://www.wikidata.org/prop/P279
that is, using prefix

@prefix p: <http://www.wikidata.org/prop/> .

when it should be http://www.wikidata.org/entity/P279.

For RDF users, this is really blocking.

Any RDF user that really cares specifically about Wikidata can trivially unblock this on their end by adjusting URLs as necessary, loading the URL http://www.wikidata.org/entity/P279 (or directly https://www.wikidata.org/wiki/Special:EntityData/P279, the content negotiation endpoint) when encountering a URI like http://www.wikidata.org/prop/…/P279.
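
As an illustration of that client-side adjustment, a rewrite along the following lines might be enough. This is a sketch, not part of Wikibase: the function name `entity_uri_for_predicate` and the regular expression are assumptions, and the pattern simply treats any lowercase path segments between /prop/ and the property ID as a predicate family.

```python
import re
from typing import Optional

# Matches e.g. /prop/P279, /prop/direct/P279, /prop/statement/value/P279.
# Assumption: predicate families use lowercase letters and hyphens only.
_PREDICATE_URI = re.compile(
    r"^https?://www\.wikidata\.org/prop/(?:[a-z-]+/)*(P[1-9][0-9]*)$"
)


def entity_uri_for_predicate(uri: str) -> Optional[str]:
    """Map a property predicate URI to the property's /entity/ URI.

    Returns None if the URI is not a Wikidata property predicate URI.
    """
    match = _PREDICATE_URI.match(uri)
    if match is None:
        return None
    return "http://www.wikidata.org/entity/" + match.group(1)
```

A generic tool would call this before dereferencing a predicate and fall back to the original URI when the result is `None`.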

It's all about stable URIs, and having several URIs for a single property is bad.

Wikibase properties are not directly RDF properties (as far as I understand). Instead, each Wikibase property corresponds to several predicates, each of which is its own RDF property (either owl:ObjectProperty or owl:DatatypeProperty): wdt:P279 is the predicate linking a subject to the best (simple) value for that (Wikibase) property, p:P279 is the predicate linking a subject to all statement nodes for the statements with that (Wikibase) property, ps:P279 is the predicate linking from a statement node to the (simple) value for that (Wikibase) property, etc. (see the documentation for the full list).
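
To make the one-to-many relationship concrete, here is a small sketch that expands the three prefixes mentioned above for a given property ID. The names `PREFIXES`, `expand` and `predicates_for` are invented for this example, and the table is deliberately limited to the prefixes named in this comment; the documentation has the full list.

```python
# Namespace IRIs for the three predicate families discussed above.
PREFIXES = {
    "wdt": "http://www.wikidata.org/prop/direct/",
    "p": "http://www.wikidata.org/prop/",
    "ps": "http://www.wikidata.org/prop/statement/",
}


def expand(prefixed_name: str) -> str:
    """Expand a prefixed name such as 'wdt:P279' into a full URI."""
    prefix, _, local = prefixed_name.partition(":")
    return PREFIXES[prefix] + local


def predicates_for(property_id: str) -> dict:
    """Return the predicate URI of each family for one Wikibase property."""
    return {prefix: base + property_id for prefix, base in PREFIXES.items()}
```

For example, `predicates_for("P279")` yields three distinct RDF properties that all derive from the single Wikibase property P279.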

I can well understand that Wikibase has several URIs for the same property (or closely related ones).
Wikibase is a special-purpose database in its own right, but regarding the RDF view offered by the Blazegraph server, there is indeed a problem with the coherence of the database.
Again, when there is a triple:

<s> <p> <o> .

Any RDF user expects to use the same <p> URI to access the vocabulary (ontology) information about domains, ranges, etc.
It can happen that there are several <o> values, because RDF is inherently multi-valued, but that is not a problem.

I checked that there is no coherence problem between what SPARQL outputs and what the entity URIs output.
Consider this simple query:

PREFIX wikidata: <http://www.wikidata.org/entity/>
SELECT * WHERE {
  wikidata:Q625994 ?pred ?obj .
}

Run it in YasGUI (generic tool)

One obtains predicates like
http://www.wikidata.org/prop/direct/P279

And using the entity URI with RDF content negotiation, one gets the same predicate.

wget --header='Accept: text/turtle' http://www.wikidata.org/entity/Q625994

But both should be what I understand as the "generic" URI in Wikidata:
http://www.wikidata.org/entity/P279

Or, as another solution,
http://www.wikidata.org/prop/direct/P279
can be considered the "generic" URI in Wikidata, and it gives access to the ontology information.

But /prop/direct/P279 gives no RDF, just HTML:

wget --header='Accept: text/turtle' http://www.wikidata.org/prop/direct/P279

Only http://www.wikidata.org/entity/P279 works this way (with RDF content negotiation).

But /prop/direct/P279 gives no RDF, just HTML

Yes, that is what this task is about, see the task description.

Sure, the planned action will be progress, when it happens...

But from an RDF point of view, the mapping from Wikibase to RDF is not satisfying.
Wikibase can have the complexity that it needs for its purposes, but RDF is about simplicity and unique, stable URIs.

It is now 1.5 years since the original request.
This is probably a few lines to change.
I wonder what is preventing it from being done.
This will make Wikidata much more discoverable for Semantic Web tools.