
Decide which prefixes to use for MediaInfo RDF
Closed, Resolved (Public)

Description

Right now the code that generates RDF for Wikibase uses hardcoded prefixes for entities and other nodes - wdata:, wd:, wdt:, etc. However, this may be confusing when used with MediaInfo/Commons, for several reasons:

  1. MediaInfo entities (M1234) are not Wikidata, and wd: kind of implies Wikidata.
  2. If the wd: prefix is bound to the Commons entity URI (http://commons.wikimedia.org/entity/), it cannot be used for Wikidata entities, because these use http://www.wikidata.org/entity/. Writing queries that use Wikidata items will then be confusing, as another prefix has to be used for those.
  3. Federated queries against the Wikidata data set would be even more confusing, since it would not be clear which wd: means what where.
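
To make points 2 and 3 concrete, here is a minimal Python sketch of how a prefix map resolves prefixed names. It is only an illustration, not the Wikibase RDF code: the `expand` helper, the item ID `Q42` and the `sdc:` binding are assumptions chosen for the example, while the two namespace URIs and `M1234` are the ones quoted above.

```python
# Namespace URIs as quoted in the task description.
WIKIDATA_ENTITY = "http://www.wikidata.org/entity/"
COMMONS_ENTITY = "http://commons.wikimedia.org/entity/"


def expand(prefixed_name, prefixes):
    """Expand a prefixed name such as 'wd:Q42' into a full URI (hypothetical helper)."""
    prefix, _, local = prefixed_name.partition(":")
    if prefix not in prefixes:
        raise KeyError(f"undeclared prefix: {prefix}")
    return prefixes[prefix] + local


# Case 1: wd: is bound to the Commons namespace.
# A prefix maps to exactly one namespace, so wd:Q42 no longer means a Wikidata item.
commons_only = {"wd": COMMONS_ENTITY}
mediainfo_uri = expand("wd:M1234", commons_only)
misresolved_item_uri = expand("wd:Q42", commons_only)

# Case 2: a separate prefix for MediaInfo entities (sdc: is an assumed name here).
# Both kinds of entity can now be written side by side without ambiguity.
separate = {"wd": WIKIDATA_ENTITY, "sdc": COMMONS_ENTITY}
item_uri = expand("wd:Q42", separate)
mediainfo_uri_sdc = expand("sdc:M1234", separate)
```

In the first case `wd:Q42` silently lands in the Commons namespace, which is the confusion described above; in the second case each prefix has a single, unambiguous meaning.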

The same is true in some measure for the other prefixes too, but wd: is the most important one: Commons does not currently have its own properties, so all predicates can still use the Wikidata ones, and the wds: prefix, which is used with statements, is rarely used in queries (though it needs to be taken into account too).

Thus, we have the following options:

A. Leave everything as is and try to deal with the confusion by education and writing smarter queries, etc.

B. Choose different prefixes for SDC/MediaInfo, e.g. sdc:, sdcdata:, etc. Not sure whether it makes sense to also change the statement and value/reference ones, since it is not very likely they'd refer to each other, and Commons is not even using them currently.

C. Use a prefix-suffix system, like c-wd:, for all Commons prefixes.

D. Something else?

Note that this is a distinct issue from T222306 - we know which full URI will be in the database, but this is the question of how it would be represented in export formats that use prefixes to be human-readable - like TTL - and how the queries and query service configuration (which includes default prefixes) would be set up.

Also note that, strictly speaking, prefixes in the TTL format do not need to match prefixes in queries (they are internally resolved to full URIs anyway), so we could take different approaches in different areas, though it would again be somewhat confusing if we take this road.
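
As a rough illustration of that last point, the sketch below resolves the same entity through two different prefix maps, one standing in for a TTL export and one for a query. The `resolve` helper and the choice of `sdc:` for the export and `c-wd:` for the query are assumptions made for the example; only the Commons namespace URI and `M1234` come from the description above.

```python
COMMONS_ENTITY = "http://commons.wikimedia.org/entity/"


def resolve(prefixed_name, prefixes):
    """Resolve a prefixed name to its full URI using the given prefix map (hypothetical helper)."""
    prefix, _, local = prefixed_name.partition(":")
    return prefixes[prefix] + local


# Assumed prefix declarations: one for the TTL export, one for the query side.
ttl_prefixes = {"sdc": COMMONS_ENTITY}
query_prefixes = {"c-wd": COMMONS_ENTITY}

uri_in_export = resolve("sdc:M1234", ttl_prefixes)
uri_in_query = resolve("c-wd:M1234", query_prefixes)

# Both spellings denote the same node once the prefixes are expanded.
same_node = uri_in_export == uri_in_query
```

Matching happens on the expanded URIs, so the two sides would interoperate; the cost is purely on the human reader, who has to remember two spellings for one namespace.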

Event Timeline

I think we should rule out option A - it's just too confusing, and trying to educate our way around something inherently confusing seems like a waste of time.

It seems to me that, for now at least, all we have to do is choose a new prefix for the MediaInfo items themselves. To me it seems most sensible to create a new one rather than something containing wd.

I would vote for either option B (sdc:, sdcdata:) or option C (c-wd: ).

Tpt added a comment. May 13 2019, 6:09 PM

Option B seems the most usable to me and the most consistent. Prefixes like "c-wd" have the disadvantage of still having "wikidata" in them, which is imho quite confusing.

Looks like the preference is leaning towards sdc: - which is also my preference.

Smalyshev moved this task from Backlog to Doing on the User-Smalyshev board. May 19 2019, 10:07 AM
Smalyshev triaged this task as Normal priority.
Smalyshev moved this task from Doing to Done on the User-Smalyshev board. May 23 2019, 3:47 PM
Smalyshev closed this task as Resolved. Mon, Jun 10, 10:34 PM
Smalyshev claimed this task.