Wikibase/Wikidata and WDQS disagree about statement, reference and value namespace prefixes
Closed, Resolved; Public

Description

Problem

The Wikibase RDF export on Wikidata uses s:, ref: and v: as the namespace prefixes for statement, reference and value nodes respectively:

$ curl -s 'https://www.wikidata.org/wiki/Special:EntityData/Q28726133.ttl?flavor=dump&revision=2163285707' | grep -E '@prefix (s|ref|v):'
@prefix s: <http://www.wikidata.org/entity/statement/> .
@prefix ref: <http://www.wikidata.org/reference/> .
@prefix v: <http://www.wikidata.org/value/> .

However, the Wikidata Query Service instead knows these same prefixes as wds:, wdref: and wdv:, as can be seen in this query:

(P72202 demonstrates that WDQS doesn’t know the prefixes without the wd part and will raise a syntax error if you try to use them.)

This is a problem for Wikibase-Quality-Constraints, because we’d like to use the WikibaseRepo RdfVocabulary service to generate queries that we can send to WDQS without having to define the prefixes in every query we send – but that doesn’t work if the two disagree about what the prefixes are. (And we want to start using v: / wdv: for the new SPARQL queries in T369079.) It would be great if we could reconcile this somehow, to let WBQC keep sending shorter queries, and also just generally to reduce confusion.

Prior history

I think this difference is due to a bug in Wikibase. The prefix generated by Wikibase used to be wdref:, as can be seen in e.g. Wikibase/Indexing/RDF Dump Format/Proposal (linked from change I8651435e8c, which changed NS_REFERENCE from ref: to wdref: way back in April 2015, around the time of the initial query service deployment). Later, in mid-2019, the prefixes were made more configurable for T211799 / T214557; Wikidata was supposed to keep its existing prefixes (I assume – I didn’t find a mention of intentional changes for Wikidata but I confess I haven’t read the whole discussion), while Structured Data on Commons needed different ones.

The configuration mainly consists of two prefix strings (prefix-prefixes?) – the rdfNodeNamespacePrefix, and the rdfPredicateNamespacePrefix. On Wikidata, rdfNodeNamespacePrefix is wd (yielding wd: for entities, wdt: for direct claims, wdno: for novalue etc.), while rdfPredicateNamespacePrefix is the empty string (yielding p:, ps:, psv: etc.); whereas on Commons, rdfNodeNamespacePrefix is sdc (yielding sdc: for MediaInfo entities, sdct:, sdcno: etc.), while rdfPredicateNamespacePrefix is also sdc (yielding sdcp:, sdcps:, sdcpsv: etc.). And I think as part of that massive Wikibase change, we accidentally used the predicate prefix rather than the node prefix for the three prefixes which this task is about, even though all of them are nodes and not predicates (statement nodes, reference nodes, and value nodes). If the node prefix is used, they match WDQS again: wds:, wdref:, and wdv:.
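To make the mix-up concrete, here is a minimal shell sketch of how the two configured prefix strings combine with the fixed suffixes s, ref and v. The derive_prefixes helper and its "buggy"/"fixed" mode argument are hypothetical names used only for illustration; this is not the Wikibase code, just the string concatenation described above.

```shell
#!/bin/sh
# Illustrative only: derive_prefixes is a hypothetical helper, not part of Wikibase.
# Usage: derive_prefixes NODE_PREFIX PREDICATE_PREFIX MODE   (MODE: buggy | fixed)
derive_prefixes() {
    node=$1
    predicate=$2
    mode=$3
    # Statement, reference and value nodes are nodes, so "fixed" uses the node
    # prefix; "buggy" reproduces the mix-up by using the predicate prefix.
    if [ "$mode" = fixed ]; then
        chosen=$node
    else
        chosen=$predicate
    fi
    echo "${chosen}s: ${chosen}ref: ${chosen}v:"
}

derive_prefixes wd '' buggy     # prints: s: ref: v:
derive_prefixes wd '' fixed     # prints: wds: wdref: wdv:
derive_prefixes sdc sdc buggy   # prints: sdcs: sdcref: sdcv:
derive_prefixes sdc sdc fixed   # prints: sdcs: sdcref: sdcv:
```

With the Commons values, both modes give the same result because the node and predicate prefixes are identical there; only a configuration like Wikidata’s, where the two differ, exposes the difference.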

The only previous mention of this issue that I’ve found is T297096, reported by @VladimirAlexiev and partially dismissed by yours truly. The task was about the WBQC RDF export (which is currently unused: T274982); apparently neither of us noticed that the discrepancy also existed with Wikibase itself.

Stability concerns

In T297096#7549331, I wrote that “Prefixes are local to a single RDF document, there’s no requirement to use the same prefix names between different documents as far as I’m aware”. This is half true – in the Wikibase RDF export, I think we’re theoretically free to change the prefixes as we please. I’m sure there are some folks out there who parse the Wikidata RDF dumps with various ill-advised regexes, rather than a proper RDF parser, and who’ve hard-coded the current prefixes (ignoring the @prefix declarations in the output) and who would be broken if we changed the prefixes – but we can follow the usual notification policy to alert them.

But the situation in WDQS is different. Because WDQS allows users to write their SPARQL queries without specifying the standard prefixes (an excellent usability feature – and one that’s implemented in the backend, not in the Wikidata Query UI), changing or removing a prefix runs the risk of breaking existing queries that were relying on that prefix. Adding a new prefix should be possible without breaking any queries, but having two prefixes for the same URI (e.g. s: and wds:) seems unnecessarily confusing. (For the three particular prefixes that this task is concerned with, the number of affected queries is probably relatively low, as these nodes use opaque hashes in the URI and generally wouldn’t be named in a query. But I’m sure there are a few people who query for a hard-coded statement, reference or value node.)
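On the export side, a consumer that looks up the prefix by its namespace URI, instead of hard-coding s: or wds:, should not care which name the document happens to declare. The following is a rough shell sketch of that idea; prefix_for_namespace is a hypothetical helper, and it assumes one @prefix declaration per line, as in the Special:EntityData output shown in the problem description. It is still line-based text processing, so a proper RDF parser remains the more robust choice.

```shell
#!/bin/sh
# Hypothetical helper: print the prefix name that a Turtle document on stdin
# declares for the namespace URI given as the first argument.
# Assumes declarations of the form "@prefix name: <uri> ." on a single line.
prefix_for_namespace() {
    awk -v ns="<$1>" '$1 == "@prefix" && $3 == ns { sub(/:$/, "", $2); print $2 }'
}

# Works the same whether the document declares v: or wdv: for the value namespace.
printf '%s\n' \
    '@prefix wds: <http://www.wikidata.org/entity/statement/> .' \
    '@prefix wdref: <http://www.wikidata.org/reference/> .' \
    '@prefix wdv: <http://www.wikidata.org/value/> .' |
    prefix_for_namespace 'http://www.wikidata.org/value/'
# prints: wdv
```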

Suggested change

For these reasons, I suggest that Wikibase should change these three prefixes to wds:, wdref: and wdv:, to match both its own former output (prior to August 2019) and the Wikidata Query Service. We should treat this as either a significant or breaking change per the stable interface policy (to be decided), announce it in advance, and give users the opportunity to test the new behavior on Test Wikidata first, as per the usual procedure. (There’s no Test Wikidata Query Service, but as the change wouldn’t affect the query service, that should be fine.)

Sketch of RdfVocabulary::__construct() during the transition:
// With the temporary feature flag on, statement, reference and value nodes use the node
// prefix (fixed output, e.g. wds: on Wikidata); with it off, they keep the predicate prefix.
$buggyNodePrefix = $tmpFeatureFlag ? $nodeNamespacePrefix : $predicateNamespacePrefix;
$this->statementNamespaceNames[$repositoryOrSourceName] = [
	self::NS_STATEMENT => $buggyNodePrefix . self::NS_STATEMENT,
	self::NS_REFERENCE => $buggyNodePrefix . self::NS_REFERENCE,
	self::NS_VALUE => $buggyNodePrefix . self::NS_VALUE,
];

Details

Related Changes in Gerrit:
Repo                                          Branch              Lines +/-
operations/mediawiki-config                   master              +0 -10
mediawiki/extensions/Wikibase                 wmf/1.44.0-wmf.19   +21 -68
mediawiki/extensions/Wikibase                 master              +21 -68
operations/mediawiki-config                   master              +1 -7
operations/mediawiki-config                   master              +1 -0
mediawiki/extensions/Wikibase                 wmf/1.44.0-wmf.16   +9 -4
mediawiki/extensions/Wikibase                 wmf/1.44.0-wmf.16   +27 -8
mediawiki/extensions/Wikibase                 wmf/1.44.0-wmf.16   +48 -40
mediawiki/extensions/Wikibase                 wmf/1.44.0-wmf.16   +52 -3
operations/mediawiki-config                   master              +15 -0
mediawiki/extensions/Wikibase                 master              +15 -8
mediawiki/extensions/Wikibase                 master              +9 -4
mediawiki/extensions/Wikibase                 master              +27 -8
mediawiki/extensions/Wikibase                 master              +52 -3
mediawiki/extensions/Wikibase                 master              +48 -40
mediawiki/extensions/WikibaseQualityConstraints  master           +12 -6
mediawiki/extensions/WikibaseQualityConstraints  master           +36 -15

Event Timeline

From the daily: Let's align the Wikibase RDF export on Wikidata with the prefixes used on Wikidata Query Service and we can send a significant change update when we roll it out.

Change #1114003 had a related patch set uploaded (by Lucas Werkmeister (WMDE); author: Lucas Werkmeister (WMDE)):

[mediawiki/extensions/Wikibase@master] Add config option to fix s:, ref:, v: namespace prefix

https://gerrit.wikimedia.org/r/1114003

Change #1118081 had a related patch set uploaded (by Lucas Werkmeister (WMDE); author: Lucas Werkmeister (WMDE)):

[mediawiki/extensions/Wikibase@master] Add config option to make somevalue hashes use URI

https://gerrit.wikimedia.org/r/1118081

Change #1118082 had a related patch set uploaded (by Lucas Werkmeister (WMDE); author: Lucas Werkmeister (WMDE)):

[mediawiki/extensions/Wikibase@master] Make somevalue hashes use URI in tests

https://gerrit.wikimedia.org/r/1118082

Change #1118083 had a related patch set uploaded (by Lucas Werkmeister (WMDE); author: Lucas Werkmeister (WMDE)):

[mediawiki/extensions/Wikibase@master] Fix s:, ref:, v: namespace prefix in tests

https://gerrit.wikimedia.org/r/1118083

Change #1118084 had a related patch set uploaded (by Lucas Werkmeister (WMDE); author: Lucas Werkmeister (WMDE)):

[mediawiki/extensions/Wikibase@master] DNM: Clean up RDF feature flags again

https://gerrit.wikimedia.org/r/1118084

Change #1118120 had a related patch set uploaded (by Lucas Werkmeister (WMDE); author: Lucas Werkmeister (WMDE)):

[mediawiki/extensions/WikibaseQualityConstraints@master] Somewhat improve RdfVocabulary usage in SparqlHelper

https://gerrit.wikimedia.org/r/1118120

Change #1118134 had a related patch set uploaded (by Lucas Werkmeister (WMDE); author: Lucas Werkmeister (WMDE)):

[mediawiki/extensions/Wikibase@master] Improve RdfVocabulary usage in addUnitConversions.php

https://gerrit.wikimedia.org/r/1118134

Change #1118147 had a related patch set uploaded (by Lucas Werkmeister (WMDE); author: Lucas Werkmeister (WMDE)):

[mediawiki/extensions/WikibaseQualityConstraints@master] Correctly use RdfVocabulary in CheckConstraintsRdf

https://gerrit.wikimedia.org/r/1118147

End-of-day update: I think the attached changes are pretty much ready for review, except that the top one (DNM: Clean up RDF feature flags again) should have its commit message adjusted; also, we still need config changes for this. (The patches introduce two separate config options, but the plan would be to set them both together in the production config at the same time: first on Test Wikidata, and then ca. two weeks later on Wikidata.) And we can start to prepare the announcement.

Change #1118484 had a related patch set uploaded (by Lucas Werkmeister (WMDE); author: Lucas Werkmeister (WMDE)):

[operations/mediawiki-config@master] Enable fixed Wikibase RDF on Beta

https://gerrit.wikimedia.org/r/1118484

Change #1118485 had a related patch set uploaded (by Lucas Werkmeister (WMDE); author: Lucas Werkmeister (WMDE)):

[operations/mediawiki-config@master] Enable fixed Wikibase RDF on Test Wikidata

https://gerrit.wikimedia.org/r/1118485

Change #1118486 had a related patch set uploaded (by Lucas Werkmeister (WMDE); author: Lucas Werkmeister (WMDE)):

[operations/mediawiki-config@master] Enable fixed Wikibase RDF everywhere

https://gerrit.wikimedia.org/r/1118486

Change #1118487 had a related patch set uploaded (by Lucas Werkmeister (WMDE); author: Lucas Werkmeister (WMDE)):

[operations/mediawiki-config@master] Remove Wikibase fixed RDF feature flag again

https://gerrit.wikimedia.org/r/1118487

Config changes uploaded; should be fully ready for review now.

Change #1118120 merged by jenkins-bot:

[mediawiki/extensions/WikibaseQualityConstraints@master] Somewhat improve RdfVocabulary usage in SparqlHelper

https://gerrit.wikimedia.org/r/1118120

Change #1118147 merged by jenkins-bot:

[mediawiki/extensions/WikibaseQualityConstraints@master] Correctly use RdfVocabulary in CheckConstraintsRdf

https://gerrit.wikimedia.org/r/1118147

Change #1118081 merged by jenkins-bot:

[mediawiki/extensions/Wikibase@master] Add config option to make somevalue hashes use URI

https://gerrit.wikimedia.org/r/1118081

Change #1118082 merged by jenkins-bot:

[mediawiki/extensions/Wikibase@master] Make somevalue hashes use URI in tests

https://gerrit.wikimedia.org/r/1118082

Change #1114003 merged by jenkins-bot:

[mediawiki/extensions/Wikibase@master] Add config option to fix s:, ref:, v: namespace prefix

https://gerrit.wikimedia.org/r/1114003

Change #1118083 merged by jenkins-bot:

[mediawiki/extensions/Wikibase@master] Fix s:, ref:, v: namespace prefix in tests

https://gerrit.wikimedia.org/r/1118083

Change #1118134 merged by jenkins-bot:

[mediawiki/extensions/Wikibase@master] Improve RdfVocabulary usage in addUnitConversions.php

https://gerrit.wikimedia.org/r/1118134

Change #1118484 merged by jenkins-bot:

[operations/mediawiki-config@master] Enable fixed Wikibase RDF on Beta

https://gerrit.wikimedia.org/r/1118484

Mentioned in SAL (#wikimedia-operations) [2025-02-12T14:06:35Z] <lucaswerkmeister-wmde@deploy2002> Started scap sync-world: Backport for [[gerrit:1118484|Enable fixed Wikibase RDF on Beta (T384344)]]

Mentioned in SAL (#wikimedia-operations) [2025-02-12T14:09:35Z] <lucaswerkmeister-wmde@deploy2002> lucaswerkmeister-wmde: Backport for [[gerrit:1118484|Enable fixed Wikibase RDF on Beta (T384344)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)

Mentioned in SAL (#wikimedia-operations) [2025-02-12T14:17:10Z] <lucaswerkmeister-wmde@deploy2002> Finished scap sync-world: Backport for [[gerrit:1118484|Enable fixed Wikibase RDF on Beta (T384344)]] (duration: 10m 35s)

Change #1119508 had a related patch set uploaded (by Lucas Werkmeister (WMDE); author: Lucas Werkmeister (WMDE)):

[mediawiki/extensions/Wikibase@wmf/1.44.0-wmf.16] Add config option to make somevalue hashes use URI

https://gerrit.wikimedia.org/r/1119508

Change #1119509 had a related patch set uploaded (by Lucas Werkmeister (WMDE); author: Lucas Werkmeister (WMDE)):

[mediawiki/extensions/Wikibase@wmf/1.44.0-wmf.16] Make somevalue hashes use URI in tests

https://gerrit.wikimedia.org/r/1119509

Change #1119510 had a related patch set uploaded (by Lucas Werkmeister (WMDE); author: Lucas Werkmeister (WMDE)):

[mediawiki/extensions/Wikibase@wmf/1.44.0-wmf.16] Add config option to fix s:, ref:, v: namespace prefix

https://gerrit.wikimedia.org/r/1119510

Change #1119511 had a related patch set uploaded (by Lucas Werkmeister (WMDE); author: Lucas Werkmeister (WMDE)):

[mediawiki/extensions/Wikibase@wmf/1.44.0-wmf.16] Fix s:, ref:, v: namespace prefix in tests

https://gerrit.wikimedia.org/r/1119511

Change #1119508 merged by jenkins-bot:

[mediawiki/extensions/Wikibase@wmf/1.44.0-wmf.16] Add config option to make somevalue hashes use URI

https://gerrit.wikimedia.org/r/1119508

Change #1119509 merged by jenkins-bot:

[mediawiki/extensions/Wikibase@wmf/1.44.0-wmf.16] Make somevalue hashes use URI in tests

https://gerrit.wikimedia.org/r/1119509

Change #1119510 merged by jenkins-bot:

[mediawiki/extensions/Wikibase@wmf/1.44.0-wmf.16] Add config option to fix s:, ref:, v: namespace prefix

https://gerrit.wikimedia.org/r/1119510

Change #1119511 merged by jenkins-bot:

[mediawiki/extensions/Wikibase@wmf/1.44.0-wmf.16] Fix s:, ref:, v: namespace prefix in tests

https://gerrit.wikimedia.org/r/1119511

Mentioned in SAL (#wikimedia-operations) [2025-02-13T15:18:02Z] <lucaswerkmeister-wmde@deploy2002> Started scap sync-world: Backport for [[gerrit:1119508|Add config option to make somevalue hashes use URI (T384344)]], [[gerrit:rGERRIT1119509ff0ee|Make somevalue hashes use URI in tests (T384344)]], [[gerrit:1119510|Add config option to fix s:, ref:, v: namespace prefix (T384344)]], [[gerrit:1119511|Fix s:, ref:, v: namespace prefix in tests (T384344)]]

Mentioned in SAL (#wikimedia-operations) [2025-02-13T15:20:49Z] <lucaswerkmeister-wmde@deploy2002> lucaswerkmeister-wmde: Backport for [[gerrit:1119508|Add config option to make somevalue hashes use URI (T384344)]], [[gerrit:rGERRIT1119509ff0ee|Make somevalue hashes use URI in tests (T384344)]], [[gerrit:1119510|Add config option to fix s:, ref:, v: namespace prefix (T384344)]], [[gerrit:1119511|Fix s:, ref:, v: namespace prefix in tests (T384344)]] synced to the testservers (https://wikitech.

Mentioned in SAL (#wikimedia-operations) [2025-02-13T15:29:21Z] <lucaswerkmeister-wmde@deploy2002> Finished scap sync-world: Backport for [[gerrit:1119508|Add config option to make somevalue hashes use URI (T384344)]], [[gerrit:rGERRIT1119509ff0ee|Make somevalue hashes use URI in tests (T384344)]], [[gerrit:1119510|Add config option to fix s:, ref:, v: namespace prefix (T384344)]], [[gerrit:1119511|Fix s:, ref:, v: namespace prefix in tests (T384344)]] (duration: 11m 19s)

Change #1118485 merged by jenkins-bot:

[operations/mediawiki-config@master] Enable fixed Wikibase RDF on Test Wikidata

https://gerrit.wikimedia.org/r/1118485

Mentioned in SAL (#wikimedia-operations) [2025-02-17T14:50:24Z] <lucaswerkmeister-wmde@deploy2002> Started scap sync-world: Backport for [[gerrit:rAPAW111848581ee7|Enable fixed Wikibase RDF on Test Wikidata (T384344)]]

Mentioned in SAL (#wikimedia-operations) [2025-02-17T14:54:52Z] <lucaswerkmeister-wmde@deploy2002> lucaswerkmeister-wmde: Backport for [[gerrit:rAPAW111848581ee7|Enable fixed Wikibase RDF on Test Wikidata (T384344)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)

Mentioned in SAL (#wikimedia-operations) [2025-02-17T15:02:49Z] <lucaswerkmeister-wmde@deploy2002> Finished scap sync-world: Backport for [[gerrit:rAPAW111848581ee7|Enable fixed Wikibase RDF on Test Wikidata (T384344)]] (duration: 12m 24s)

The fixed version is deployed to Test Wikidata:

$ curl -s 'https://test.wikidata.org/wiki/Special:EntityData/Q469.ttl?flavor=dump' | grep -E '@prefix (wd)?(v|ref|s):'
@prefix wds: <http://test.wikidata.org/entity/statement/> .
@prefix wdref: <http://test.wikidata.org/reference/> .
@prefix wdv: <http://test.wikidata.org/value/> .
$ curl -s 'https://www.wikidata.org/wiki/Special:EntityData/Q42.ttl?flavor=dump' | grep -E '@prefix (wd)?(v|ref|s):'
@prefix s: <http://www.wikidata.org/entity/statement/> .
@prefix ref: <http://www.wikidata.org/reference/> .
@prefix v: <http://www.wikidata.org/value/> .

We’re planning to deploy it to Wikidata in two weeks (the announcement should go out later today).

Change #1118486 merged by jenkins-bot:

[operations/mediawiki-config@master] Enable fixed Wikibase RDF everywhere

https://gerrit.wikimedia.org/r/1118486

Mentioned in SAL (#wikimedia-operations) [2025-03-03T14:22:42Z] <lucaswerkmeister-wmde@deploy2002> Started scap sync-world: Backport for [[gerrit:1118486|Enable fixed Wikibase RDF everywhere (T384344)]]

The fixed version is deployed to Wikidata:

$ curl -s 'https://www.wikidata.org/wiki/Special:EntityData/Q42.ttl?flavor=dump' | grep -E '@prefix (wd)?(v|ref|s):'
@prefix wds: <http://www.wikidata.org/entity/statement/> .
@prefix wdref: <http://www.wikidata.org/reference/> .
@prefix wdv: <http://www.wikidata.org/value/> .

Now we can merge the cleanup patches and resume work on T369079.

Change #1118084 merged by jenkins-bot:

[mediawiki/extensions/Wikibase@master] Clean up RDF feature flags again

https://gerrit.wikimedia.org/r/1118084

Change #1125408 had a related patch set uploaded (by Lucas Werkmeister (WMDE); author: Lucas Werkmeister (WMDE)):

[mediawiki/extensions/Wikibase@wmf/1.44.0-wmf.19] Clean up RDF feature flags again

https://gerrit.wikimedia.org/r/1125408

Change #1125408 merged by jenkins-bot:

[mediawiki/extensions/Wikibase@wmf/1.44.0-wmf.19] Clean up RDF feature flags again

https://gerrit.wikimedia.org/r/1125408

Mentioned in SAL (#wikimedia-operations) [2025-03-10T14:19:48Z] <lucaswerkmeister-wmde@deploy2002> Started scap sync-world: Backport for [[gerrit:1125408|Clean up RDF feature flags again (T384344)]]

Mentioned in SAL (#wikimedia-operations) [2025-03-10T14:22:28Z] <lucaswerkmeister-wmde@deploy2002> lucaswerkmeister-wmde: Backport for [[gerrit:1125408|Clean up RDF feature flags again (T384344)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)

Change #1118487 merged by jenkins-bot:

[operations/mediawiki-config@master] Remove Wikibase fixed RDF feature flag again

https://gerrit.wikimedia.org/r/1118487

Mentioned in SAL (#wikimedia-operations) [2025-03-11T13:44:17Z] <lucaswerkmeister-wmde@deploy2002> Started scap sync-world: Backport for [[gerrit:1118487|Remove Wikibase fixed RDF feature flag again (T384344)]]

Mentioned in SAL (#wikimedia-operations) [2025-03-11T13:47:12Z] <lucaswerkmeister-wmde@deploy2002> lucaswerkmeister-wmde: Backport for [[gerrit:1118487|Remove Wikibase fixed RDF feature flag again (T384344)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)

Mentioned in SAL (#wikimedia-operations) [2025-03-11T13:53:49Z] <lucaswerkmeister-wmde@deploy2002> Finished scap sync-world: Backport for [[gerrit:1118487|Remove Wikibase fixed RDF feature flag again (T384344)]] (duration: 09m 31s)