
v: prefix not correctly prefixed in Wikibase when using entitysource config and extra prefixes
Closed, Resolved | Public | BUG REPORT

Description

Steps to Reproduce:

  • add a statement with property P766 and some date
  • observe wdqs-updater log on the test system via docker logs -f wdqs-updater

Actual Results:

15:59:45.519 [main] INFO  o.w.q.r.t.change.RecentChangesPoller - Got 1 changes, from Q27@105@20200721155942|115 to Q27@105@20200721155942|115
15:59:45.878 [update 3] WARN  org.wikidata.query.rdf.tool.Updater - Contained error syncing.  Giving up on Q27
org.wikidata.query.rdf.tool.exception.ContainedException: RDF parsing error for http://wikidata-federated-properties.wmflabs.org/wiki/Special:EntityData/Q27.ttl?flavor=dump&nocache=1595347185568
	at org.wikidata.query.rdf.tool.wikibase.WikibaseRepository.collectStatementsFromUrl(WikibaseRepository.java:417)
	at org.wikidata.query.rdf.tool.wikibase.WikibaseRepository.fetchRdfForEntity(WikibaseRepository.java:473)
	at org.wikidata.query.rdf.tool.wikibase.WikibaseRepository.fetchRdfForEntity(WikibaseRepository.java:449)
	at org.wikidata.query.rdf.tool.Updater.handleChange(Updater.java:401)
	at org.wikidata.query.rdf.tool.Updater.lambda$fetchDataFromWikibaseAndMunge$7(Updater.java:283)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
Caused by: org.openrdf.rio.RDFParseException: Namespace prefix 'v' used but not defined [line 78]
	at org.openrdf.rio.helpers.RDFParserHelper.reportFatalError(RDFParserHelper.java:440)
	at org.openrdf.rio.helpers.RDFParserBase.reportFatalError(RDFParserBase.java:685)
	at org.openrdf.rio.turtle.TurtleParser.reportFatalError(TurtleParser.java:1405)
	at org.openrdf.rio.helpers.RDFParserBase.getNamespace(RDFParserBase.java:342)
	at org.openrdf.rio.turtle.TurtleParser.parseQNameOrBoolean(TurtleParser.java:1065)
	at org.openrdf.rio.turtle.TurtleParser.parseValue(TurtleParser.java:643)
	at org.openrdf.rio.turtle.TurtleParser.parseObject(TurtleParser.java:527)
	at org.openrdf.rio.turtle.TurtleParser.parseObjectList(TurtleParser.java:453)
	at org.openrdf.rio.turtle.TurtleParser.parsePredicateObjectList(TurtleParser.java:446)
	at org.openrdf.rio.turtle.TurtleParser.parseTriples(TurtleParser.java:409)
	at org.openrdf.rio.turtle.TurtleParser.parseStatement(TurtleParser.java:259)
	at org.openrdf.rio.turtle.TurtleParser.parse(TurtleParser.java:214)
	at org.wikidata.query.rdf.tool.wikibase.WikibaseRepository.collectStatementsFromUrl(WikibaseRepository.java:408)
	... 8 common frames omitted

Expected Results:

PREFIX fpwdt: <http://wikidata.beta.wmflabs.org/prop/direct/>
SELECT ?item ?itemLabel ?pubDate 
WHERE { ?item fpwdt:P766 ?pubDate . }

RDF output from the Wikibase

1@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
2@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
3@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
4@prefix owl: <http://www.w3.org/2002/07/owl#> .
5@prefix wikibase: <http://wikiba.se/ontology#> .
6@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
7@prefix schema: <http://schema.org/> .
8@prefix cc: <http://creativecommons.org/ns#> .
9@prefix geo: <http://www.opengis.net/ont/geosparql#> .
10@prefix prov: <http://www.w3.org/ns/prov#> .
11@prefix wd: <http://wikidata-federated-properties.wmflabs.org/entity/> .
12@prefix wdtdata: <https://wikidata-federated-properties.wmflabs.org/wiki/Special:EntityData/> .
13@prefix wdts: <http://wikidata-federated-properties.wmflabs.org/entity/statement/> .
14@prefix wdtref: <http://wikidata-federated-properties.wmflabs.org/reference/> .
15@prefix wdtv: <http://wikidata-federated-properties.wmflabs.org/value/> .
16@prefix wdt: <http://wikidata-federated-properties.wmflabs.org/prop/direct/> .
17@prefix wdtn: <http://wikidata-federated-properties.wmflabs.org/prop/direct-normalized/> .
18@prefix wdtp: <http://wikidata-federated-properties.wmflabs.org/prop/> .
19@prefix wdtps: <http://wikidata-federated-properties.wmflabs.org/prop/statement/> .
20@prefix wdtpsv: <http://wikidata-federated-properties.wmflabs.org/prop/statement/value/> .
21@prefix wdtpsn: <http://wikidata-federated-properties.wmflabs.org/prop/statement/value-normalized/> .
22@prefix wdtpq: <http://wikidata-federated-properties.wmflabs.org/prop/qualifier/> .
23@prefix wdtpqv: <http://wikidata-federated-properties.wmflabs.org/prop/qualifier/value/> .
24@prefix wdtpqn: <http://wikidata-federated-properties.wmflabs.org/prop/qualifier/value-normalized/> .
25@prefix wdtpr: <http://wikidata-federated-properties.wmflabs.org/prop/reference/> .
26@prefix wdtprv: <http://wikidata-federated-properties.wmflabs.org/prop/reference/value/> .
27@prefix wdtprn: <http://wikidata-federated-properties.wmflabs.org/prop/reference/value-normalized/> .
28@prefix wdno: <http://wikidata-federated-properties.wmflabs.org/prop/novalue/> .
29@prefix fpwd: <http://wikidata.beta.wmflabs.org/entity/> .
30@prefix fpwddata: <https://wikidata-federated-properties.wmflabs.org/wiki/wd:Special:EntityData/> .
31@prefix fpwds: <http://wikidata.beta.wmflabs.org/entity/statement/> .
32@prefix fpwdref: <http://wikidata.beta.wmflabs.org/reference/> .
33@prefix fpwdv: <http://wikidata.beta.wmflabs.org/value/> .
34@prefix fpwdt: <http://wikidata.beta.wmflabs.org/prop/direct/> .
35@prefix fpwdtn: <http://wikidata.beta.wmflabs.org/prop/direct-normalized/> .
36@prefix fpwdp: <http://wikidata.beta.wmflabs.org/prop/> .
37@prefix fpwdps: <http://wikidata.beta.wmflabs.org/prop/statement/> .
38@prefix fpwdpsv: <http://wikidata.beta.wmflabs.org/prop/statement/value/> .
39@prefix fpwdpsn: <http://wikidata.beta.wmflabs.org/prop/statement/value-normalized/> .
40@prefix fpwdpq: <http://wikidata.beta.wmflabs.org/prop/qualifier/> .
41@prefix fpwdpqv: <http://wikidata.beta.wmflabs.org/prop/qualifier/value/> .
42@prefix fpwdpqn: <http://wikidata.beta.wmflabs.org/prop/qualifier/value-normalized/> .
43@prefix fpwdpr: <http://wikidata.beta.wmflabs.org/prop/reference/> .
44@prefix fpwdprv: <http://wikidata.beta.wmflabs.org/prop/reference/value/> .
45@prefix fpwdprn: <http://wikidata.beta.wmflabs.org/prop/reference/value-normalized/> .
46@prefix fpwdno: <http://wikidata.beta.wmflabs.org/prop/novalue/> .
47
48wdtdata:Q27 a schema:Dataset ;
49 schema:about wd:Q27 ;
50 cc:license <http://creativecommons.org/publicdomain/zero/1.0/> ;
51 schema:softwareVersion "1.0.0" ;
52 schema:version "105"^^xsd:integer ;
53 schema:dateModified "2020-07-21T15:59:42Z"^^xsd:dateTime ;
54 wikibase:statements "2"^^xsd:integer ;
55 wikibase:identifiers "0"^^xsd:integer ;
56 wikibase:sitelinks "0"^^xsd:integer .
57
58wd:Q27 a wikibase:Item ;
59 rdfs:label "Berlin"@en ;
60 skos:prefLabel "Berlin"@en ;
61 schema:name "Berlin"@en ;
62 schema:description "German capital"@en ;
63 fpwdt:P213 "0000 0001 1364 8293" ;
64 fpwdt:P766 "2020-02-02T00:00:00Z"^^xsd:dateTime ;
65 fpwdp:P213 wdts:Q27-6213D96C-058B-409D-AB33-DAB801845338 .
66
67wdts:Q27-6213D96C-058B-409D-AB33-DAB801845338 a wikibase:Statement,
68 wikibase:BestRank ;
69 wikibase:rank wikibase:NormalRank ;
70 fpwdps:P213 "0000 0001 1364 8293" .
71
72wd:Q27 fpwdp:P766 wdts:Q27-bcd29466-4da6-0c6b-fbff-d1c734f43535 .
73
74wdts:Q27-bcd29466-4da6-0c6b-fbff-d1c734f43535 a wikibase:Statement,
75 wikibase:BestRank ;
76 wikibase:rank wikibase:NormalRank ;
77 fpwdps:P766 "2020-02-02T00:00:00Z"^^xsd:dateTime ;
78 fpwdpsv:P766 v:a62110495efb9b8ece26646450e49ca6 .
79
80wdtv:a62110495efb9b8ece26646450e49ca6 a wikibase:TimeValue ;
81 wikibase:timeValue "2020-02-02T00:00:00Z"^^xsd:dateTime ;
82 wikibase:timePrecision "11"^^xsd:integer ;
83 wikibase:timeTimezone "0"^^xsd:integer ;
84 wikibase:timeCalendarModel <http://www.wikidata.org/entity/Q1985727> .
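
The parse failure above can be reproduced without the updater. The following is a minimal Python sketch, not what wdqs-updater or Wikibase actually do: it uses regular expressions rather than a real Turtle parser, and the function name `undeclared_prefixes` and the shortened excerpt are assumptions made for illustration. It lists every prefix that is used in a prefixed name but never declared with @prefix, which is the condition the parser reports at line 78.

```python
import re

# "@prefix name: <iri>" declarations.
DECLARED = re.compile(r"@prefix\s+([A-Za-z][\w-]*):\s*<")
# A prefix at the start of a prefixed name such as "wdts:Q27-...".
USED = re.compile(r"(?<![\w:-])([A-Za-z][\w-]*):(?=\w)")
# Full IRIs and string literals, removed first so their contents are not scanned.
NOISE = re.compile(r'<[^>]*>|"(?:[^"\\]|\\.)*"')


def undeclared_prefixes(turtle):
    """Return the set of prefixes that are used but never declared."""
    declared = set(DECLARED.findall(turtle))
    stripped = NOISE.sub(" ", turtle)
    used = set(USED.findall(stripped))
    return used - declared


# Shortened excerpt of the output above (only the prefixes it needs).
FED_PROPS_EXCERPT = """
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix wikibase: <http://wikiba.se/ontology#> .
@prefix wdts: <http://wikidata-federated-properties.wmflabs.org/entity/statement/> .
@prefix wdtv: <http://wikidata-federated-properties.wmflabs.org/value/> .
@prefix fpwdps: <http://wikidata.beta.wmflabs.org/prop/statement/> .
@prefix fpwdpsv: <http://wikidata.beta.wmflabs.org/prop/statement/value/> .

wdts:Q27-bcd29466-4da6-0c6b-fbff-d1c734f43535 a wikibase:Statement ;
 fpwdps:P766 "2020-02-02T00:00:00Z"^^xsd:dateTime ;
 fpwdpsv:P766 v:a62110495efb9b8ece26646450e49ca6 .
"""

print(sorted(undeclared_prefixes(FED_PROPS_EXCERPT)))  # prints ['v']
```

A check like this only catches the case where the stray prefix is undeclared. If v: happens to be declared for another namespace, the document parses and the check reports nothing.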

Event Timeline

same issue on https://eu-invasive-species-federated-properties.wmflabs.org/wiki/Item:Q67

wdqs-updater     | 16:24:54.905 [main] INFO  o.w.q.r.t.change.RecentChangesPoller - Got 1 changes, from Q67@70@20200721162449|70 to Q67@70@20200721162449|70
wikibase         | 172.20.0.5 - - [21/Jul/2020:16:24:54 +0000] "GET /w/api.php?format=json&action=query&list=recentchanges&rcdir=newer&rcprop=title%7Cids%7Ctimestamp&rcnamespace=120%7C122&rclimit=100&continue=&rcstart=2020-07-21T15%3A49%3A04Z HTTP/1.1" 200 711 "-" "Wikidata Query Service Updater Bot"
wikibase         | 172.20.0.5 - - [21/Jul/2020:16:24:54 +0000] "GET /wiki/Special:EntityData/Q67.ttl?flavor=dump&nocache=1595348694967 HTTP/1.1" 200 1552 "-" "Wikidata Query Service Updater Bot"
wdqs-updater     | 16:24:55.180 [update 6] WARN  org.wikidata.query.rdf.tool.Updater - Contained error syncing.  Giving up on Q67
wdqs-updater     | org.wikidata.query.rdf.tool.exception.ContainedException: RDF parsing error for http://eu-invasive-species-federated-properties.wmflabs.org/wiki/Special:EntityData/Q67.ttl?flavor=dump&nocache=1595348694967
wdqs-updater     | 	at org.wikidata.query.rdf.tool.wikibase.WikibaseRepository.collectStatementsFromUrl(WikibaseRepository.java:417)
wdqs-updater     | 	at org.wikidata.query.rdf.tool.wikibase.WikibaseRepository.fetchRdfForEntity(WikibaseRepository.java:473)
wdqs-updater     | 	at org.wikidata.query.rdf.tool.wikibase.WikibaseRepository.fetchRdfForEntity(WikibaseRepository.java:449)
wdqs-updater     | 	at org.wikidata.query.rdf.tool.Updater.handleChange(Updater.java:401)
wdqs-updater     | 	at org.wikidata.query.rdf.tool.Updater.lambda$fetchDataFromWikibaseAndMunge$7(Updater.java:283)
wdqs-updater     | 	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
wdqs-updater     | 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
wdqs-updater     | 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
wdqs-updater     | 	at java.lang.Thread.run(Thread.java:748)
wdqs-updater     | Caused by: org.openrdf.rio.RDFParseException: Namespace prefix 'v' used but not defined [line 78]
wdqs-updater     | 	at org.openrdf.rio.helpers.RDFParserHelper.reportFatalError(RDFParserHelper.java:440)
wdqs-updater     | 	at org.openrdf.rio.helpers.RDFParserBase.reportFatalError(RDFParserBase.java:685)
wdqs-updater     | 	at org.openrdf.rio.turtle.TurtleParser.reportFatalError(TurtleParser.java:1405)
wdqs-updater     | 	at org.openrdf.rio.helpers.RDFParserBase.getNamespace(RDFParserBase.java:342)
wdqs-updater     | 	at org.openrdf.rio.turtle.TurtleParser.parseQNameOrBoolean(TurtleParser.java:1065)
wdqs-updater     | 	at org.openrdf.rio.turtle.TurtleParser.parseValue(TurtleParser.java:643)
wdqs-updater     | 	at org.openrdf.rio.turtle.TurtleParser.parseObject(TurtleParser.java:527)
wdqs-updater     | 	at org.openrdf.rio.turtle.TurtleParser.parseObjectList(TurtleParser.java:453)
wdqs-updater     | 	at org.openrdf.rio.turtle.TurtleParser.parsePredicateObjectList(TurtleParser.java:446)
wdqs-updater     | 	at org.openrdf.rio.turtle.TurtleParser.parseTriples(TurtleParser.java:409)
wdqs-updater     | 	at org.openrdf.rio.turtle.TurtleParser.parseStatement(TurtleParser.java:259)
wdqs-updater     | 	at org.openrdf.rio.turtle.TurtleParser.parse(TurtleParser.java:214)
wdqs-updater     | 	at org.wikidata.query.rdf.tool.wikibase.WikibaseRepository.collectStatementsFromUrl(WikibaseRepository.java:408)
wdqs-updater     | 	... 8 common frames omitted

Highly likely that this relates to https://gerrit.wikimedia.org/r/c/mediawiki/extensions/Wikibase/+/533202
It looks like this is also an issue on Commons RDF output, probably the same bug.

It's harder to spot on Commons because v: is declared there as a Wikidata prefix, so the RDF still parses.
With fed props, v: is never declared, so the RDF becomes invalid.

On commons:

@prefix v: <http://www.wikidata.org/value/> .
@prefix sdcv: <https://commons.wikimedia.org/wiki/Special:EntityData/value/> .

sdcs:M12132133-3690BC73-7B13-418C-AD7B-08D701E4F396 a wikibase:Statement,
		wikibase:BestRank ;
	wikibase:rank wikibase:NormalRank ;
	ps:P571 "2010-11-22T00:00:00Z"^^xsd:dateTime ;
	psv:P571 v:c7adec55d30e3c621b0433327cf7846c .
Addshore renamed this task from "wdqs-updater fails updating item on fedProps test system" to "v: prefix not correctly prefixed in Wikibase when using entitysource config and extra prefixes". Jul 22 2020, 9:25 AM

This is "somehow" affecting Commons as well, even though the RDF on Commons remains "parseable".

Here is an extract of https://commons.wikimedia.org/wiki/Special:EntityData/M211196.ttl?flavor=dump to illustrate the issue:

@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix wikibase: <http://wikiba.se/ontology#> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix schema: <http://schema.org/> .
@prefix cc: <http://creativecommons.org/ns#> .
@prefix geo: <http://www.opengis.net/ont/geosparql#> .
@prefix prov: <http://www.w3.org/ns/prov#> .
@prefix wd: <http://www.wikidata.org/entity/> .
@prefix data: <https://www.wikidata.org/wiki/Special:EntityData/> .
@prefix s: <http://www.wikidata.org/entity/statement/> .
@prefix ref: <http://www.wikidata.org/reference/> .
@prefix v: <http://www.wikidata.org/value/> .
@prefix wdt: <http://www.wikidata.org/prop/direct/> .
@prefix wdtn: <http://www.wikidata.org/prop/direct-normalized/> .
@prefix p: <http://www.wikidata.org/prop/> .
@prefix ps: <http://www.wikidata.org/prop/statement/> .
@prefix psv: <http://www.wikidata.org/prop/statement/value/> .
@prefix psn: <http://www.wikidata.org/prop/statement/value-normalized/> .
@prefix pq: <http://www.wikidata.org/prop/qualifier/> .
@prefix pqv: <http://www.wikidata.org/prop/qualifier/value/> .
@prefix pqn: <http://www.wikidata.org/prop/qualifier/value-normalized/> .
@prefix pr: <http://www.wikidata.org/prop/reference/> .
@prefix prv: <http://www.wikidata.org/prop/reference/value/> .
@prefix prn: <http://www.wikidata.org/prop/reference/value-normalized/> .
@prefix wdno: <http://www.wikidata.org/prop/novalue/> .
@prefix sdc: <https://commons.wikimedia.org/wiki/Special:EntityData/> .
@prefix sdcdata: <https://commons.wikimedia.org/wiki/Special:EntityData/> .
@prefix sdcs: <https://commons.wikimedia.org/wiki/Special:EntityData/statement/> .
@prefix sdcref: <https://commons.wikimedia.org/wiki/Special:EntityData/reference/> .
@prefix sdcv: <https://commons.wikimedia.org/wiki/Special:EntityData/value/> .
@prefix sdct: <https://commons.wikimedia.org/wiki/Special:EntityData/prop/direct/> .
@prefix sdctn: <https://commons.wikimedia.org/wiki/Special:EntityData/prop/direct-normalized/> .
@prefix sdcp: <https://commons.wikimedia.org/wiki/Special:EntityData/prop/> .
@prefix sdcps: <https://commons.wikimedia.org/wiki/Special:EntityData/prop/statement/> .
@prefix sdcpsv: <https://commons.wikimedia.org/wiki/Special:EntityData/prop/statement/value/> .
@prefix sdcpsn: <https://commons.wikimedia.org/wiki/Special:EntityData/prop/statement/value-normalized/> .
@prefix sdcpq: <https://commons.wikimedia.org/wiki/Special:EntityData/prop/qualifier/> .
@prefix sdcpqv: <https://commons.wikimedia.org/wiki/Special:EntityData/prop/qualifier/value/> .
@prefix sdcpqn: <https://commons.wikimedia.org/wiki/Special:EntityData/prop/qualifier/value-normalized/> .
@prefix sdcpr: <https://commons.wikimedia.org/wiki/Special:EntityData/prop/reference/> .
@prefix sdcprv: <https://commons.wikimedia.org/wiki/Special:EntityData/prop/reference/value/> .
@prefix sdcprn: <https://commons.wikimedia.org/wiki/Special:EntityData/prop/reference/value-normalized/> .
@prefix sdcno: <https://commons.wikimedia.org/wiki/Special:EntityData/prop/novalue/> .

sdcdata:M211196 a schema:Dataset ;
	schema:about sdc:M211196 ;
	cc:license <http://creativecommons.org/publicdomain/zero/1.0/> ;
	schema:softwareVersion "1.0.0" ;
	schema:version "427450652"^^xsd:integer ;
	schema:dateModified "2020-06-19T04:58:57Z"^^xsd:dateTime .

sdc:M211196 a wikibase:Mediainfo,
		schema:MediaObject,
		schema:ImageObject ;
	schema:encodingFormat "image/jpeg" ;
	schema:contentUrl <https://upload.wikimedia.org/wikipedia/commons/e/e4/Pillnitz_5.jpg> ;
	schema:contentSize "1330817"^^xsd:integer ;
	schema:height "2048"^^xsd:integer ;
	schema:width "1536"^^xsd:integer ;
	wdt:P571 "2004-07-25T00:00:00Z"^^xsd:dateTime ;
	wdt:P7482 wd:Q66458942 ;
	wdt:P6216 wd:Q50423863 ;
	wdt:P275 wd:Q50829104,
		wd:Q14946043 ;
	p:P571 sdcs:M211196-CB805361-B325-42C5-94D4-113F42BFFA2D .

sdcs:M211196-CB805361-B325-42C5-94D4-113F42BFFA2D a wikibase:Statement,
		wikibase:BestRank ;
	wikibase:rank wikibase:NormalRank ;
	ps:P571 "2004-07-25T00:00:00Z"^^xsd:dateTime ;
	psv:P571 v:0070284a6ac7b1211aa18c76e9073347 .

sdcv:0070284a6ac7b1211aa18c76e9073347 a wikibase:TimeValue ;
	wikibase:timeValue "2004-07-25T00:00:00Z"^^xsd:dateTime ;
	wikibase:timePrecision "11"^^xsd:integer ;
	wikibase:timeTimezone "0"^^xsd:integer ;
	wikibase:timeCalendarModel <http://www.wikidata.org/entity/Q1985727> .

The problematic triple is

sdcs:M211196-CB805361-B325-42C5-94D4-113F42BFFA2D psv:P571 v:0070284a6ac7b1211aa18c76e9073347

It improperly links to a value node in the Wikidata namespace (v: <http://www.wikidata.org/value/>) when it should link to one in the Commons namespace. The proper prefix is used where the value itself is declared: sdcv:0070284a6ac7b1211aa18c76e9073347.

The RDF remains valid but the link between the statement and its value is broken and will be unusable in the query service.
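
The broken link can be detected mechanically by expanding prefixes to full IRIs and comparing value references against declared value nodes. The Python sketch below is an illustration under stated assumptions, not part of Wikibase or the query service: it relies on the layout of the pasted output (subjects at column 0 followed by "a", predicate/object pairs indented, value nodes named by a 32-character hex hash), and the name `dangling_value_links` is hypothetical.

```python
import re

PREFIX = re.compile(r"@prefix\s+([A-Za-z][\w-]*):\s*<([^>]*)>")
# Indented "xxx:P123 yyy:<32 hex chars>" pairs, e.g. "psv:P571 v:0070...".
VALUE_LINK = re.compile(r"^[ \t]+\w+:P\d+[ \t]+(\w+):([0-9a-f]{32})\b", re.M)
# Subjects that open a block at column 0, e.g. "sdcv:0070... a wikibase:TimeValue".
SUBJECT = re.compile(r"^(\w+):(\S+)[ \t]+a[ \t]", re.M)


def dangling_value_links(turtle):
    """Return value IRIs that a statement links to but that are never declared."""
    namespaces = dict(PREFIX.findall(turtle))

    def expand(prefix, local):
        # An undeclared prefix is kept as "prefix:local" so it still shows up.
        return namespaces.get(prefix, prefix + ":") + local

    subjects = {expand(p, local) for p, local in SUBJECT.findall(turtle)}
    links = {expand(p, local) for p, local in VALUE_LINK.findall(turtle)}
    return sorted(links - subjects)


# Shortened excerpt of the Commons output above.
COMMONS_EXCERPT = """
@prefix wikibase: <http://wikiba.se/ontology#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix v: <http://www.wikidata.org/value/> .
@prefix ps: <http://www.wikidata.org/prop/statement/> .
@prefix psv: <http://www.wikidata.org/prop/statement/value/> .
@prefix sdcs: <https://commons.wikimedia.org/wiki/Special:EntityData/statement/> .
@prefix sdcv: <https://commons.wikimedia.org/wiki/Special:EntityData/value/> .

sdcs:M211196-CB805361-B325-42C5-94D4-113F42BFFA2D a wikibase:Statement,
    wikibase:BestRank ;
  wikibase:rank wikibase:NormalRank ;
  ps:P571 "2004-07-25T00:00:00Z"^^xsd:dateTime ;
  psv:P571 v:0070284a6ac7b1211aa18c76e9073347 .

sdcv:0070284a6ac7b1211aa18c76e9073347 a wikibase:TimeValue ;
  wikibase:timeValue "2004-07-25T00:00:00Z"^^xsd:dateTime .
"""

for iri in dangling_value_links(COMMONS_EXCERPT):
    print(iri)  # prints http://www.wikidata.org/value/0070284a6ac7b1211aa18c76e9073347
```

With the link written as sdcv:0070284a6ac7b1211aa18c76e9073347 instead, both sides expand to the same Commons IRI and the function returns an empty list.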

side note: the sdc prefixes pasted here are affected by T258474

Change 615536 had a related patch set uploaded (by Silvan Heintze; owner: Silvan Heintze):
[mediawiki/extensions/Wikibase@master] Fix bug that causes wrong prefixes in RDF output

https://gerrit.wikimedia.org/r/615536

Change 615536 merged by jenkins-bot:
[mediawiki/extensions/Wikibase@master] Fix bug that causes wrong prefixes in RDF output

https://gerrit.wikimedia.org/r/615536

Change 615443 had a related patch set uploaded (by DCausse; owner: Silvan Heintze):
[mediawiki/extensions/Wikibase@wmf/1.36.0-wmf.1] Fix bug that causes wrong prefixes in RDF output

https://gerrit.wikimedia.org/r/615443

Change 615443 merged by jenkins-bot:
[mediawiki/extensions/Wikibase@wmf/1.36.0-wmf.1] Fix bug that causes wrong prefixes in RDF output

https://gerrit.wikimedia.org/r/615443

Mentioned in SAL (#wikimedia-operations) [2020-07-23T11:31:53Z] <dcausse@deploy1001> Synchronized php-1.36.0-wmf.1/extensions/Wikibase: T258507: Fix bug that causes wrong prefixes in RDF output (duration: 01m 11s)

Change 615445 had a related patch set uploaded (by Addshore; owner: Silvan Heintze):
[mediawiki/extensions/Wikibase@REL1_35] Fix bug that causes wrong prefixes in RDF output

https://gerrit.wikimedia.org/r/615445

I'm also backporting this to the 1.35 branch

Change 615445 merged by jenkins-bot:
[mediawiki/extensions/Wikibase@REL1_35] Fix bug that causes wrong prefixes in RDF output

https://gerrit.wikimedia.org/r/615445