Problem:
A user pointed out on the Project Chat (link, permalink) that the current RDF exports are not valid RDF.
Specifically, running the following code (with rdflib 4.2.2):
import rdflib
G = rdflib.Graph()
G.load('https://www.wikidata.org/wiki/Special:EntityData/Q42.rdf')
produces the error
rdflib.exceptions.ParserError: https://www.wikidata.org/wiki/Special:EntityData/Q42.rdf:5125:2: rdf:nodeID value is not a valid NCName: 3d66a9a972a16b3583effd41e5f2aff4
The RDF/XML specification states that an rdf:nodeID value must have type rdf-id; rdf-id is equivalent to NCName, and an NCName cannot start with a digit. The ID in the error above starts with "3", so the parser rejects it.
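To make the rule concrete, here is a minimal sketch of an NCName check. It assumes an ASCII-only approximation of the NCName production (the real production also allows many non-ASCII characters), and the names `NCNAME_ASCII` and `is_ncname_ascii` are made up for this illustration. This is not how rdflib performs its check.

```python
import re

# ASCII-only approximation of XML NCName (an assumption for brevity):
# the first character is a letter or underscore, the rest may also be
# digits, hyphens, or periods. Colons are never allowed.
NCNAME_ASCII = re.compile(r"[A-Za-z_][A-Za-z0-9_.\-]*")


def is_ncname_ascii(value: str) -> bool:
    """Return True if value matches the simplified NCName pattern."""
    return NCNAME_ASCII.fullmatch(value) is not None


# The ID from the error message starts with a digit, so it is rejected.
print(is_ncname_ascii("3d66a9a972a16b3583effd41e5f2aff4"))   # False
# The same ID with a leading letter passes.
print(is_ncname_ascii("b3d66a9a972a16b3583effd41e5f2aff4"))  # True
```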
Example:
import rdflib
rdflib.Graph().load('https://www.wikidata.org/wiki/Special:EntityData/Q42.rdf?revision=1283437880')
Acceptance criteria:
- Wikidata's RDF output is valid
Notes:
- Coordinate this change with the Query Service team.
- It seems that prepending a letter should fix this issue. See also Lucas' comment.
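
The prepend-a-letter idea could look roughly like the sketch below. The function name `to_node_id` and the prefix `b` are assumptions for illustration only; the real change belongs in the RDF export code and may well use a different prefix.

```python
def to_node_id(hash_value: str, prefix: str = "b") -> str:
    """Prepend a letter so the resulting ID cannot start with a digit.

    Hypothetical helper: the name and the default prefix are illustrative.
    """
    # Require a non-empty, ASCII-letter prefix so the result always
    # begins with a character that may start an NCName.
    if not (prefix.isascii() and prefix.isalpha()):
        raise ValueError("prefix must consist of ASCII letters")
    return prefix + hash_value


print(to_node_id("3d66a9a972a16b3583effd41e5f2aff4"))
# b3d66a9a972a16b3583effd41e5f2aff4
```

The sketch prefixes every ID, not only those that start with a digit, so that two distinct hashes can never map to the same node ID.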