
wikibase:GlobecoordinateValue decimal representation not in lexical form in WDQS.
Closed, Resolved · Public


It seems that using shorthand rather than a lexical form for decimal coordinates breaks (xsd schema) validation of the munged/split wikibase turtle dumps. Example:

wdv:d0a7604c8ae9777857887ac4f1807286 a wikibase:GlobecoordinateValue ;
	wikibase:geoLatitude 30.12684 ;
	wikibase:geoLongitude 120.25657 ;
	a wikibase:GeoAutoPrecision ;
	wikibase:geoPrecision 0.00027777777777778 ;
	wikibase:geoGlobe wd:Q2 .

This is a problem for loading this data into Virtuoso, and possibly other triple stores. The geodata decimals are serialized in lexical form if requested directly from wikibase, however.

Event Timeline

Restricted Application added projects: Wikidata, Discovery. Mar 30 2016, 10:07 AM
Restricted Application added a subscriber: Aklapper.

The PRETTY_PRINT setting of the TurtleWriter is set to "true" by default. This causes the writer to only write the literal "label" without the datatype. This affects boolean, decimal, integer and double literals.
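To illustrate why dropping the datatype matters, here is a minimal, self-contained sketch. It does not use the Sesame/Rio API; the class and method names are hypothetical, and the xsd:double datatype in main is only an assumption chosen for illustration. It shows the full typed form of a literal next to the datatype a Turtle parser would assign to the bare label, following the INTEGER, DECIMAL and DOUBLE shorthand rules of the Turtle grammar.

```java
// Illustrative sketch only: TurtleLiteralSketch and its methods are hypothetical
// names, not part of Sesame/Rio or the WDQS code base.
final class TurtleLiteralSketch {

    // Full form: quoted label plus an explicit datatype.
    static String typed(String label, String datatype) {
        return "\"" + label + "\"^^" + datatype;
    }

    // Datatype a Turtle parser assigns to a bare (shorthand) token,
    // following the INTEGER, DECIMAL and DOUBLE productions of the Turtle grammar.
    static String shorthandDatatype(String token) {
        if (token.equals("true") || token.equals("false")) {
            return "xsd:boolean";
        }
        if (token.matches("[+-]?[0-9]+")) {
            return "xsd:integer";
        }
        if (token.matches("[+-]?[0-9]*\\.[0-9]+")) {
            return "xsd:decimal";
        }
        if (token.matches("[+-]?([0-9]+\\.[0-9]*|\\.[0-9]+|[0-9]+)[eE][+-]?[0-9]+")) {
            return "xsd:double";
        }
        return "not a shorthand literal";
    }

    public static void main(String[] args) {
        String label = "30.12684";
        // Assumption for illustration: the value was originally typed xsd:double.
        System.out.println("full:      " + typed(label, "xsd:double"));
        System.out.println("shorthand: " + label
                + " (parsed as " + shorthandDatatype(label) + ")");
    }
}
```

Under these assumptions, a bare 30.12684 is read back as xsd:decimal, so a value that was written with any other datatype does not round-trip unless the datatype is written explicitly.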

To fix this, make the following change (starting at line 623) in

final RDFWriter writer = Rio.createWriter(RDFFormat.TURTLE, lastWriter);
final WriterConfig config = writer.getWriterConfig();
// Disable pretty printing so that typed literals are written with their datatype.
config.set(BasicWriterSettings.PRETTY_PRINT, false);
handler = new PrefixRecordingRdfHandler(writer, prefixes);

Other default config settings are:

config.set(BasicWriterSettings.RDF_LANGSTRING_TO_LANG_LITERAL, true);
config.set(BasicWriterSettings.XSD_STRING_TO_PLAIN_LITERAL, true);

Lydia_Pintscher moved this task from incoming to hold on the Wikidata board. Apr 3 2016, 11:37 AM

Change 284372 had a related patch set uploaded (by Smalyshev):
Set pretty printing to false for RDF writer

Restricted Application added a subscriber: TerraCodes. Apr 19 2016, 10:53 PM

Change 284372 merged by jenkins-bot:
Set pretty printing to false for RDF writer

Smalyshev closed this task as Resolved. May 1 2016, 9:46 PM
Smalyshev triaged this task as Medium priority.