
geoPrecision exported to RDF as decimal, but is in fact float
Closed, Resolved · Public

Description

The geoPrecision value of GlobeCoordinateValue is exported to RDF as decimal, but it is in fact stored and parsed as a PHP float (which matches the RDF double type). We should export it as double, especially since RDF does not allow scientific notation in decimals.
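To illustrate why scientific notation is a problem here, the following is a minimal sketch that checks a string against the lexical form of xsd:decimal (an optional sign, digits, and an optional fractional part, with no exponent). The helper name `is_valid_xsd_decimal` is hypothetical and is not part of Wikibase; Python is used only for illustration.

```python
import re

# Lexical space of xsd:decimal as defined by XML Schema:
# optional sign, then digits with an optional fraction, or a bare fraction.
# An exponent part ("e-4") is not allowed.
XSD_DECIMAL = re.compile(r"[+-]?(\d+(\.\d*)?|\.\d+)")


def is_valid_xsd_decimal(lexical: str) -> bool:
    """Return True if the string is a legal xsd:decimal lexical form."""
    return XSD_DECIMAL.fullmatch(lexical) is not None


print(is_valid_xsd_decimal("1.0057698125102e-4"))    # False
print(is_valid_xsd_decimal("+0.00010057698125102"))  # True
```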

We should also check for other float values typed as decimal because the format and precision expectations are different for those.

Event Timeline

Restricted Application added a subscriber: Aklapper.
thiemowmde moved this task from incoming to needs discussion or investigation on the Wikidata board.

On geo coordinates, latitude, longitude, and precision can all be stored internally as either integers or floats. We believe this was a mistake when this datatype was designed, because floats aren't entirely stable across languages and platforms. We have already run into issues with fluctuating hash calculations on coordinate values. Quantities, on the other hand, use strings internally.

Could it be that the mapping to decimal was done intentionally to "hide" this mistake?

If it turns out this causes issues it should be fixed, of course.

On coordinates, see T174504. Precision, however, is different. Right now we generate invalid RDF due to a mixup between decimal and float. We could fix it by generating proper decimals, but since we store floats and not strings internally, that would generate inconsistent dumps. OTOH, in general the precision numbers we generate now make little sense - e.g. we have precisions of 0.00010057698125102 and 0.00010057719731271 and 0.00010059051031334 - I have a super hard time believing anybody can differentiate between those and make sense out of them. But this is a separate question from us generating invalid RDF - whatever we use, we should at least keep the RDF valid.
So to make legal RDF we could either do "+0.00010057698125102"^^xsd:decimal or "1.0057698125102e-4"^^xsd:float. Both would work (right now we're doing "1.0057698125102e-4"^^xsd:decimal, which is invalid). For the former, we'd need to import the float into DecimalValue and then print it. Technically possible, but I am not sure whether it is a good way to go. I'd rather bring some order to these precision values before we deal with more refined questions like moving them to decimals.
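The two options can be sketched roughly as follows. This is an illustration in Python, not the Wikibase implementation: the function names are hypothetical, only finite values are handled, and xsd:double is used for the second form on the assumption that it is the closest match for a PHP float (xsd:float would accept the same lexical form).

```python
import math
from decimal import Decimal


def as_xsd_decimal(value: float) -> str:
    """Render a finite float as a typed xsd:decimal literal (no exponent)."""
    if not math.isfinite(value):
        raise ValueError("xsd:decimal cannot represent NaN or infinity")
    # repr() yields the shortest string that round-trips the float;
    # Decimal parses that string exactly, and the "f" format prints it
    # in fixed-point notation, so no exponent appears.
    return f'"{format(Decimal(repr(value)), "f")}"^^xsd:decimal'


def as_xsd_double(value: float) -> str:
    """Render a finite float as a typed xsd:double literal."""
    if not math.isfinite(value):
        raise ValueError("non-finite values are not handled in this sketch")
    # Scientific notation such as "1e-05" is legal for xsd:double.
    return f'"{value!r}"^^xsd:double'


print(as_xsd_decimal(1.0057698125102e-4))  # "0.00010057698125102"^^xsd:decimal
print(as_xsd_decimal(1e-05))               # "0.00001"^^xsd:decimal
print(as_xsd_double(1e-05))                # "1e-05"^^xsd:double
```

The decimal variant still starts from a float, so it inherits the platform-dependent digits discussed above; it only guarantees that the literal is lexically valid.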

Smalyshev raised the priority of this task from Low to Medium. Nov 1 2017, 12:06 AM

I disagree with "low" priority - producing invalid RDF is bad.

Change 387754 had a related patch set uploaded (by Smalyshev; owner: Smalyshev):
[mediawiki/extensions/Wikibase@master] Type GlobeCoordinate values as floats which they are

https://gerrit.wikimedia.org/r/387754

thiemowmde added a subscriber: Jonas.

I approved the patch and will close this ticket for now, assuming this is all that needs to be done. If there is anything else then please reopen this ticket, @Smalyshev. Thanks.

Change 387754 merged by jenkins-bot:
[mediawiki/extensions/Wikibase@master] Type GlobeCoordinate values as floats which they are

https://gerrit.wikimedia.org/r/387754