
Coordinates are exported into RDF with excessive precision
Closed, Resolved, Public

Description

When coordinates are exported into RDF, they are represented with many more digits than the precision allows. E.g., the coordinate for https://www.wikidata.org/wiki/Q116746, with precision specified as "arcseconds", or 31m, is exported as Point(13.366666666667 41.766666666667) - 12 digits, or sub-millimeter precision. It should be exported as Point(13.3667 41.7667) instead.
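To illustrate the expected output, here is a minimal Python sketch that derives the number of decimal places from the stated precision and formats the point accordingly. The function names (`digits_for_precision`, `format_point`) and the rounding rule (ceiling of the negative base-10 logarithm of the precision in degrees) are assumptions made for illustration; this is not taken from the Wikibase code.

```python
import math


def digits_for_precision(precision_deg: float) -> int:
    """Decimal places needed to express a value at the given precision (degrees)."""
    if precision_deg <= 0:
        raise ValueError("precision must be positive")
    # 1/3600 degree (one arcsecond) is about 0.00028, which needs 4 decimals.
    return max(0, math.ceil(-math.log10(precision_deg)))


def format_point(lon: float, lat: float, precision_deg: float) -> str:
    """Format a WKT-style point, keeping only the digits the precision justifies."""
    digits = digits_for_precision(precision_deg)
    return f"Point({lon:.{digits}f} {lat:.{digits}f})"


ARCSECOND = 1 / 3600  # precision of the example item, in degrees

print(format_point(13.366666666667, 41.766666666667, ARCSECOND))
# Point(13.3667 41.7667)
```

With arcsecond precision the sketch keeps four decimal places, which matches the output requested above.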

Event Timeline

Restricted Application added a project: Discovery. Aug 29 2017, 9:22 PM
Restricted Application added a subscriber: Aklapper.
thiemowmde triaged this task as Low priority. Dec 6 2017, 11:41 PM
thiemowmde added a subscriber: thiemowmde.

I can see that this situation tends to be confusing and could use some improvement, especially UX-wise. But this is not an issue specific to the RDF export or the Wikidata-Query-Service. These numbers are just how the coordinates are stored internally, and I don't think we can or even should change anything about this. Most of the coordinates are submitted via the API. If the submitted coordinate was just 13.366666666667, why should we truncate that?

Mostly because most of these digits do not represent any real data. They are just junk, produced by a decimal representation with overly high precision and by various conversions and calculations. We're just dragging around meaningless characters that have no use and do not represent any data. Nobody really measured that coordinate with micron precision and got 13.366666666667. What most probably happened is that it was measured in another system, then a calculation involved 40.1/3 (probably when converting degrees and minutes to decimal), and the result came out as 13.366666666667. And if we then convert back, we'd get 40.100000000001: again, junk data in the last 11 decimal places.
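The round trip described above can be checked with exact decimal arithmetic. This is only a sketch of the arithmetic in the comment; the choice of 12 decimal places mirrors the exported value and is not a claim about how the value was actually produced.

```python
from decimal import Decimal

# Divide 40.1 by 3 and keep 12 decimal places, as in the exported coordinate.
stored = (Decimal("40.1") / 3).quantize(Decimal("1e-12"))
print(stored)  # 13.366666666667

# Converting back multiplies the rounding error by 3.
restored = stored * 3
print(restored)  # 40.100000000001
```

The trailing digits in both values come from rounding alone, not from any measurement.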

I can follow all your arguments. It's just that I think the effect of this (actually well defined) behavior on users is really, really negligible. Most users are never going to see coordinates as numbers anyway, but as dots or shapes on maps.

And even if they do, which user will think of sub-millimeters when they see a representation like 13.366666666667? Especially when the object is a city, or any larger shape. Most users don't even know what 1 degree is in meters or miles.

That said, I agree this could be improved, and I even have an actual suggestion I want to implement some day, either in the RDF export or somewhere deeper in the Wikibase code base: Basically, cut off decimal places that do not have any effect on any of the output formats we support. This algorithm should consider all output formats, because when such an algorithm is applied we don't know which output format will be used.

But this idea requires coordinates to be stored as strings, which they are not. Basically, this requires a new datatype.
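A rough Python sketch of that idea: find the smallest number of decimal places at which every supported output format still renders the same text. The two formatters (`to_decimal`, `to_dms`) and the function `minimal_digits` are hypothetical stand-ins invented for this illustration, not Wikibase formatters, and the sketch works on floats for brevity, so it does not address the storage question raised above.

```python
from typing import Callable, Sequence


def to_decimal(value: float) -> str:
    """Hypothetical output format: decimal degrees with 4 places."""
    return f"{value:.4f}"


def to_dms(value: float) -> str:
    """Hypothetical output format: degrees, minutes, whole arcseconds."""
    sign = "-" if value < 0 else ""
    total_seconds = round(abs(value) * 3600)
    degrees, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{sign}{degrees}°{minutes}'{seconds}\""


def minimal_digits(value: float,
                   formatters: Sequence[Callable[[float], str]],
                   max_digits: int = 15) -> int:
    """Smallest number of decimals that leaves every formatter's output unchanged."""
    expected = [fmt(value) for fmt in formatters]
    for digits in range(max_digits + 1):
        candidate = round(value, digits)
        if [fmt(candidate) for fmt in formatters] == expected:
            return digits
    return max_digits


formatters = [to_decimal, to_dms]
digits = minimal_digits(13.366666666667, formatters)
print(digits, round(13.366666666667, digits))
```

Because the check runs against all formatters at once, the result is safe regardless of which output format is requested later.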

Lydia_Pintscher moved this task from incoming to monitoring on the Wikidata board. Mar 5 2018, 4:15 PM

Now this bug has its own xkcd: https://xkcd.com/2170/

Smalyshev moved this task from Backlog to Next on the User-Smalyshev board. Jul 10 2019, 10:39 PM

Change 521984 had a related patch set uploaded (by Smalyshev; owner: Smalyshev):
[mediawiki/extensions/Wikibase@master] Format coordinates with limited precision

https://gerrit.wikimedia.org/r/521984

Smalyshev moved this task from Next to In review on the User-Smalyshev board. Jul 11 2019, 6:46 AM

Change 521984 merged by jenkins-bot:
[mediawiki/extensions/Wikibase@master] Format coordinates with limited precision

https://gerrit.wikimedia.org/r/521984

Smalyshev moved this task from In review to Done on the User-Smalyshev board.
Smalyshev closed this task as Resolved. Thu, Aug 15, 5:50 AM