Kartographer should display a single node
Closed, ResolvedPublic

Description

With service=geoshape + ids=..., maplink and mapframe can display Wikidata items, but this does not work for a single node.
Kartographer should handle the 2 cases:

Event Timeline

Note: if the Wikidata item both has a P625 (coordinate location) property and is referred to by an OSM object, we must define a priority rule.
To be discussed, but I think the following is sensible:
if the OSM object is a node, the Wikidata location is used (no need to use external data for an equivalent level of detail). Otherwise, the OSM object is preferred (it should be more detailed).
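
The proposed priority rule can be sketched in a few lines. This is only an illustration of the rule as stated above, not Kartographer code: the function name `choose_geometry_source` and its two parameters are hypothetical, and the behaviour when only one source exists is an assumption.

```python
from typing import Optional


def choose_geometry_source(has_p625: bool, osm_type: Optional[str]) -> Optional[str]:
    """Pick where the geometry for a Wikidata item should come from.

    has_p625: the item has a P625 (coordinate location) statement.
    osm_type: "node", "way" or "relation" if an OSM object refers to
              the item, or None if no OSM object does.
    """
    if osm_type is None:
        # Assumption: with no OSM object, P625 is the only candidate.
        return "wikidata" if has_p625 else None
    if osm_type == "node" and has_p625:
        # Same level of detail on both sides: no need for external data.
        return "wikidata"
    # Ways and relations should be more detailed than a single point.
    # Assumption: an OSM node is also used when the item has no P625.
    return "osm"
```

With this rule, an item with P625 that is tagged on an OSM node resolves to "wikidata", while the same item tagged on a way or relation resolves to "osm".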

By a single node, I believe you mean displaying a marker (point of interest) from a Wikidata ID.
I agree this is a great feature.

I think there has been some discussion between @Yurik and @Smalyshev around caching and server capacity in order to enable this feature. Maybe they can talk about the technical implications here.

Pragmatically speaking, there's no need to use WDQS for this use case (get primary coordinates for a single WD id), Kartotherian can just request it from WD directly.
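
As a rough sketch of what "request it from WD directly" could look like, the snippet below builds a `wbgetclaims` request for P625 and turns the JSON response into a GeoJSON Point. The helper names `build_claims_url` and `first_point` are made up for illustration, and taking the first claim without looking at statement ranks is a simplifying assumption. No network call is made here; fetching the URL is left to the caller.

```python
from typing import Optional
from urllib.parse import urlencode

WIKIDATA_API = "https://www.wikidata.org/w/api.php"


def build_claims_url(qid: str) -> str:
    """Return the wbgetclaims URL asking only for P625 of one item."""
    params = {
        "action": "wbgetclaims",
        "entity": qid,
        "property": "P625",
        "format": "json",
    }
    return f"{WIKIDATA_API}?{urlencode(params)}"


def first_point(response: dict) -> Optional[dict]:
    """Convert the first P625 claim with a value into a GeoJSON Point.

    Simplification: statement ranks are ignored.
    """
    for claim in response.get("claims", {}).get("P625", []):
        snak = claim.get("mainsnak", {})
        if snak.get("snaktype") != "value":
            continue  # skip "unknown value" / "no value" statements
        value = snak["datavalue"]["value"]
        # GeoJSON orders coordinates as [longitude, latitude].
        return {
            "type": "Point",
            "coordinates": [value["longitude"], value["latitude"]],
        }
    return None
```

Because only one entity and one property are requested, this stays a single cheap API call per ID and does not involve WDQS.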

True, but in that case we don't support the fallback to OSM database.

I'm a bit unsure whether there should be node coordinate support in maps. Our main use case is to prevent significant data duplication, by reducing complex geometries (e.g. an outline of a country, city, or a river) to a single Wikidata ID, or even better - to a SPARQL query that gets that ID. So we prevent duplication by getting the geometry from OSM. So - OSM and .map datasets store geometries, while Wikidata and .map datasets can store data points. Note that we already have a limitation - OSM can only provide geometries, not the associated tags like names, population, etc. -- all that data can only come from Wikidata. I think we should continue this split -- simple data comes from wiki sources (Wikidata, .map datasets, or directly in <mapframe>/<maplink>), while complex geometries should not be duplicated, and should come from OSM if available.

Our main use case is to prevent significant data duplication

  • OSM and .map datasets store geometries
  • while wikidata and .map datasets can store data points

Makes sense to me. So let's not have a fallback.

reducing complex geometries (e.g. an outline of a country, city, or a river) to a single wikidata ID, or even better - to a SPARQL query

Now we need to figure this out. One option is querying WD directly, the other is using WDQS.

Yurik, thanks for your detailed response, but two remarks:

  1. To my knowledge, this does not handle my second point: for now, Wikidata items with just a P625 property are not displayed (with a marker). Am I wrong on this?
  2. Regarding your explanation "complex geometries in OSM, but simple ones only retrieved from Wikidata", I see 2 cases where we should perhaps reconsider this.

a) In OSM, data is often refined progressively: some objects start as a single node (either as a "draft" contribution or because the data is too imprecise to draw the exact outline at first), but later evolve into a full outline (possibly thanks to a different contributor). If the wikidata= tag is present, it should be taken into account, to encourage OSM people to use it, and to get them to contribute "structured data" to Wikidata (instead of a bunch of vaguely defined tags, like operator, creator and so on).

b) Other than that, I am not thinking of isolated nodes, but of complex objects that are made up of smaller objects, possibly nodes.
For example, all hamlets of a single commune/municipality. I think it is OK to have a single point for a little hamlet, but the whole set could be retrieved by a single WD query, which could return one or more Wikidata items.
Be it a query returning multiple items or a single WD object, the result should be a cluster of points on the map. You can also think of "display all churches in a town", etc.

Another example: in OSM, a city wall could be made up of a series of sections (closed polygons) and of nodes marking the gates (with a name attribute). Either the whole set is grouped in a single relation in OSM, referring to a WD item, or they are individual objects all referring to the same WD item. It would be good to see both the polygons and the node markers.

Sorry for my bad English, but I hope the 2 examples are clear enough to continue the discussion.

My advice about WDQS vs. WD API is: if you always know the Wikidata ID and need to retrieve data for only one ID or a small (for sane definitions of "small") number of IDs, it's probably better to use the Wikidata API. If you need to find Wikidata IDs by some other constraints, which may include non-trivial conditions, then using WDQS is warranted.

@Smalyshev the issue here is really about the location of coordinates. Commons' datasets, and mapframe/maplink tags, may contain all tags, points, and shapes. Wikidata cannot contain shapes, but can contain the rest. OSM cannot contain anything that is outside of its scope (like historical features, zip code areas, animal migration paths, etc.). So the question is - should Wikipedia allow point coordinates to be retrieved from OSM - in other words, treat a single [longitude, latitude] coordinate pair as an object that can be referenced by a Wikidata ID, or should that pair be stored in all the other places? The geoshapes cannot be stored in Wikidata, hence it is natural for them to be stored everywhere else. In a way - should we normalize or denormalize point data? Shapes clearly should be normalized.

I think there are plenty of usecases for Wikipedia when the coordinates will never be allowed in OSM (think mostly historical data but also events in general) but will fit perfectly in a Wikidata object. It would be great to be able to get these with a query and display it in an article (e.g. all the battles in a war or similar).

A few months ago we introduced storing datasets on Wikimedia Commons: https://www.mediawiki.org/wiki/Help:Map_Data, and its sibling https://www.mediawiki.org/wiki/Help:Tabular_Data, also known as "Commons Datasets".

There are quite a few examples available already: https://commons.wikimedia.org/wiki/Special:AllPages?from=&to=&namespace=486

I think this would solve your use case.

An even more generic version of this would be that we don't just want to show Wikidata items with coordinates on the map, but also SPARQL results with coordinates, no matter what the source of the coordinates is. In some cases the source is P625, but I think it is also a valid case that the coordinates are created or modified in the query.
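
To make the "SPARQL results with coordinates" idea concrete, here is a minimal sketch that turns a WDQS JSON result into a GeoJSON FeatureCollection of markers. It assumes the query exposes a WKT point literal (e.g. `Point(2.5 48.25)`) in a variable named `coord` and an optional label in `itemLabel`; both variable names, and the helper names, are hypothetical. Since it only reads the variable, it does not matter whether the coordinate came from P625 or was computed in the query.

```python
import re
from typing import Optional

# WDQS serialises coordinates as WKT literals: "Point(<lon> <lat>)".
_NUM = r"-?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?"
_WKT_POINT = re.compile(rf"^Point\(\s*({_NUM})\s+({_NUM})\s*\)$", re.IGNORECASE)


def parse_wkt_point(wkt: str) -> Optional[list]:
    """Return [lon, lat] for a plain WKT point, or None if it does not match."""
    match = _WKT_POINT.match(wkt.strip())
    if match is None:
        return None
    return [float(match.group(1)), float(match.group(2))]


def bindings_to_geojson(result: dict, coord_var: str = "coord",
                        label_var: str = "itemLabel") -> dict:
    """Build a GeoJSON FeatureCollection from a SPARQL JSON result."""
    features = []
    for row in result.get("results", {}).get("bindings", []):
        if coord_var not in row:
            continue  # row without coordinates: nothing to draw
        coords = parse_wkt_point(row[coord_var]["value"])
        if coords is None:
            continue  # unparsable literal, e.g. one with a globe prefix
        title = row.get(label_var, {}).get("value", "")
        features.append({
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": coords},
            "properties": {"title": title},
        })
    return {"type": "FeatureCollection", "features": features}
```

Rows without a usable coordinate are dropped silently in this sketch; a real implementation might want to report them instead.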

Commons Datasets would solve the historical data case if I had the full dataset ready now (and it weren't under construction like everything on Wikipedia/Wikidata), but if not, or if the data is something that updates regularly (like events), a live query against Wikidata would give the most recent and complete data rather than something that got stored once on Commons.

I agree with Zache and Ainali: handling P625 coordinates from a live query would fulfill the needs.
BTW, this seems to be a more "natural" feature for (dynamic) Kartographer than for (static) Commons Datasets.

It doesn't seem useful or wanted to me either to overcomplicate things by pulling and storing coordinates for an OSM node with a wikidata tag, when the linked Wikidata item is already expected to store more or less the same coordinates as its P625 value. A maplink/mapframe marker at the given Wikidata coordinates can already be displayed, and common Wikipedia templates/modules such as this one already do that.

As for the idea to query and display coordinates for *multiple* Wikidata items, I think it's more clearly outlined in another task: T188291#4266453. Hence I'd suggest declining this task.

WMDE-Fisch subscribed.

I guess this ticket is well enough solved by the implementation of T307695: Display coordinate markers in Kartographer maps from QID, which uses P625 as the source. Storing a complete copy of OSM node data is really out of scope and not a useful approach. But I guess the main intent of having a way to show single markers from an external source was fulfilled, so I'll mark this as resolved rather than declined.