Allow download of Wikidata query results in GPS-friendly format(s)
Open, Needs Triage, Public

Description

When a Wikidata Query Service query results in data that can be mapped, the user should be able to download that data in one or more formats suitable for import into mapping tools, such as GPS Exchange Format (GPX), Keyhole Markup Language (KML), GeoJSON, etc.

Current proposed patch supporting GPX, GeoJSON and KML:

https://gerrit.wikimedia.org/r/#/c/wikidata/query/gui/+/516662/

Test on this live platform:

https://pebbie.org/wdqs/#%23Map%20of%20hospitals%0A%23added%202017-08%0A%23defaultView%3AMap%0ASELECT%20DISTINCT%20%2a%20WHERE%20%7B%0A%20%20%3Fitem%20wdt%3AP31%2Fwdt%3AP279%2a%20wd%3AQ16917%3B%0A%20%20%20%20%20%20%20%20wdt%3AP625%20%3Fgeo%20.%0A%7D%0ALIMIT%2010

What it looks like (see the last menu entries):

Screenshot KML GeoJSON GPX download on WDQS.png (556×429 px, 73 KB)

Event Timeline

Restricted Application added a subscriber: Aklapper.

It would also be really nice if this could export to either GeoJSON or TopoJSON, because it makes it a lot easier to integrate query results in visualisation tools. For example, Vega-lite maps.

I have implemented this for GeoJSON, GPX, and KML. Here's the snippet. Four small npm libraries are used: wicket (parsing WKT), geojson, togpx, and tokml.

	/**
	 * Get the result of the submitted query as GeoJSON
	 *
	 * @return {object}
	 */
	SELF.prototype._getResultAsGeoJson = function() {
		var output = [],
			data = this._rawData;
		var wkt = new Wkt.Wkt();
		output = this._processData( data, function( row, out ) {
			var newRow = {};
			for ( var rowVar in row ) {
				var binding = ( row[rowVar] || {} );
				if ( binding.type === 'literal' && binding.datatype === 'http://www.opengis.net/ont/geosparql#wktLiteral' ) {
					// WKT points read "Point(longitude latitude)": x is the longitude, y the latitude
					wkt.read( binding.value );
					newRow._lat = wkt.components[0].y;
					newRow._lng = wkt.components[0].x;
				} else {
					newRow[rowVar] = binding.value;
				}
			}
			out.push( newRow );
			return out;
		}, output );
		return GeoJSON.parse( output, { Point: ['_lat', '_lng'] } );
	};

	/**
	 * Get the result of the submitted query as a GeoJSON string
	 *
	 * @return {string}
	 */
	SELF.prototype.getResultAsGeoJson = function() {
		return JSON.stringify( this._getResultAsGeoJson() );
	};

	/**
	 * Get the result of the submitted query as GPX
	 *
	 * @return {string}
	 */
	SELF.prototype.getResultAsGPX = function() {
		var gj = this._getResultAsGeoJson();
		return togpx( gj );
	};

	/**
	 * Get the result of the submitted query as KML
	 *
	 * @return {string}
	 */
	SELF.prototype.getResultAsKML = function() {
		var gj = this._getResultAsGeoJson();
		return tokml( gj );
	};

Any ideas on how to enable/unhide the download menu items only when the query result contains geolocation data? Should I create a separate task on the board?
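
One possible approach, as a minimal sketch only: check whether any binding in the raw result carries the `wktLiteral` datatype, and show the menu items only if it does. The function name `hasGeoBindings` is hypothetical, and the sketch assumes the raw data follows the standard SPARQL JSON results layout (`results.bindings`). How this would be wired into the actual download menu is not shown.

```javascript
var WKT_LITERAL = 'http://www.opengis.net/ont/geosparql#wktLiteral';

/**
 * Hypothetical helper: true if at least one binding in the result
 * is a literal typed as a GeoSPARQL WKT literal.
 * Assumes rawData has the SPARQL JSON results shape { results: { bindings: [...] } }.
 */
function hasGeoBindings( rawData ) {
	var bindings = ( rawData && rawData.results && rawData.results.bindings ) || [];
	return bindings.some( function ( row ) {
		return Object.keys( row ).some( function ( name ) {
			var binding = row[name] || {};
			return binding.type === 'literal' && binding.datatype === WKT_LITERAL;
		} );
	} );
}
```

The geo download entries could then stay hidden unless `hasGeoBindings( rawData )` returns true for the current result.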

What needs to be done for this to be deployed to the live Query Service? When might that happen?

For those interested: @Peb’s implementation is live at https://pebbie.org/wdqs/ (example query: http://tinyurl.com/yyyo3gn5 )
The unmerged patchset lives at https://gerrit.wikimedia.org/r/#/c/wikidata/query/gui/+/516662/

What needs to be done for this to be deployed to the live Query Service?

I have some feedback from the Wikidata team:

The person who submitted a code snippet needs to turn it into a proper patch in Gerrit so it can be reviewed, merged and then deployed. (I don't know how much additional work is left on top of the code snippet they pasted.)

@Pigsonthewing Please see my comment just above:

The unmerged patchset lives at https://gerrit.wikimedia.org/r/#/c/wikidata/query/gui/+/516662/

This is beyond my skill set (I'm just the messenger!), but I note that that has a red label: "Cannot Merge".

@Lydia_Pintscher Please can you kindly delegate to someone who can move this forward, or at least tell us what is needed to do so?

Gehel subscribed.

This was discussed by the Search Platform team. Brief summary:

Some of the download formats are managed by Blazegraph directly, via content negotiation. This is definitely not something we would like to extend, as it would increase our reliance on Blazegraph. Having this implemented as part of WDQS-UI seems reasonably fine. Even better would be a dedicated service which could be reused with other SPARQL endpoints.
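
For readers unfamiliar with content negotiation, here is a rough sketch of what it means in practice: the client asks the SPARQL endpoint for a serialisation via the HTTP `Accept` header. The helper name `buildSparqlRequest` and the `FORMATS` table are hypothetical, and the list of media types is illustrative rather than a complete inventory of what the endpoint supports.

```javascript
// Illustrative mapping from a short format name to a SPARQL result media type
var FORMATS = {
	json: 'application/sparql-results+json',
	xml: 'application/sparql-results+xml',
	csv: 'text/csv',
	tsv: 'text/tab-separated-values'
};

/**
 * Hypothetical helper: build the URL and headers for a query against
 * the public WDQS endpoint, requesting the given result format.
 */
function buildSparqlRequest( query, format ) {
	if ( !Object.prototype.hasOwnProperty.call( FORMATS, format ) ) {
		throw new Error( 'Unsupported format: ' + format );
	}
	return {
		url: 'https://query.wikidata.org/sparql?query=' + encodeURIComponent( query ),
		headers: { Accept: FORMATS[format] }
	};
}
```

A caller would pass the result to an HTTP client, for example `fetch( request.url, { headers: request.headers } )`. Formats such as GPX or KML are not in this table, which is why they would need to be produced on the client or by a separate service.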

Task Breakdown Notes:

  • We are not sure if this functionality should be recreated in case the patch cannot be applied, or if this should go back to the backlog? @Lydia_Pintscher @Arian_Bozorg
  • We can try to apply the patch by rebasing it over the latest HEAD, and test it both through CI and locally
  • In case there are any comments on the patch, we assume that one of us will need to apply the requested changes.

Hey everyone,

We looked into the existing patch more closely and it currently pulls in 4 new libraries that would all have to go through security review at the WMF. That's gonna take a lot of time and effort to make it happen unfortunately. One step I could see is that we reduce the patch to one format and do the dance with 1 or max 2 libraries, which would be more manageable and warranted by the importance of the task. If I understand it correctly the patch builds GeoJSON and then derives the other formats from it. Would providing only GeoJSON be a reasonable thing or is that useless for you?

Would providing only GeoJSON be a reasonable thing[...] ?

One format would certainly be better than none, but we really need to be providing alternative formats. My original request, when I opened this ticket[/*], was for "format(s) suitable for import into mapping tools; such as GPS Exchange Format (GPX), Keyhole Markup Language (KML), etc"

GeoJSON might be good for coders, but it's of little use to people like me who just want to see the output of a query in their preferred mapping tool, or on a hand-held GPS device.

/* Four years ago this coming Monday - do we get cake?

Is one of those other formats significantly more useful/better/...?

Personally, KML, but I have done no user-research with the wider user community.

[By way of illustration, the question is like saying "which is better: JPEG, TIFF or raw?" - it all depends on the use case.]

We looked into the existing patch more closely and it currently pulls in 4 new libraries that would all have to go through security review at the WMF.

And then a lot more transitive dependencies...

Just what the new packages bring in:

Screenshot 2023-02-15 at 22.47.55.png (1×1 px, 255 KB)

Before:

Screenshot 2023-02-15 at 22.51.32.png (1×1 px, 186 KB)

After:

Screenshot 2023-02-15 at 22.51.41.png (1×836 px, 241 KB)

@Reedy what tool did you use to create these diagrams, I'd be interested in it for dependency analysis in other projects as well.

First result on google ;)

https://npmgraph.js.org/

Ah it's that online tool... Thought you might have something offline or at an editor or repository level

@Lydia_Pintscher Is there a method for exporting in the various formats which does not require 4 new libraries that would all have to go through security review at the WMF taking a lot of time and effort to make it happen?

WDQS can already export in JSON. It is counterintuitive that exporting in GeoJSON, KML, etc. (seemingly straightforward formats) is so hugely difficult that it should not be attempted.
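
To illustrate the point, here is a minimal sketch of what a dependency-free export of point results might look like. It is not the patch under review: all function names are hypothetical, it assumes the standard SPARQL JSON results layout, and it only handles plain `Point(longitude latitude)` literals. Other geometries, and points prefixed with a globe IRI, are deliberately left unmapped.

```javascript
var WKT_LITERAL = 'http://www.opengis.net/ont/geosparql#wktLiteral';
// Matches only plain "Point(lon lat)" literals; anything else is not treated as a coordinate
var POINT_RE = /^Point\(\s*(-?\d+(?:\.\d+)?(?:e[-+]?\d+)?)\s+(-?\d+(?:\.\d+)?(?:e[-+]?\d+)?)\s*\)$/i;

// Hypothetical helper: SPARQL JSON results -> GeoJSON FeatureCollection of points
function bindingsToGeoJson( rawData ) {
	var rows = ( rawData && rawData.results && rawData.results.bindings ) || [];
	var features = [];
	rows.forEach( function ( row ) {
		var properties = {}, coordinates = null;
		Object.keys( row ).forEach( function ( name ) {
			var binding = row[name] || {},
				match = binding.datatype === WKT_LITERAL && POINT_RE.exec( binding.value );
			if ( match ) {
				// GeoJSON positions are [longitude, latitude]
				coordinates = [ parseFloat( match[1] ), parseFloat( match[2] ) ];
			} else {
				properties[name] = binding.value;
			}
		} );
		// Rows without a usable point are skipped
		if ( coordinates ) {
			features.push( {
				type: 'Feature',
				geometry: { type: 'Point', coordinates: coordinates },
				properties: properties
			} );
		}
	} );
	return { type: 'FeatureCollection', features: features };
}

// Escape the five XML special characters
function escapeXml( text ) {
	return String( text ).replace( /[<>&"']/g, function ( ch ) {
		return { '<': '&lt;', '>': '&gt;', '&': '&amp;', '"': '&quot;', "'": '&apos;' }[ch];
	} );
}

// Hypothetical helper: point FeatureCollection -> KML string; nameKey picks the placemark name
function geoJsonToKml( collection, nameKey ) {
	var placemarks = collection.features.map( function ( feature ) {
		var name = feature.properties[nameKey] || '';
		// KML coordinates are "longitude,latitude"
		return '<Placemark><name>' + escapeXml( name ) + '</name><Point><coordinates>' +
			feature.geometry.coordinates.join( ',' ) + '</coordinates></Point></Placemark>';
	} );
	return '<?xml version="1.0" encoding="UTF-8"?>' +
		'<kml xmlns="http://www.opengis.net/kml/2.2"><Document>' +
		placemarks.join( '' ) + '</Document></kml>';
}
```

A sketch like this covers far less than the four libraries do (no polygons, no GPX, no styling), so it shows the trade-off rather than a drop-in replacement.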