GeoPoint feature can't handle "no-value" and "some-value" properties
Closed, Resolved (Public)

Description

Originally reported here: T188291#8279218

The relevant code can be seen in https://github.com/nyurik/wd-type-parser/blob/master/src/wikidataTypeParser.js#L49. For these values it returns undefined. This currently ends up in the final GeoJSON object as { coordinates: undefined }, which is stringified to "geometry": { "type": "Point" } because JSON serialization drops properties whose value is undefined. The result is not valid GeoJSON. All this happens after the sanitize-mapdata API was called.
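A minimal sketch of the serialization step described above. The `feature` object is a hypothetical stand-in for what the service builds, not the actual Kartotherian data structure; only the behavior of `JSON.stringify` is standard.

```javascript
// Hypothetical feature, mimicking a point whose coordinates were parsed
// from a "no-value" or "some-value" statement and came back as undefined.
const feature = {
  type: 'Feature',
  geometry: { type: 'Point', coordinates: undefined },
  properties: {}
};

// JSON.stringify omits object properties whose value is undefined,
// so the "coordinates" key silently disappears from the output.
const serialized = JSON.stringify(feature);
console.log(serialized);
// {"type":"Feature","geometry":{"type":"Point"},"properties":{}}
```

A Point geometry without a "coordinates" member is what later consumers reject.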

Example page with this issue: https://www.wikidata.org/w/index.php?title=User:Ayack/Draft&oldid=1741989429

Event Timeline

Change 838124 had a related patch set uploaded (by Thiemo Kreuz (WMDE); author: Thiemo Kreuz (WMDE)):

[mediawiki/services/kartotherian@master] Ignore SPARQL results with some-value/no-value

https://gerrit.wikimedia.org/r/838124

Change 838138 had a related patch set uploaded (by Thiemo Kreuz (WMDE); author: Thiemo Kreuz (WMDE)):

[mediawiki/services/kartotherian@master] Rename some variables in geoshape service for readability

https://gerrit.wikimedia.org/r/838138

Change 838157 had a related patch set uploaded (by Thiemo Kreuz (WMDE); author: Thiemo Kreuz (WMDE)):

[mediawiki/services/kartotherian@master] Remove unused `delete` from geoshapes service

https://gerrit.wikimedia.org/r/838157

Change 838157 merged by jenkins-bot:

[mediawiki/services/kartotherian@master] Remove unused `delete` from geoshapes service

https://gerrit.wikimedia.org/r/838157

Change 838138 merged by jenkins-bot:

[mediawiki/services/kartotherian@master] Rename some variables in geoshape service for readability

https://gerrit.wikimedia.org/r/838138

Change 838124 merged by jenkins-bot:

[mediawiki/services/kartotherian@master] Ignore SPARQL results with some-value/no-value

https://gerrit.wikimedia.org/r/838124

This is technically done but waiting for the next (manual) deployment of Maps (Kartotherian). Can be closed when the example page works.

I tried to test the fix on the beta cluster where it should be deployed now. I used the query from the example page.

https://en.wikipedia.beta.wmflabs.org/w/index.php?title=Maptests/T319283

The static image is broken, and when I try to retrieve it directly I get the following error message:

"geojson_datasource: Failed parse GeoJSON file from in-memory string  encountered during parsing of layer 'layer' in Layer at line 22"

FWIW

That's the query result from the map server's perspective:

https://maps-beta.wmflabs.org/geopoint?getgeojson=1&query=SELECT+%3Fid+%3Fgeo+WHERE+{+%3Fid+wdt%3AP625+%3Fgeo%3B+(wdt%3AP131%2Fwdt%3AP131)+wd%3AQ12620%3B+(wdt%3AP31%2F(wdt%3AP279*))+wd%3AQ16970.+}

That's the JS error on the console:

Error: Invalid LatLng object: (t, h)
    LatLng https://en.wikipedia.beta.wmflabs.org/w/load.php?lang=en&modules=mapbox&skin=vector-2022&version=3l3a6:56
    coordsToLatLng https://en.wikipedia.beta.wmflabs.org/w/load.php?lang=en&modules=mapbox&skin=vector-2022&version=3l3a6:155
    geometryToLayer https://en.wikipedia.beta.wmflabs.org/w/load.php?lang=en&modules=mapbox&skin=vector-2022&version=3l3a6:154
    _initialize https://en.wikipedia.beta.wmflabs.org/w/load.php?lang=en&modules=mapbox&skin=vector-2022&version=3l3a6:233
    _initialize https://en.wikipedia.beta.wmflabs.org/w/load.php?lang=en&modules=mapbox&skin=vector-2022&version=3l3a6:233
    setGeoJSON https://en.wikipedia.beta.wmflabs.org/w/load.php?lang=en&modules=mapbox&skin=vector-2022&version=3l3a6:232
    initialize https://en.wikipedia.beta.wmflabs.org/w/load.php?lang=en&modules=mapbox&skin=vector-2022&version=3l3a6:232
    NewClass https://en.wikipedia.beta.wmflabs.org/w/load.php?lang=en&modules=mapbox&skin=vector-2022&version=3l3a6:44
    featureLayer https://en.wikipedia.beta.wmflabs.org/w/load.php?lang=en&modules=mapbox&skin=vector-2022&version=3l3a6:233
    addGeoJSONLayer https://en.wikipedia.beta.wmflabs.org/wiki/Maptests/T319283#/map/0 line 10 > injectedScript:12
    addDataGroups https://en.wikipedia.beta.wmflabs.org/wiki/Maptests/T319283#/map/0 line 10 > injectedScript:12
    addDataGroups https://en.wikipedia.beta.wmflabs.org/wiki/Maptests/T319283#/map/0 line 10 > injectedScript:12

The raw API request for this is https://en.wikipedia.beta.wmflabs.org/wiki/Special:ApiSandbox#action=query&prop=mapdata&titles=Maptests%2FT319283&mpdgroups=_010eee0e48d74d313740293a44e9b68c4548f8ab. The JSON is valid. The GeoJSON string can be extracted and parsed as well. But it's not valid GeoJSON. It contains weird things like "coordinates": "http://www.wikidata.org/.well-known/genid/51370059e35df6afff396e90f00a7294" in several places.

The SPARQL query is https://query.wikidata.org/#SELECT%20%3Fid%20%3Fgeo%20WHERE%20%7B%20%3Fid%20wdt%3AP625%20%3Fgeo%3B%20%28wdt%3AP131%2Fwdt%3AP131%29%20wd%3AQ12620%3B%20%28wdt%3AP31%2F%28wdt%3AP279%2a%29%29%20wd%3AQ16970.%20%7D. The result does indeed contain these weird broken coordinates.

Following the links to e.g. https://www.wikidata.org/wiki/Q3585894#P625 shows these are "unknown" values.

Here is the code that generates these weird genids: https://codesearch.wmcloud.org/search/?q=generateWellKnownURI.

I think we need to filter this the same way we filter "no value".
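A rough sketch of what such a filter could look like. `hasUsableCoordinates` and the sample `features` array are hypothetical names made up for illustration; this is not the code from the patches on this task. It assumes coordinates arrive either as a numeric array, as undefined ("no value"), or as a genid URI string ("unknown value").

```javascript
// Hypothetical helper, not the actual Kartotherian implementation.
// Accepts only coordinates that are an array of at least two finite numbers.
function hasUsableCoordinates(feature) {
  const coords = feature && feature.geometry && feature.geometry.coordinates;
  return Array.isArray(coords) &&
    coords.length >= 2 &&
    coords.every((n) => typeof n === 'number' && Number.isFinite(n));
}

// Made-up sample data covering the three cases discussed in this task.
const features = [
  { type: 'Feature', geometry: { type: 'Point', coordinates: [2.35, 48.85] } },
  // "no value": the parser returned undefined
  { type: 'Feature', geometry: { type: 'Point', coordinates: undefined } },
  // "unknown value": the genid URI comes through as a plain string
  {
    type: 'Feature',
    geometry: {
      type: 'Point',
      coordinates: 'http://www.wikidata.org/.well-known/genid/51370059e35df6afff396e90f00a7294'
    }
  }
];

const usable = features.filter(hasUsableCoordinates);
console.log(usable.length); // 1
```

Checking for a numeric array, rather than only for undefined, covers both cases with one test. It may also explain the "(t, h)" in the console error above: if the URI string is indexed like an array, its first two characters are "h" and "t".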

Change 879821 had a related patch set uploaded (by Thiemo Kreuz (WMDE); author: Thiemo Kreuz (WMDE)):

[mediawiki/services/kartotherian@master] Filter "unknown" Wikidata coordinates represented as URI string

https://gerrit.wikimedia.org/r/879821

Change 879821 merged by jenkins-bot:

[mediawiki/services/kartotherian@master] Filter "unknown" Wikidata coordinates represented as URI string

https://gerrit.wikimedia.org/r/879821

Change 881358 had a related patch set uploaded (by Thiemo Kreuz (WMDE); author: Thiemo Kreuz (WMDE)):

[mediawiki/services/kartotherian@master] Filter points without coordinates as early as possible

https://gerrit.wikimedia.org/r/881358

Change 881358 merged by jenkins-bot:

[mediawiki/services/kartotherian@master] Filter points without coordinates as early as possible

https://gerrit.wikimedia.org/r/881358