Switch geoline/geoshape to new schema
Closed, Resolved (Public)

Event Timeline

The relevant code is https://github.com/kartotherian/geoshapes/blob/master/geoshapes.js#L62-L71

There are two complications that make this more than just changing table names:

  1. The current code matches Wikidata IDs as text, including the leading "Q". The new schema stores them as numbers instead. This is an easy fix.
  2. Testing this involves a maze of npm modules. I can write the code and even run it, but I'm not sure how we would test it in a more production-like setting.
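
The ID conversion in the first point can be sketched in JavaScript, the language geoshapes is written in. This is a minimal sketch, not the actual geoshapes code: the function name wikidataIdsToNumbers is hypothetical, and it assumes the IDs arrive as a comma-separated string (as in an ids request parameter).

```javascript
// Hypothetical helper, not part of geoshapes: converts text Wikidata IDs
// ("Q649") to the numeric form the new schema stores (649).
// Assumption: the input is a comma-separated string of IDs.
function wikidataIdsToNumbers(ids) {
  return ids.split(',').map((raw) => {
    const id = raw.trim();
    // Accept only "Q" followed by digits with no leading zero.
    if (!/^Q[1-9][0-9]*$/.test(id)) {
      throw new Error(`Invalid Wikidata ID: ${raw}`);
    }
    // Drop the leading "Q" and parse the rest as a base-10 integer.
    return Number.parseInt(id.slice(1), 10);
  });
}

console.log(wikidataIdsToNumbers('Q649,Q123'));
```

Validating before parsing means malformed input is rejected up front instead of reaching the database as NaN.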
debt triaged this task as Medium priority. Sep 18 2017, 8:44 PM
debt added projects: Maps, Discovery-ARCHIVED.
debt added a subscriber: Gehel.

I'm trying to run Kartotherian locally and running into issues with geoline.

  • Geoline is basically undocumented.
  • When I issue a request for /geoline?getgeojson=1&ids=Q649, I get a 404 error back. The same request works on production.
  • If this were a Postgres issue, I'd expect a Postgres error.

Long-term, I'm concerned about us running an entirely custom service (geoshapes) that we know very little about and that lacks documentation.

Looking at the differences between maps-test2004 and what you get after installing:

  • /etc/kartotherian/config.yaml has a geoshapes section that none of the documentation YAML files have.
  • The ultra-quick start documentation makes reference to removing "kartotherian-geoshapes" from the requestHandlers section of package.json. I didn't do this because it wasn't present in the file. In fact, the only reference to it is as kartotherian/geoshapes in dependencies. On maps-test, there are two references: one as kartotherian-geoshapes in dependencies, the other as kartotherian-geoshapes in requestHandlers.

Looking at https://github.com/kartotherian/kartotherian/commit/3f2b27d90450b8551441c8e28dd9e27dc568415d#diff-b9cfc7f2cdf78a7f4b91a753d10865a2, @Yurik removed geoshapes from Kartotherian in March, but we don't have that change locally.

It's been a long time since I've touched this, and I need to again.

Since I last wrote, the packaging and configuration situation has been cleaned up.

@kartotherian/geoshapes encodes style-specific logic in its queries, and this logic won't work in ClearTables.

The current query filters by WHERE tags ? 'wikidata' AND tags->'wikidata' IN ('Q123', ...).

The tables from ClearTables.wikidata are better: they use a numeric type, so the filter becomes WHERE wikidata IN (123, ...). However, this differs from what the current code expects.

There are two ways to make the two work together: Change @kartotherian/geoshapes, or put in a VIEW to act as a shim. The former is what we want eventually, but I'm not sure how to best get there. If I change the code, we then have to use different versions of @kartotherian/geoshapes on current and new schema servers. For this reason I'm inclined towards the shim view.
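
A shim view along these lines could present the numeric column in the shape the existing query expects. The sketch below only builds the SQL text. Every view, table, and column name passed in is a placeholder rather than a real ClearTables or geoshapes name, and it assumes the hstore extension is installed so that hstore(text, text) is available.

```javascript
// Builds the DDL for a hypothetical shim view. The view re-exposes a numeric
// wikidata column as an hstore "tags" column holding the text form ("Q123"),
// so a filter such as tags->'wikidata' IN ('Q123') keeps working.
// All identifiers are assumptions supplied by the caller and are treated as
// unqualified names (no schema prefix).
function shimViewSql({ view, table, idColumn, geomColumn }) {
  // Quote identifiers, doubling any embedded double quotes.
  const q = (name) => `"${name.replace(/"/g, '""')}"`;
  return [
    `CREATE OR REPLACE VIEW ${q(view)} AS`,
    `SELECT hstore('wikidata', 'Q' || ${q(idColumn)}::text) AS tags,`,
    `       ${q(geomColumn)}`,
    `FROM ${q(table)}`,
    `WHERE ${q(idColumn)} IS NOT NULL;`,
  ].join('\n');
}

// Example with placeholder names only.
console.log(shimViewSql({
  view: 'example_shim_polygon',
  table: 'example_wikidata_polygon',
  idColumn: 'wikidata',
  geomColumn: 'way',
}));
```

One trade-off worth checking with a view like this is whether the planner can still use an index on the numeric column, since the existing filter would be applied to a computed value.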

I have solved this a bit differently with Sophox's region service (github). I think this approach is much better because it decouples tile generation from the regions service, and because the database is much smaller and can be regenerated and updated much more frequently and rapidly.

Basic idea: There is an osm2pgsql Lua script that filters out anything that has no wikidata tag. It creates a tiny planet_osm_polygon table with a wikidata key and geometry (and similarly for lines). The only large table is planet_osm_ways, which is needed to generate polygons because osm2pgsql doesn't know ahead of time which relations have a wikidata tag. Also note that the actual queries have been improved to avoid creating holes between polygons.

# Initial import
osm2pgsql --create --slim --database gis4 --flat-nodes nodes.bin -C 26000 --number-processes 8 --hstore --style wikidata.style --tag-transform-script wikidata.lua planet-latest.osm.pbf

# Minute updates
osmosis --read-replication-interval workingDirectory=workdir2 --simplify-change --write-xml-change - | osm2pgsql --append --slim --database gis --flat-nodes nodes.bin --number-processes 8 --hstore --style wikidata.style --tag-transform-script wikidata.lua -r xml -

Resulting tables with sizes:

gis4=# \dt+
                           List of relations
 Schema |        Name        | Type  | Owner |    Size    | Description 
--------+--------------------+-------+-------+------------+-------------
 public | planet_osm_line    | table | yuri  | 5655 MB    | 
 public | planet_osm_nodes   | table | yuri  | 0 bytes    | 
 public | planet_osm_point   | table | yuri  | 199 MB     | 
 public | planet_osm_polygon | table | yuri  | 4919 MB    | 
 public | planet_osm_rels    | table | yuri  | 2765 MB    | 
 public | planet_osm_roads   | table | yuri  | 8192 bytes | 
 public | planet_osm_ways    | table | yuri  | 111 GB     | 
 public | spatial_ref_sys    | table | yuri  | 4616 kB    | 
(8 rows)

> For this reason I'm inclined towards the shim view.

I decided to go with this and add the shim SQL to the wikidata ClearTables version.

I verified that this works, but it's blocked on the other issues with maps-test2004.