Page MenuHomePhabricator

Update coastline data in OSM postgres db (osmdb.eqiad.wmnet)
Closed, ResolvedPublic

Description

The gis database containing the OSM "render" database contains an extra table with coastline data. This table is in need of an update (see also https://github.com/dschwen/wikiminiatlas/issues/19)

I'll paste a bit of background info (from an email from @Apmon) into the next comment.

I suggest renaming the current coastline table, doing a fresh import into a new table names 'coastlines' and testing if it works. I think my tool will be the only one affected (possibly the labs osm tile servers and I do not know which server the production stuff uses...). Maybe importing into a separate DB and switching once teh import is complete is teh better solution.

Event Timeline

Restricted Application added subscribers: Zppix, Aklapper. · View Herald Transcript

We are currently importing OSM data without the -K|--keep-coastlines switch. I.e. the main tables do NOT contain any coastline data. Instead we are using postprocessed coastline data in a special table. HOWEVER this data is not automatically updated, and probably hasn't been updated in about two years at all!

Following is an excerpt from an email from Kai Krueger (Apmon):
(I think this is of a technical documentation nature so that publishing this excerpt should not be problematic)

If I remember correctly, last time we did a slightly different process
to import those coastlines than to use osm2pgsql.

The coastlines in OSM consist of thousands of individual "ways" and they
are quite difficult to work with in their raw format. It is hard to say
what is and isn't land or sea, as they are linear features and don't
form closed polygons.

Using the -K feature of osm2pgsql would only give you the raw linear
coastlines.

For this reason the typical (mapnik based) map rendering process doesn't
use the coastlines from the postgres database (which is the reason they
aren't imported by default). Instead, there is an extra "coast line"
process periodically extracting the raw coastlines from the planet file
which then cleans them up, tries to correct any minor problems using
heuristics and turns them into more manageable polygons. This process
can take several hours it self.

I think the current source for those processed coast line shape files
are at http://openstreetmapdata.com/data/coastlines .

On the toolserver we then imported those shape files using the utility
shp2pgsql into an extra set of tables. This coast line extraction
process used to run fairly seldomly and thus we didn't bother to
regularly update this data on the toolserver. But with improvements in
the performance of the extraction process, it is now run daily and it
might be worth thinking about how to keep this data up-to-date in the
labs database as well. At the moment I am only aware of updating that
data by downloading a complete new set of coastline shapefiles (~400MB),
throwing away the respective current postgis coastline tables and
re-importing them a fresh with shp2pgsql.

Given the process of generating those coastlines has changed since we
last imported them into toolserver, I don't know if the resulting
polygons still behave the same. I know the mapnik stylesheets had to be
adapted somewhat to support the new process, but I am not familiar with
the details.

To verify things, I would suggest that we import the current set of
coast line data into a test database, (given it is only 400Mb, doing so
on a labs instance should be fine), check if the results are
sufficiently similar to what Daniel needs for his tools and if yes,
import them into the main osm postgis database you have set up.
Otherwise, we need to figure out what to do. Whether to use a different
import process, and or how to adapt Daniels tools to work with the new
format of data.

just some technical notes:

osmdb.eqiad.wmnet is an alias for labsdb1006.eqiad.wmnet

cheat sheet for shp2pgsql http://www.bostongis.com/pgsql2shp_shp2pgsql_quickguide_20.bqg

Usage example :

  1. shp2pgsql -c -D -s 4269 -I shaperoads.shp myschema.roadstable > roads.sql
  2. psql -d roadsdb -f roads.sql

from http://stackoverflow.com/questions/6665572/converting-shapefiles-using-shp2pgsql

akosiaris claimed this task.
akosiaris subscribed.

So, I 've opted for using the already present puppet code to do the update. That required downloading manually the shapefile from http://openstreetmapdata.com/data/coastlines and renaming the old coastlines table and have puppet run and import the new table. This is the old process (https://phabricator.wikimedia.org/diffusion/OPUP/browse/production/modules/osm/manifests/shapefile_import.pp) which had the 2 manual steps given the very seldom need for updating. Since then we have got better and evolved into a more automated process in https://phabricator.wikimedia.org/diffusion/OPUP/browse/production/modules/osm/templates/import_waterlines.erb

Note the latter is for water_polygons but we can easily adapt it to be used for both cases. I 'll have a look into it in a different task. That being said, I think this is resolved.

Uuuuaaahhhh, now I'm getting ERROR: permission denied for relation coastlines

Uuuuaaahhhh, now I'm getting ERROR: permission denied for relation coastlines

Argh, my mistake. Sorry. I forgot the GRANTS required after destroying and recreating a table. Thankfully the newer approach mentioned above does not have this shortcoming. Mind trying again ?

Yes! Many thanks!