Background information
OSM replication has been failing repeatedly for a while, introducing bugs in polygon rendering.
The OSM replication system relies on osm2pgsql, a robust tool that is widely used by a large community for loading OSM data. Despite that, osm2pgsql doesn't provide enough error logging, so tracking down the current issues is hard and sometimes inconclusive.
Another option is imposm3, another powerful tool that is also widely used by the OSM community. This spike task is an opportunity to determine whether imposm3's features can meet production standards for our Maps infrastructure.
Hypothesis
Imposm3 can offer better support for the whole Maps infrastructure and can replace osm2pgsql as the OSM replication engine.
Questions we want to answer
- Does imposm3 provide better logging output to help with maintenance work?
- Can imposm3 handle OSM replication as fast as osm2pgsql?
- Can imposm3 be deployed in our infrastructure?
- Can we move to imposm3 with minimal changes in Postgres?
- Will it fix the issues with OSM replication?
How will we go about answering the questions
- Investigate schema changes and make sure there are no style changes during tile rendering.
- Deploy to the beta cluster and test full-planet OSM replication.
- Depool one machine in codfw (less traffic) and test the new changes in the production environment.
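As a rough illustration of the schema investigation step, the sketch below compares two table-to-columns maps and reports what differs. It is only a sketch under assumptions: the `diff_schemas` helper and the sample table and column names are hypothetical, and in practice the maps would be filled from each database's `information_schema.columns`.

```python
from typing import Dict, Set

# A schema here is simply: table name -> set of column names.
Schema = Dict[str, Set[str]]


def diff_schemas(old: Schema, new: Schema) -> Dict[str, Dict[str, Set[str]]]:
    """Return, per table, the columns that are missing from or added in `new`."""
    report: Dict[str, Dict[str, Set[str]]] = {}
    for table in sorted(set(old) | set(new)):
        old_cols = old.get(table, set())
        new_cols = new.get(table, set())
        missing = old_cols - new_cols  # present before, gone after migration
        added = new_cols - old_cols    # only present after migration
        if missing or added:
            report[table] = {"missing": missing, "added": added}
    return report


# Hypothetical sample data, for illustration only.
osm2pgsql_schema: Schema = {"planet_osm_polygon": {"osm_id", "name", "way"}}
imposm3_schema: Schema = {"planet_osm_polygon": {"osm_id", "name", "geometry"}}

result = diff_schemas(osm2pgsql_schema, imposm3_schema)
for table, changes in result.items():
    print(table, "missing:", sorted(changes["missing"]), "added:", sorted(changes["added"]))
```

An empty report would suggest that the style queries can run unchanged against the new database; any entry points at a column the imposm3 mapping or the queries would need to account for.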
Results
The investigation concluded that it is possible to migrate to imposm3 with few changes, if any, to the DB schemas and styles. Accomplishing that required a proper imposm3 mapping and some changes to the vector tile queries. The work can be found in the following links:
- imposm3 config files
- SQL changes in osm-bright-source
- Docker-compose setup used to perform the investigation
To proceed with this migration, though, we need to follow up on the following tasks: