Discussion with @Pnorman: it seems that a good starting point for the number of threads to use in osm2pgsql is half the number of CPUs. The number of connections to Postgres is (number of threads) x (number of tables), so our current limit of 120 max connections will need to be adapted, also taking the Tilerator traffic into account.
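To make the arithmetic above concrete, here is a minimal sketch of the estimate. The function names, the table count of 4, the 32-CPU host and the number of connections reserved for Tilerator are all assumptions for illustration, not values measured on the Maps servers.

```python
def osm2pgsql_threads(cpu_count: int) -> int:
    """Starting point from the discussion: half the CPUs, at least one."""
    return max(1, cpu_count // 2)


def estimated_connections(threads: int, tables: int) -> int:
    """osm2pgsql connections = number of threads x number of tables."""
    return threads * tables


def fits_budget(connections: int, max_connections: int, reserved: int) -> bool:
    """True if osm2pgsql fits once `reserved` connections are set aside
    for other clients (hypothetical figure for Tilerator traffic)."""
    return connections + reserved <= max_connections


# Hypothetical example host: 32 CPUs, 4 output tables.
threads = osm2pgsql_threads(32)
needed = estimated_connections(threads, tables=4)
print(threads, needed, fits_budget(needed, max_connections=120, reserved=40))
```

With these placeholder numbers the import would need 64 connections, which still fits under 120 with 40 reserved. The real table count and the Tilerator share need to be measured before changing `max_connections`.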
|operations/puppet : production||WIP - Tune thread for osm2pgsql / postgres max connections for Maps|
|Declined||None||T137616 Epic: cultivating the Maps garden|
|Open||None||T137229 Tune thread for osm2pgsql / postgres max connections for Maps|
I suspect that Tilerator will have one connection per worker. Eventually, I would also like to have Kartotherian use Postgres directly to get some data, so that number will triple (Tilerator's cpucount/2 + Kartotherian's cpucount).
I would expect the number of threads and the number of workers to have no direct relation to each other. Especially in Node, where IO should be async...
I had a quick look at the code and it seems that we are using pg.js, which seems to have an embedded connection pool (node.js is really not my cup of tea yet). I'm not entirely sure whether it does (or does not) make sense to pool DB connections here.
We need measurements... as always...
For import, I generally recommend running osm2pgsql with as many threads as there are CPU threads on machines with up to 8 threads, unless there's something else running at the same time. Past 8 threads there's little data available. If you have enough RAM and are doing a --slim import without --drop, most of the time is spent on building a large index, which can't be parallelized.
For updates, it's a tradeoff between update speed and load on the system. Lots of people run single-threaded to keep the load down, or with just 2 threads.
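The recommendations above could be sketched roughly as follows. The helper names, the cap at 8 threads for imports (the comment only says there is little data past 8) and the database and file names are assumptions. `--number-processes`, `--slim`, `--create` and `--append` are existing osm2pgsql options.

```python
def suggest_threads(cpu_threads: int, mode: str) -> int:
    """Rough thread count for osm2pgsql, following the advice above."""
    if mode == "import":
        # One thread per CPU thread; capped at 8 as an assumption,
        # since little data is available beyond that.
        return max(1, min(cpu_threads, 8))
    if mode == "update":
        # Keep load down: at most 2 threads (many people run with 1).
        return max(1, min(cpu_threads, 2))
    raise ValueError(f"unknown mode: {mode!r}")


def osm2pgsql_args(cpu_threads: int, mode: str, database: str, path: str) -> list[str]:
    """Build an osm2pgsql command line; database and path are placeholders."""
    action = "--create" if mode == "import" else "--append"
    threads = suggest_threads(cpu_threads, mode)
    return ["osm2pgsql", action, "--slim",
            "--number-processes", str(threads),
            "--database", database, path]


print(" ".join(osm2pgsql_args(32, "update", "gis", "changes.osc.gz")))
```

The right numbers still depend on what else runs on the host, so treat these as starting values to measure against.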