Page MenuHomePhabricator

Set up replicated tile storage Postgres db
Closed, DeclinedPublic

Description

In order to experiment with Postgres tile storage, lets

  • setup a new Postgres database (name bikeshedding: "tiles")
  • setup master-slave replication 2001->200[234]
  • grant read access to kartotherian, read/write to tilerator, schema creation to tilerator or tileratorui(?)
  • decide if schema will be created as part of the sql script inside the kartotherian-postgres repo, or if it will be done by the code inside that repo (i think the code is already there, but needs testing)
  • regenerate tiles to the new postgresql storage
  • run load test on tiles served from postgresql and compare it to Cassandra

Optional:

  • Investigate if we should use Citus instead -- T130870

Event Timeline

Restricted Application added a subscriber: Aklapper. · View Herald Transcript

Random question: with the postgresql replication as we have it in place, the slaves are readonly. So we need writes to be routed to the master and reads to any postgresql server. We don't have any abstraction in place to hide this complexity from the client application at the moment. And we don't have any mechanism in place to automate promoting a slave to master in case the master fails. Since user facing requests only need read, the current situation is probably sufficient.

Change 318954 had a related patch set uploaded (by Gehel):
maps - create postgresql database for tiles storage

https://gerrit.wikimedia.org/r/318954

@Gehel re your question (which is more of a statement) - yep. :)

Re stress testing - I dumped the list of tiles at stat1002 - ~/mapUrls2.txt

Gehel moved this task from To-do to In progress on the Maps-Sprint board.

Change 318954 merged by Gehel:
maps - create postgresql database for tiles storage

https://gerrit.wikimedia.org/r/318954

New configuration file created on maps-test2001 in /srv/deployment/tilerator/deploy/src/sources-postgres.prod.yaml with this additional source:

pgstore:
  uri: postgres://
  params:
    database: tiles
    username: {var: osmdb-user}
    password: {var: osmdb-pswd}
    createIfMissing: true

Creation of the database is done when tilerator is restarted:

tiles=# \d tiles
     Table "public.tiles"
 Column |   Type   | Modifiers 
--------+----------+-----------
 zoom   | smallint | not null
 idx    | bigint   | not null
 tile   | bytea    | 
Indexes:
    "tiles_pkey" PRIMARY KEY, btree (zoom, idx)

but no message in the logs about it (it would be nice!)

The restart of tileratorui throws an error :

[2016-11-04T11:54:30.583Z] ERROR: tileratorui/2 on maps-test2001: (err.code=42P07, err.isOperational=true, err.length=255, err.severity=ERROR, err.file=heap.c, err.line=1056, err.routine=heap_create_with_catalog, err.levelPath=error)
    Unable to load source "pgstore"
    relation "tiles" already exists
    --
    error: relation "tiles" already exists
        at Connection.parseE (/srv/deployment/tilerator/deploy-cache/revs/63fcca843cf77dc80c180bc771450a3231b35fda/node_modules/kartotherian-postgres/node_modules/pg-promise/node_modules/pg/lib/connection.js:539:11)
        at Connection.parseMessage (/srv/deployment/tilerator/deploy-cache/revs/63fcca843cf77dc80c180bc771450a3231b35fda/node_modules/kartotherian-postgres/node_modules/pg-promise/node_modules/pg/lib/connection.js:366:17)
        at Socket.<anonymous> (/srv/deployment/tilerator/deploy-cache/revs/63fcca843cf77dc80c180bc771450a3231b35fda/node_modules/kartotherian-postgres/node_modules/pg-promise/node_modules/pg/lib/connection.js:105:22)
        at emitOne (events.js:77:13)
        at Socket.emit (events.js:169:7)
        at readableAddChunk (_stream_readable.js:153:18)
        at Socket.Readable.push (_stream_readable.js:111:10)
        at TCP.onread (net.js:536:20)
    --
    err.cause: {
      "name": "error",
      "length": 255,
      "severity": "ERROR",
      "code": "42P07",
      "where": "SQL statement \"CREATE TABLE \"tiles\" (zoom smallint, idx bigint, tile bytea, PRIMARY KEY (zoom, idx))\"\nPL/pgSQL function inline_code_block line 1 at SQL statement",
      "file": "heap.c",
      "line": "1056",
      "routine": "heap_create_with_catalog",
      "moduleUri": "{\"database\":\"tiles\",\"username\":\"tileratorui\",\"createIfMissing\":true,\"table\":\"tiles\",\"minzoom\":0,\"maxzoom\":14}"
    }
    --
    err.where: SQL statement "CREATE TABLE "tiles" (zoom smallint, idx bigint, tile bytea, PRIMARY KEY (zoom, idx))"
    PL/pgSQL function inline_code_block line 1 at SQL statement
    --
    err.moduleUri: {"database":"tiles","username":"tileratorui","createIfMissing":true,"table":"tiles","minzoom":0,"maxzoom":14}

A warning or info would probably be more appropriate and this should not block the loading of the source (source seems to be disabled).

Configuration is now reverted, pending investigation.

The error is that table or relation "tiles" already exists. You could use a
different table name, or we can investigate further

Change 319893 had a related patch set uploaded (by Gehel):
Maps - tilerator on all maps servers needs access to postgresql master

https://gerrit.wikimedia.org/r/319893

Change 319893 merged by Gehel:
Maps - tilerator on all maps servers needs access to postgresql master

https://gerrit.wikimedia.org/r/319893

Change 320381 had a related patch set uploaded (by Gehel):
maps - fix resource dependencies for tiles database creation

https://gerrit.wikimedia.org/r/320381

Change 320381 merged by Gehel:
maps - fix resource dependencies for tiles database creation

https://gerrit.wikimedia.org/r/320381

debt subscribed.

We'd like to drop this task for now, as we don't have a full fledged team to do the work at this time.