Page MenuHomePhabricator

Create "loading" schema for loading external data
Closed, ResolvedPublic

Description

The script to load external data needs a schema to work in "behind the scenes" before it swaps data into the public schema. It tries to create the schema on its own, but fails.

I don't know if it's best to give the user permission to do this, or manage the schemas with puppet.

For clarity, this issue is about schema in the sense of a schema defined by the SQL standard, not schema as in the set of tables, functions, and other objects that model the data.

Event Timeline

Pnorman created this task.Mar 21 2018, 12:22 AM
Restricted Application added a subscriber: Aklapper. · View Herald TranscriptMar 21 2018, 12:22 AM
Pnorman updated the task description. (Show Details)Mar 21 2018, 11:01 PM

@RobH would you happen to know who could help with this while Guillaume is out?

Pnorman moved this task from Backlog to Stalled/Waiting on the Maps-Sprint board.Mar 22 2018, 5:09 PM

Maybe I can help, but I 'll need a bit more information as to what the problems is. Which tool fails, with what invocation and what is the error?

FYI, We do avoid having schemas (of any form) in puppet since this is more application specific and less configuration management.

Pnorman added a subscriber: Gehel.Mar 28 2018, 9:50 PM

Maybe I can help, but I 'll need a bit more information as to what the problems is. Which tool fails, with what invocation and what is the error?

/srv/deployment/tilerator/deploy/node_modules/@kartotherian/meddo/get-external-data.py -c /srv/deployment/tilerator/deploy/node_modules/@kartotherian/meddo/external-data.yml

Note that PGHOST, PGUSER, etc will need to be set to run that.

It fails with the error psycopg2.ProgrammingError: permission denied for database ct

You can get the same with psql -d ct -U tilerator -c 'CREATE SCHEMA loading;'

FYI, We do avoid having schemas (of any form) in puppet since this is more application specific and less configuration management.

We have the public schema set up through puppet, as well as the database names and extensions.

If we're not setting it up through puppet, the easiest option is to run sudo -u postgres psql -d ct -c 'CREATE SCHEMA loading AUTHORIZATION tilerator;'


Possibly we should be doing this as osmimporter, have them own the DB, and set the permissions more specifically, but that can be sorted out when @Gehel gets back.

Maybe I can help, but I 'll need a bit more information as to what the problems is. Which tool fails, with what invocation and what is the error?

/srv/deployment/tilerator/deploy/node_modules/@kartotherian/meddo/get-external-data.py -c /srv/deployment/tilerator/deploy/node_modules/@kartotherian/meddo/external-data.yml
Note that PGHOST, PGUSER, etc will need to be set to run that.
It fails with the error psycopg2.ProgrammingError: permission denied for database ct

I don't even see a database ct on any of maps100X, maps200X or maps-test200X clusters. I am thinking the database should be created before we even delve into this task ? When is @Gehel coming back ? I 'd rather refer back to him on this one after all, I now feel I am missing crucial context.

You can get the same with psql -d ct -U tilerator -c 'CREATE SCHEMA loading;'

FYI, We do avoid having schemas (of any form) in puppet since this is more application specific and less configuration management.

We have the public schema set up through puppet, as well as the database names and extensions.

I am guessing that as far as the schema goes you are referring to https://phabricator.wikimedia.org/source/operations-puppet/browse/production/modules/profile/templates/maps/grants-db.sql.erb and siblings, which are about the permissions on the public schema. Fair enough, but FYI these are not enforced at runtime by puppet but are there rather for informational/documentation purposes in order to facilitate manual provisioning by a human (it's my doing mostly hence the comment). As far as the db creation and extensions (and users IIRC) they are indeed managed by puppet (again I am the perpetrator of that and I have to admin that I doubt somewhat my own approach).

If we're not setting it up through puppet, the easiest option is to run sudo -u postgres psql -d ct -c 'CREATE SCHEMA loading AUTHORIZATION tilerator;'

Possibly we should be doing this as osmimporter, have them own the DB, and set the permissions more specifically, but that can be sorted out when @Gehel gets back.

I don't even see a database ct on any of maps100X, maps200X or maps-test200X clusters. I am thinking the database should be created before we even delve into this task ? When is @Gehel coming back ? I 'd rather refer back to him on this one after all, I now feel I am missing crucial context.

pnorman@maps-test2004:~$ psql -h localhost -U tilerator -l
                                  List of databases
   Name    |  Owner   | Encoding |   Collate   |    Ctype    |   Access privileges
-----------+----------+----------+-------------+-------------+-----------------------
 ct        | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 |
 gis       | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 |
 postgres  | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 |
 template0 | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/postgres          +
           |          |          |             |             | postgres=CTc/postgres
 template1 | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/postgres          +
           |          |          |             |             | postgres=CTc/postgres

I thought I had made it clear it was maps-test2004 which was the only one that needed this, but see that it was just in the IRC messages, which didn't get a responsible over multiple days.

I don't even see a database ct on any of maps100X, maps200X or maps-test200X clusters. I am thinking the database should be created before we even delve into this task ? When is @Gehel coming back ? I 'd rather refer back to him on this one after all, I now feel I am missing crucial context.

pnorman@maps-test2004:~$ psql -h localhost -U tilerator -l
                                  List of databases
   Name    |  Owner   | Encoding |   Collate   |    Ctype    |   Access privileges
-----------+----------+----------+-------------+-------------+-----------------------
 ct        | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 |
 gis       | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 |
 postgres  | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 |
 template0 | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/postgres          +
           |          |          |             |             | postgres=CTc/postgres
 template1 | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/postgres          +
           |          |          |             |             | postgres=CTc/postgres

I thought I had made it clear it was maps-test2004 which was the only one that needed this, but see that it was just in the IRC messages, which didn't get a responsible over multiple days.

And I forgot to look at all the maps-testXXXX boxes and just looked at a couple instead. Sorry about that. Anyway I 've went the sudo -u postgres psql -d ct -c 'CREATE SCHEMA loading AUTHORIZATION tilerator;' way for now to unblock you. Feel free to discuss this more with @Gehel later on about the best approach.

Pnorman moved this task from Stalled/Waiting to Done on the Maps-Sprint board.Apr 24 2018, 1:59 AM

It's now documented, and for future loads it'll be part of the install instructions

Gehel closed this task as Resolved.Jun 12 2018, 4:06 PM
Gehel claimed this task.