Page MenuHomePhabricator

Figure out where/how to store IDM internal data
Closed, ResolvedPublic

Description

Various data will be modified when operating/using the IDM:

  • The majority of settings managed by the IDM are stored within the LDAP directory (which is already redundantly stored and allows for multi master changes)
  • Some settings are specific to the setup of the IDM (e.g. configuring which attributes are required/optional e.g.). These will be modified via Puppet centrally
  • Some settings will likely only be stored within the IDM (e.g. preferences within the Django admin backend). One option would be to store these e.g. in sqlite3, but given that we already operate a highly performant MySQL cluster (which is also redundant across data centers), it makes sense to use a MariaDB/MySQL database. This means we need to work with the DBAs to create database(s), manage grants and sort out backups.

Related Objects

StatusSubtypeAssignedTask
OpenNone
OpenNone
OpenNone
OpenSLyngshede-WMF
ResolvedNone
OpenNone
Opentaavi
OpenNone
Resolved Marostegui
ResolvedAndrew
Resolved Marostegui
ResolvedAndrew
DeclinedAndrew
ResolvedAndrew
ResolvedAndrew
ResolvedLadsgroup
DuplicateNone
Resolved Bstorm
DeclinedNone
Resolvedtaavi
ResolvedJdforrester-WMF
DeclinedNone
Openjijiki
OpenNone
OpenFeatureNone
StalledFeatureNone
OpenFeatureSLyngshede-WMF
OpenNone
OpenAndrew
OpenSLyngshede-WMF
OpenABran-WMF
Resolvedtaavi
ResolvedPRODUCTION ERRORTgr
OpenNone
Resolvedbd808
Resolvedyuvipanda
Resolvedbd808
Resolvedbd808
Resolvedbd808
Opentaavi
Resolvedtaavi
DeclinedNone
OpenNone
ResolvedSLyngshede-WMF
Resolved Marostegui

Event Timeline

We need two databases, one for production and one for staging.

The size of the databases are not expected to exceed a few hundred MB.

Marostegui added a project: DBA.
Marostegui subscribed.

Do you have any expected size/growth for those databases?
Amount of writes/reads (roughly)?
Also, the hosts should be able to connect to the dbproxy* hosts, as we route our connections via them.
ie: m5-master.eqiad.wmnet

Growth is expected to be very low, as in 10 - 20 MB per month at the most. Similarly writes/reads will be pretty low as well as users will only need to interact with the system to request access to new systems, or approve/decline requests. Current estimate is at most 100 read requests per minute, when users interact with the system, then large periods of time with little to no reads. Write are expected to be lower.

As development processes I expect to be able to deliver better estimates.

Access via dbproxy is fine, whatever ever the DBA deems best.

That seems fine indeed.
How many users would you need? One with all privileges? one for writes and another one for reads?

Just one user with all privileges. The application is based on Django and will need to be able to manage schema migration internally. I don't think it makes sense to split read and write operation into two users for our use case.

But different users for the production and stage databases.

But different users for the production and stage databases.

That makes sense.

Just one user with all privileges. The application is based on Django and will need to be able to manage schema migration internally. I don't think it makes sense to split read and write operation into two users for our use case.

Are you able to figure out which ones those are?
Obviously I would guess: INSERT, DELETE, UPDATE, SELECT, ALTER, CREATE?

The documentation mostly says "all", but CREATE, ALTER, INDEX, SELECT, UPDATE, INSERT, DELETE, REFERENCES should be the minimum.

Django do prefer that the database is create as utf8, but I suspect that's default: https://docs.djangoproject.com/en/3.2/ref/databases/#creating-your-database

That's ok - any preferred database(s) and user(s) name?

I'm think something like "idm" or identitymanager for the user. Database name is "chefs choice" :-)

We should probably go for something like:

idm
idm_staging

And same for the database name.

Cool, I will try to get it in place tomorrow :)

Mentioned in SAL (#wikimedia-operations) [2022-11-03T07:14:59Z] <marostegui> Create idm and idm_staging databases on m5 T320426

Change 852728 had a related patch set uploaded (by Marostegui; author: Marostegui):

[operations/puppet@production] production-m5.sql.erb: Add idm and idm_staging users

https://gerrit.wikimedia.org/r/852728

I have created both databases on m5 (m5-master.eqiad.wmnet)
The users are also there and avaiable:

root@cumin1001:~# mysql --ssl-verify-server-cert=false -uidm -p -h m5-master.eqiad.wmnet idm
Enter password:
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 12237022
Server version: 10.4.26-MariaDB-log MariaDB Server

Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

idm@m5-master.eqiad.wmnet[idm]> use idm_staging;
ERROR 1044 (42000): Access denied for user 'idm'@'10.64.48.43' to database 'idm_staging'
idm@m5-master.eqiad.wmnet[idm]> Ctrl-C -- exit!
Aborted
root@cumin1001:~# mysql --ssl-verify-server-cert=false -uidm_staging -p -h m5-master.eqiad.wmnet idm_staging
Enter password:
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 12237049
Server version: 10.4.26-MariaDB-log MariaDB Server

Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

idm_staging@m5-master.eqiad.wmnet[idm_staging]> Ctrl-C -- exit!

@SLyngshede-WMF I have left the passwords for you to productionize them on our private repo at: puppetmaster1001:/home/slyngshede/idm

Change 852728 merged by Marostegui:

[operations/puppet@production] production-m5.sql.erb: Add idm and idm_staging users

https://gerrit.wikimedia.org/r/852728

I am closing this for now, reopen if you find issues.