
Configure MariaDB database for DataHub on an-coord1001
Closed, Resolved, Public

Description

We need a database for the DataHub MVP, so it makes sense to use the existing MariaDB instance on an-coord1001.

The initial creation of the database user and schema was done for the prototype here: T299703#7659244

There are two database init scripts provided, one for mysql and one for mariadb:

We can use either one. The only difference is that the mysql version creates a metadata_index table, which according to the DataHub Slack is no longer used. (TODO: send a PR removing it from the script)
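For illustration only, a minimal sketch of how the two init scripts could be compared by the tables they create. The two SQL strings are shortened, hypothetical stand-ins for the real scripts (only the metadata_index name comes from this task; metadata_aspect_v2 and the column definitions are assumptions), and the regex only handles simple CREATE TABLE statements.

```python
import re

# Hypothetical, heavily shortened stand-ins for the two init scripts.
MYSQL_SAMPLE = """
CREATE TABLE IF NOT EXISTS metadata_aspect_v2 (urn VARCHAR(500) NOT NULL);
CREATE TABLE IF NOT EXISTS metadata_index (id BIGINT NOT NULL);
"""
MARIADB_SAMPLE = """
CREATE TABLE IF NOT EXISTS metadata_aspect_v2 (urn VARCHAR(500) NOT NULL);
"""

CREATE_TABLE = re.compile(
    r"CREATE\s+TABLE\s+(?:IF\s+NOT\s+EXISTS\s+)?`?(\w+)`?",
    re.IGNORECASE,
)


def table_names(sql: str) -> set:
    """Return the lowercased names of all tables created in the given SQL text."""
    return {name.lower() for name in CREATE_TABLE.findall(sql)}


# Tables that only the mysql variant creates.
only_in_mysql = table_names(MYSQL_SAMPLE) - table_names(MARIADB_SAMPLE)
print(sorted(only_in_mysql))
```

Running the same comparison against the real script contents would confirm whether metadata_index is still the only difference.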

Event Timeline

BTullis triaged this task as High priority.
BTullis moved this task from Next Up to In Progress on the Data-Catalog board.
BTullis moved this task from Next Up to In Progress on the Data-Engineering-Kanban board.

Change 769993 had a related patch set uploaded (by Btullis; author: Btullis):

[operations/puppet@production] Allow access to MariaDB analytics-meta from Kubernetes pods

https://gerrit.wikimedia.org/r/769993

I have created a CR that allows the wikikube and staging Kubernetes clusters access to the MariaDB port running on an-coord100[1-2].
https://gerrit.wikimedia.org/r/c/operations/puppet/+/769993
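
Once the change is deployed, reachability could be checked from inside a pod with something like the sketch below. The fully qualified host names and port 3306 are assumptions for illustration (this task only names an-coord100[1-2] and "the MariaDB port"), so adjust them to the real values.

```python
import socket

# Assumed values: the FQDNs and the port are illustrative, not taken from this task.
HOSTS = ["an-coord1001.eqiad.wmnet", "an-coord1002.eqiad.wmnet"]
PORT = 3306


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS resolution failures.
        return False


if __name__ == "__main__":
    for host in HOSTS:
        state = "reachable" if can_connect(host, PORT) else "blocked or down"
        print(f"{host}:{PORT} {state}")
```

This only tests that the TCP port is open from the pod; it does not verify database credentials or grants.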

I will create the datahub database and its user account manually on an-coord1001.
However, I will try using the mysql-setup-job to create the schema, if possible. This way we will have a mechanism to apply schema updates automatically whenever the source file changes.
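
As a rough sketch of the manual step, the statements might look like the output of the helper below. The user name, the client host pattern and the broad GRANT are all assumptions for illustration; only the datahub database name comes from this task, and the password is deliberately left as a placeholder.

```python
def bootstrap_statements(database: str, user: str, client_host: str) -> list:
    """Build SQL for a manual database and user bootstrap (password is a placeholder)."""
    return [
        f"CREATE DATABASE IF NOT EXISTS `{database}`;",
        f"CREATE USER IF NOT EXISTS '{user}'@'{client_host}' IDENTIFIED BY '<password>';",
        f"GRANT ALL PRIVILEGES ON `{database}`.* TO '{user}'@'{client_host}';",
    ]


# Hypothetical values: the user name and the '%' host pattern are not from this task.
for statement in bootstrap_statements("datahub", "datahub", "%"):
    print(statement)
```

In practice the host pattern would probably be narrowed to the Kubernetes pod address ranges rather than '%', and the privileges reduced to what the setup job and the application need.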

Change 769993 merged by Btullis:

[operations/puppet@production] Allow access to MariaDB analytics-meta from Kubernetes pods

https://gerrit.wikimedia.org/r/769993

I have created the user manually on an-coord1001.


The password is in pwstore.

@BTullis you could instead put the password in puppet private repo, and then use that to render it in the configs you need. IIRC, you can also use this in puppet to render secrets into values files that are only available on the deployment host and that will be available in your helm templates.

@BTullis you could instead put the password in puppet private repo, and then use that to render it in the configs you need.

Thanks, yes, I do intend to use this method, and when it comes to it I will transfer the password from pwstore to puppet (to avoid duplication).

I have now moved this password to the private repo, where it is in: hieradata/role/common/deployment_server/kubernetes.yaml
The entry in pwstore has been deleted.

Change 774458 had a related patch set uploaded (by Btullis; author: Btullis):

[operations/puppet@production] Use test coordinator for staging datahub deploy

https://gerrit.wikimedia.org/r/774458

Change 774458 merged by Btullis:

[operations/puppet@production] Use test coordinator for staging datahub deploy

https://gerrit.wikimedia.org/r/774458

Change 777329 had a related patch set uploaded (by Btullis; author: Btullis):

[operations/puppet@production] Allow wikikube staging pods to access the analytics-meta test instance

https://gerrit.wikimedia.org/r/777329

Change 777329 merged by Btullis:

[operations/puppet@production] Allow wikikube staging pods to access the analytics-meta test instance

https://gerrit.wikimedia.org/r/777329