===Conclusions===
OpenMetadata is a relatively new offering, started by engineers who previously worked on Uber's data platform. The project emphasizes a concise feature set and good UX, along with connectivity to more modern data stacks.
====Pros====
* Incredibly responsive community, on Slack and GitHub
* Fast and simple UI
* Very easy to install
====Cons====
* Hive connector is in a very early alpha stage
* MySQL 8+ is required, which is hard for us to set up here
* The UI allows too much manual metadata wrangling
===Run===
To run this, make sure the three services below are running on the test host, then open an SSH tunnel from your local machine and visit http://localhost:8585
```
# everything was set up under user milimetric, but should be easy to copy
# on the test host: start the three user-level services
ssh an-test-client1001.eqiad.wmnet
systemctl --user start mysql8
systemctl --user start opensearch
systemctl --user start openmetadata
# from your local machine: forward local port 8585 to the OpenMetadata UI
ssh -N an-test-client1001.eqiad.wmnet -L 8585:127.0.0.1:8585
```
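Before opening the browser, it can help to confirm that the tunnel is actually forwarding. A minimal sketch, using only the Python standard library; the helper name `port_is_open` is made up for this page, and the check only proves that something accepts TCP connections on the port, not that it is OpenMetadata:

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    # 8585 is the local end of the SSH tunnel from the commands above.
    if port_is_open("127.0.0.1", 8585):
        print("tunnel is up, open http://localhost:8585")
    else:
        print("nothing is listening on 127.0.0.1:8585 yet")
```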
===Steps to Reproduce Installation===
[] MySQL 8+ (the versions of MariaDB that we support are not compatible)
** [] [[ https://dev.mysql.com/downloads/ | find an archive ]] like mysql-server_8.0.28-1debian10_amd64.deb-bundle.tar
** [] extract with `dpkg-deb -x`
** [] create a database named "openmetadata" and a user with access to it
** [] create a systemd unit, could look like:
```
[Unit]
Description=MySQL server, version 8
[Service]
Type=simple
Environment=LD_LIBRARY_PATH=/srv/data-catalog-tmp/mysql-server-chroot/usr/lib/x86_64-linux-gnu
ExecStart=/srv/data-catalog-tmp/mysql-server-chroot/usr/sbin/mysqld --defaults-file=/srv/data-catalog-tmp/mysql-server-chroot/etc/mysql/mysql.cnf
[Install]
WantedBy=multi-user.target
```
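The "create database and user" step above is not spelled out in the notes. As a rough sketch, the statements could be generated like this and piped into the `mysql` client once the server is up. The user name, password and host in the usage example are placeholders, not what was actually used, and `bootstrap_sql` is a hypothetical helper:

```python
import re
from typing import List

# Deliberately conservative: letters, digits and underscores only.
_IDENTIFIER = re.compile(r"^[A-Za-z0-9_]+$")
# Host part of a MySQL account: names, IPs, or the % wildcard.
_HOST = re.compile(r"^[A-Za-z0-9_.%-]+$")


def bootstrap_sql(database: str, user: str, password: str,
                  host: str = "localhost") -> List[str]:
    """Build MySQL 8 statements that create a database and a user with access."""
    for name in (database, user):
        if not _IDENTIFIER.match(name):
            raise ValueError(f"unsafe identifier: {name!r}")
    if not _HOST.match(host):
        raise ValueError(f"unsafe host: {host!r}")
    if "'" in password or "\\" in password:
        raise ValueError("password must not contain quotes or backslashes")
    account = f"'{user}'@'{host}'"
    return [
        f"CREATE DATABASE IF NOT EXISTS `{database}`;",
        f"CREATE USER IF NOT EXISTS {account} IDENTIFIED BY '{password}';",
        f"GRANT ALL PRIVILEGES ON `{database}`.* TO {account};",
    ]


if __name__ == "__main__":
    # Placeholder credentials, replace before use.
    for statement in bootstrap_sql("openmetadata", "openmetadata_user", "changeme"):
        print(statement)
```

The output can be redirected to a file and fed to the `mysql` client from the extracted bundle.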
[] OpenSearch (to satisfy the ElasticSearch requirement)
** [] [[ https://opensearch.org/docs/latest/opensearch/install/tar/ | download ]] and extract
** [] in `./config/opensearch.yml` add `plugins.security.disabled: true`
** [] create a systemd unit, ours looks like:
```
[Unit]
Description=OpenSearch server, running with security plugin disabled (pw admin:admin)
[Service]
Type=simple
Environment=LD_LIBRARY_PATH=/srv/data-catalog-tmp/mysql-server-chroot/usr/lib/x86_64-linux-gnu
ExecStart=/home/milimetric/opensearch-1.2.4/bin/opensearch
[Install]
WantedBy=multi-user.target
```
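The `opensearch.yml` edit above can be made repeatable. A minimal sketch, assuming the setting is written as a flat top-level `key: value` line (no nested YAML); `ensure_setting` is a hypothetical helper, not part of OpenSearch:

```python
from pathlib import Path


def ensure_setting(config_path: str, key: str, value: str) -> bool:
    """Set `key: value` in a flat YAML file; return True if the file changed."""
    path = Path(config_path)
    lines = path.read_text().splitlines() if path.exists() else []
    wanted = f"{key}: {value}"
    for i, line in enumerate(lines):
        # Commented-out lines start with '#', so they never match the key.
        if line.split(":", 1)[0].strip() == key:
            if line.strip() == wanted:
                return False  # already set, leave the file untouched
            lines[i] = wanted
            break
    else:
        # Key not present anywhere: append it at the end.
        lines.append(wanted)
    path.write_text("\n".join(lines) + "\n")
    return True


if __name__ == "__main__":
    changed = ensure_setting("./config/opensearch.yml",
                             "plugins.security.disabled", "true")
    print("updated" if changed else "already set")
```

Running it twice leaves the file unchanged the second time, which makes it safe to keep in a setup script.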
[] [[ https://github.com/open-metadata/OpenMetadata/releases | download ]] and [[ https://docs.open-metadata.org/install/run-in-production | install ]] OpenMetadata
** [] use Java 11: `export JAVA_HOME=/usr/lib/jvm/java-11-openjdk-amd64`
** [] with MySQL running and the database and user created, configure OpenMetadata to point to it
** [] run `./bootstrap/bootstrap_storage.sh migrate` to create tables
** [] point to OpenSearch from config (never dug into this)
** [] systemd unit:
```
[Unit]
Description=OpenMetadata service
[Service]
Type=forking
Environment=JAVA_HOME=/usr/lib/jvm/java-11-openjdk-amd64
ExecStart=/home/milimetric/openmetadata-0.8.0/bin/openmetadata.sh start
ExecStop=/home/milimetric/openmetadata-0.8.0/bin/openmetadata.sh stop
[Install]
WantedBy=multi-user.target
```
===Metadata Ingestion===
```
conda create -n airflow_py_38 python=3.8
conda activate airflow_py_38
export https_proxy=http://webproxy.eqiad.wmnet:8080
pip install wheel
pip install hmsclient
# quote the extras so shells like zsh do not treat the brackets as a glob
pip install 'apache-airflow[hdfs,hive,kerberos]'
pip install flask-admin==1.4.0
pip install pyarrow
pip install 'openmetadata-ingestion[hive,data-profiler]'
```
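After the installs above, a quick import check inside the conda environment can catch a package that failed to build. This is only a sketch: `missing_modules` is a hypothetical helper, and the module names in the usage example (`hmsclient`, `airflow`, `pyarrow`, `metadata`) are the top-level import names I would expect for those packages, which is worth verifying against the installed versions:

```python
import importlib.util
from typing import Iterable, List


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Assumed import names for the packages installed above.
    expected = ["hmsclient", "airflow", "pyarrow", "metadata"]
    missing = missing_modules(expected)
    if missing:
        print("not importable:", ", ".join(missing))
    else:
        print("all ingestion dependencies are importable")
```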
I hit a bug with the Hive connector, which I [[ https://github.com/open-metadata/OpenMetadata/issues/2531 | filed ]]; the maintainers fixed it, and I hacked around it in the meantime.
There were more problems configuring the connector, so I opened an issue and hacked around the limitations.
In the end, our ingestion config looked like:
```
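import os
import pwd

# the Kerberos username is taken from the local Unix user running the ingestion
current_user = pwd.getpwuid(os.getuid()).pw_name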
config = """
{
"source": {
"type": "hive",
"config": {
"database": "wmf",
"host_port": "analytics-hive.eqiad.wmnet",
"service_name": "hive_test_cluster",
"generate_sample_data": "true",
"scheme": "hive",
"query": "select * from {}.{} where year=2022 limit 50",
"data_profiler_enabled": "false",
"data_profiler_offset": "0",
"data_profiler_limit": "50000",
"connect_args": {
"auth": "KERBEROS",
"username": """ + '"' + current_user + '"' + """,
"kerberos_service_name": "hive"
}
}
},
"sink": {
"type": "metadata-rest",
"config": {}
},
"metadata_server": {
"type": "metadata-server",
"config": {
"api_endpoint": "http://localhost:8585/api",
"auth_provider_type": "no-auth"
}
}
}
"""
```
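The string concatenation used to splice in the username is easy to get wrong. One alternative, shown here as a sketch and not as what was actually run, is to build the same config as a Python dict and let `json` handle the quoting. The keys and values are copied from the config above; `build_ingestion_config` is a hypothetical helper name:

```python
import json
import os
import pwd


def build_ingestion_config(username: str) -> dict:
    """Return the Hive ingestion config from these notes as a plain dict."""
    return {
        "source": {
            "type": "hive",
            "config": {
                "database": "wmf",
                "host_port": "analytics-hive.eqiad.wmnet",
                "service_name": "hive_test_cluster",
                "generate_sample_data": "true",
                "scheme": "hive",
                "query": "select * from {}.{} where year=2022 limit 50",
                "data_profiler_enabled": "false",
                "data_profiler_offset": "0",
                "data_profiler_limit": "50000",
                "connect_args": {
                    "auth": "KERBEROS",
                    "username": username,
                    "kerberos_service_name": "hive",
                },
            },
        },
        "sink": {"type": "metadata-rest", "config": {}},
        "metadata_server": {
            "type": "metadata-server",
            "config": {
                "api_endpoint": "http://localhost:8585/api",
                "auth_provider_type": "no-auth",
            },
        },
    }


if __name__ == "__main__":
    current_user = pwd.getpwuid(os.getuid()).pw_name
    # Same JSON as the hand-built string, with quoting done by the json module.
    print(json.dumps(build_ingestion_config(current_user), indent=2))
```

The dict (or the JSON it serializes to) can then be handed to whatever entry point the installed `openmetadata-ingestion` version expects; the exact call is not recorded in these notes.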
(For the record, the relevant parts above were copied from the initial notes, which also cover a failed attempt to compile MySQL 8: https://app.slack.com/docs/T024KLHS4/F02V248G4BV?origin_team=T024KLHS4&origin_channel=C02UCDD7FKK )