===Conclusions===
Amundsen is a good contender, but it ultimately relies on Apache Atlas for good Hive integration, which would complicate our deployment considerably.
====Pros====
* simple architecture: three Flask services, all in Python (as opposed to DataHub, which uses both Java and Python)
* ingestion architecture is simple: Python scripts or Airflow DAGs that make HTTP API requests
* "social" ui features, like frequent users and owners
* loose coupling means you can use a relational database as the data store rather than Neo4j (https://github.com/amundsen-io/amundsenrds)
====Cons====
* the community seems to be losing steam: https://github.com/amundsen-io/amundsen#blog-posts-and-interviews lists a flurry of events in 2019/2020 but nothing in 2021
* only supports polling for data updates, unless we also deploy Atlas; a push ingestion API is on their roadmap
* documentation is somewhat lacking: few ingestion examples, and broken links in the docs
* some dependencies are getting out of date: Elasticsearch 6 (v7 was released in 2019), Node.js 12 (v13 was released in 2019)
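To make the polling limitation concrete, here is a rough sketch of what poll-based ingestion implies: each run fetches the full list of tables and has to work out what changed since the previous run. This is not Amundsen's code. The `Snapshot` type, the schema fingerprints and the `diff_snapshots` helper are all hypothetical names made up for illustration.

```python
from typing import Dict, Set, Tuple

# Hypothetical shape of one poll result: table name -> schema fingerprint.
Snapshot = Dict[str, str]


def diff_snapshots(
    previous: Snapshot, current: Snapshot
) -> Tuple[Set[str], Set[str], Set[str]]:
    """Compare two polls and return (added, removed, changed) table names."""
    added = set(current) - set(previous)
    removed = set(previous) - set(current)
    # A table counts as changed when it is in both polls with a different fingerprint.
    changed = {
        name
        for name in set(previous) & set(current)
        if previous[name] != current[name]
    }
    return added, removed, changed


# Two made-up polls of a metastore, e.g. taken one scheduled run apart.
yesterday = {"wmf.webrequest": "a1", "wmf.pageview_hourly": "b1"}
today = {"wmf.webrequest": "a2", "wmf.mediawiki_history": "c1"}
added, removed, changed = diff_snapshots(yesterday, today)
```

With polling, changes are only noticed on the next scheduled run, and every run pays for a full listing. A push API would let the source report each change as it happens.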
===Run===
* (from T300756#7683747)
* Tunnel with `ssh -N -L 5000:localhost:5000 stat1008.eqiad.wmnet`
* Browse to http://localhost:5000
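If the page does not load, a quick way to check whether the local end of the tunnel is listening is a plain TCP connect. This is a minimal sketch using only the Python standard library. The helper name `port_is_open` is made up here, and the defaults assume the `localhost:5000` forward from the ssh command above.

```python
import socket


def port_is_open(host: str = "localhost", port: int = 5000, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts and name resolution failures.
        return False


if __name__ == "__main__":
    if port_is_open():
        print("local port 5000 is accepting connections")
    else:
        print("nothing is listening on local port 5000; is the ssh tunnel running?")
```

Note that this only confirms that something accepts connections on the local port. ssh opens the local listener even when the service on the remote host is down, so a successful check does not prove that the Amundsen frontend itself is responding.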
===Steps to Reproduce Installation===
* Set up Elasticsearch (we'll use OpenSearch here as well)
* Set up Neo4j; we had some trouble setting up SSL, see T300756#7677142
* Configure and launch all the services as described in the documentation and in T300756#7673715
===Ingestion===
* Ingest Hive Metastore: T300756#7683747 (second half of that comment)
* Ingest Druid: T300756#7683858
(see [[ https://wikimedia.slack.com/archives/C02UCDD7FKK/p1643920813051479 | Slack thread ]])