Running DataHub in Kubernetes requires defining four separate components:
- DataHub Metadata Server (GMS)
- DataHub Frontend
- MCE Consumer Job
- MAE Consumer Job
Each of these components uses its own Docker image, built by the deployment pipeline.
There are also three setup tasks that pre-populate the MySQL, Elasticsearch, and Kafka data stores. These tasks should be configured not to run in production, but they are required in a development environment.
All of these components of DataHub are stateless.
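
As a rough illustration of the layout described above, the following Python sketch builds one Kubernetes `Deployment` manifest per component and adds the three setup tasks as `Job` manifests only outside production. Using a `Deployment` for each component is an assumption that follows from the components being stateless. The resource names, image names, the `environment` parameter, and the helper functions are all hypothetical placeholders, not the project's actual configuration.

```python
import json

# The four components come from the list above. The resource names and
# image references are hypothetical placeholders, not published images.
COMPONENTS = {
    "datahub-gms": "example-registry/datahub-gms:latest",
    "datahub-frontend": "example-registry/datahub-frontend:latest",
    "datahub-mce-consumer": "example-registry/datahub-mce-consumer:latest",
    "datahub-mae-consumer": "example-registry/datahub-mae-consumer:latest",
}

# The three setup tasks (MySQL, Elasticsearch, Kafka); names and images
# are likewise hypothetical.
SETUP_TASKS = {
    "mysql-setup": "example-registry/mysql-setup:latest",
    "elasticsearch-setup": "example-registry/elasticsearch-setup:latest",
    "kafka-setup": "example-registry/kafka-setup:latest",
}


def deployment(name, image, replicas=1):
    """Return a minimal apps/v1 Deployment manifest as a dict."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [{"name": name, "image": image}]},
            },
        },
    }


def setup_job(name, image):
    """Return a minimal batch/v1 Job manifest for a one-off setup task."""
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": name},
        "spec": {
            "template": {
                "spec": {
                    "restartPolicy": "OnFailure",
                    "containers": [{"name": name, "image": image}],
                }
            }
        },
    }


def build_manifests(environment):
    """Build all manifests; setup jobs are skipped when environment is 'production'."""
    manifests = [deployment(n, i) for n, i in COMPONENTS.items()]
    if environment != "production":
        manifests += [setup_job(n, i) for n, i in SETUP_TASKS.items()]
    return manifests


if __name__ == "__main__":
    # Print the development manifests as JSON for inspection.
    print(json.dumps(build_manifests("development"), indent=2))
```

Because the components hold no state, each one can be scaled by changing `replicas` without any persistent volume configuration; the data stores themselves are outside the scope of this sketch.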