
Remove the need for karapace by using the schema registry built into DataHub
Open, High, Public

Description

When we initially deployed DataHub it had a hard dependency on Confluent's Schema Registry.
We cannot use this software, owing to its incompatible licence.

We worked around this issue by deploying Karapace, which is a drop-in replacement for Schema Registry.

However, in the time since we did that work, DataHub have acted upon our request to remove this dependency.

They have integrated their own schema registry solution with the DataHub GMS component.
There is a video about how to use this feature here: https://www.youtube.com/watch?v=6VgIBlQppZ0 and some slides here: https://docs.google.com/presentation/d/1jem5d5lDaJdF4r6UX_rDJOGc6nMxbNBMK8CtKp4sT0k/edit#slide=id.g1cc8bf3bf5f_4_0

We would need to set the SCHEMA_REGISTRY_TYPE environment variable to INTERNAL, as described here: https://datahubproject.io/docs/deploy/environment-vars/#kafka
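
As a rough illustration of the change, the sketch below checks whether a set of environment variables selects the built-in registry, and produces a copy with the setting applied. Only the variable name SCHEMA_REGISTRY_TYPE and the value INTERNAL come from this task; the helper names are hypothetical, and how the variable actually reaches the GMS container (Helm values, deployment manifests) is not covered here.

```python
import os

# Variable name and value are taken from the task description.
SCHEMA_REGISTRY_ENV = "SCHEMA_REGISTRY_TYPE"
INTERNAL = "INTERNAL"


def uses_internal_registry(env=None):
    """Return True if env selects the schema registry built into DataHub.

    Hypothetical helper: compares the value exactly, since this sketch
    makes no assumption about how DataHub treats other spellings.
    """
    env = os.environ if env is None else env
    return env.get(SCHEMA_REGISTRY_ENV) == INTERNAL


def with_internal_registry(env):
    """Return a copy of env with SCHEMA_REGISTRY_TYPE set to INTERNAL."""
    updated = dict(env)
    updated[SCHEMA_REGISTRY_ENV] = INTERNAL
    return updated
```

A check like this could be run against the rendered container environment before Karapace is removed, to confirm that the setting has actually been applied.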

Event Timeline

Gehel triaged this task as High priority. Fri, May 3, 3:51 PM
Gehel moved this task from Incoming to Scratch on the Data-Platform-SRE board.
Gehel moved this task from Scratch to Quarterly Goals on the Data-Platform-SRE board.