Kafka Connect development work
Closed, Duplicate | Public


Kafka Connect is part of the Apache Kafka project. It is a distributed and pluggable system for getting data into and out of Kafka. We expect Kafka Connect to have many use cases at WMF.

Kafka Connect can be run in either 'standalone' mode, where a single worker process reads its job configuration from local files at startup, or in distributed (clustered) mode, where it runs as a long-lived service and accepts new job configs via a REST API. We need to figure out how we will run Kafka Connect at WMF. Will we run a single Kafka Connect cluster in Kubernetes? Will we run many standalone Kafka Connect instances? Perhaps both!
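
To make the REST-driven mode concrete, here is a minimal Python sketch of submitting a job config to a Connect worker. It assumes a worker reachable at `http://localhost:8083` (the default REST port), and uses the FileStreamSink connector that ships with Apache Kafka purely as a placeholder. The connector name, topic, and file path are hypothetical and are not a proposal for any actual WMF job.

```python
import json
import urllib.request

# Assumption: a Connect worker in distributed (clustered) mode is listening here.
CONNECT_URL = "http://localhost:8083"


def build_connector_request(name, config, base_url=CONNECT_URL):
    """Build (but do not send) the POST /connectors request that creates a job."""
    body = json.dumps({"name": name, "config": config}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/connectors",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def submit_connector(request):
    """Send the request to the worker and return its decoded JSON response."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Hypothetical job: write the records of one topic to a local file.
example_config = {
    "connector.class": "org.apache.kafka.connect.file.FileStreamSinkConnector",
    "tasks.max": "1",
    "topics": "example-topic",          # placeholder topic name
    "file": "/tmp/example-sink.txt",    # placeholder output path
}

request = build_connector_request("example-file-sink", example_config)
# submit_connector(request)  # uncomment only when a Connect worker is running
```

In standalone mode the same key/value pairs would instead live in a properties file passed to the worker at startup, so there is no REST call to make when creating the job.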

The Analytics team's first use case is to replace Camus with Kafka Connect.