Our multi-DC Kafka setup works like this:
- Producers prefix topics with their datacenter name, e.g. eqiad.mediawiki.client.error
- Kafka MirrorMaker in the local DC consumes, from the remote DC, all topics prefixed with the remote DC name.
In this way, eqiad topics -> codfw, and codfw topics -> eqiad, and all datacenter prefixed topics are available in each local Kafka cluster. This allows consumers that need all messages from a stream to consume from a single local Kafka cluster.
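As a rough illustration of the prefixing convention described above, here is a minimal Python sketch. The helper names (`prefixed_topic`, `mirror_pattern`, `topics_to_mirror`) are hypothetical and only model the topic selection logic; this is not the actual MirrorMaker configuration.

```python
import re


def prefixed_topic(datacenter, stream):
    """Build the topic name a producer in `datacenter` would write to."""
    return f"{datacenter}.{stream}"


def mirror_pattern(remote_dc):
    """Regex matching topics that carry the remote DC's prefix (illustrative only)."""
    return re.compile(re.escape(remote_dc) + r"\..+")


def topics_to_mirror(remote_dc, remote_topics):
    """Select the topics a local mirror would pull from the remote cluster."""
    pattern = mirror_pattern(remote_dc)
    return sorted(t for t in remote_topics if pattern.fullmatch(t))


if __name__ == "__main__":
    # Hypothetical topic listing on the codfw cluster, which already holds
    # eqiad-prefixed topics mirrored in the other direction.
    codfw_cluster_topics = [
        prefixed_topic("codfw", "mediawiki.client.error"),
        prefixed_topic("eqiad", "mediawiki.client.error"),
    ]
    # A mirror running in eqiad selects only codfw-prefixed topics.
    print(topics_to_mirror("codfw", codfw_cluster_topics))
```

Because each direction selects only topics carrying the remote DC's own prefix, already-mirrored topics are not copied back to the cluster they came from.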
Kafka logging clusters were not set up this way, because they were expected to have only a single consumer: logstash. Logstash is configured (correct?) to consume from the Kafka logging clusters in each DC and produce to the local ELK stack.
We should set up proper cross-DC mirroring for Kafka logging so that consumers don't have to consume from multiple remote Kafka clusters.
We can set up MirrorMaker to do this without altering the existing logstash consumer setup. Later (if desired), logstash can be reconfigured to consume only from the local DC.