In T123954, we have been investigating options for cross-DC mirroring of messages in Kafka. On January 21, @Joe, @paravoid, @elukey, @mobrovac and I met and decided that the best solution would be to do what LinkedIn describes here: https://engineering.linkedin.com/kafka/running-kafka-scale. That is, our main Kafka setup in each DC would consist of 2 separate Kafka clusters. The first, 'local', cluster receives messages from producers local to that DC. The second, 'aggregate', cluster would mirror all data from clusters in external DCs. Internal consumers could consume from this aggregate cluster in order to receive all messages from all producers in all DCs.
We talked about several ways to scale this infrastructure into the future. We don't technically need this local+aggregate architecture now, but having it allows us to design topics and applications in a more future-proof way.
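To make the local+aggregate layout concrete, here is a rough Python sketch of which clusters would mirror from which. It is only an illustration: the cluster names (`local-eqiad`, `aggregate-codfw`, etc.), the `mirror_plan` function and the `include_own_dc` flag are made up for this example, and it assumes each aggregate cluster mirrors from the 'local' clusters of the other DCs.

```python
from typing import Dict, List


def mirror_plan(dcs: List[str], include_own_dc: bool = False) -> Dict[str, List[str]]:
    """Map each DC's aggregate cluster to the local clusters it mirrors from.

    Cluster names are hypothetical. By default an aggregate cluster only
    mirrors from external DCs, as described in this task; include_own_dc
    is an assumed option for also mirroring the same-DC local cluster.
    """
    plan: Dict[str, List[str]] = {}
    for dc in dcs:
        sources = [
            f"local-{src}"
            for src in dcs
            if include_own_dc or src != dc
        ]
        plan[f"aggregate-{dc}"] = sources
    return plan


if __name__ == "__main__":
    # Two DCs, each with one local and one aggregate cluster.
    for aggregate, sources in mirror_plan(["eqiad", "codfw"]).items():
        print(f"{aggregate} <- {', '.join(sources)}")
```

With two DCs this gives one mirror flow in each direction (codfw's local cluster into eqiad's aggregate cluster, and vice versa). Adding a third DC later would only add entries to the list, which is part of why this layout seems easier to grow.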
So! I'm requesting 4 more Kafka broker nodes to be ordered, 2 in eqiad and 2 in codfw. They can have the same specs as the nodes chosen in T114191.
Thanks!
-Ao