We maintain a restbase-dev cluster that is used to test new RESTBase features under conditions similar to production. To mimic the production environment, there is a dev installation of change-prop that listens to Kafka events from production, samples them according to the size difference between the dev and production clusters, and updates RESTBase.
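As an illustration of the sampling step, here is a minimal sketch of deterministic, ratio-based sampling. It is not change-prop's actual implementation; the function names, the use of an event ID as the hash key, and the node counts are all assumptions made for the example.

```python
import hashlib


def sample_rate_for(dev_nodes: int, prod_nodes: int) -> float:
    """Fraction of production events to replay in dev, based on the
    relative cluster sizes (hypothetical sizing rule)."""
    if dev_nodes <= 0 or prod_nodes <= 0:
        raise ValueError("node counts must be positive")
    return min(1.0, dev_nodes / prod_nodes)


def should_process(event_id: str, sample_rate: float) -> bool:
    """Decide deterministically whether an event is kept.

    Hashing the event ID (instead of calling a random generator) means the
    same event always gets the same decision, even across restarts.
    """
    if not 0.0 <= sample_rate <= 1.0:
        raise ValueError("sample_rate must be within [0, 1]")
    digest = hashlib.sha256(event_id.encode("utf-8")).digest()
    # Map the first 8 bytes of the digest to a number in [0, 1).
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return bucket < sample_rate


# Hypothetical cluster sizes: 3 dev nodes versus 30 production nodes.
rate = sample_rate_for(3, 30)
kept = [i for i in range(10_000) if should_process(f"event-{i}", rate)]
```

With these assumed sizes, roughly one event in ten is forwarded to the dev RESTBase.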
The dev change-prop is configured not to post any events to the production Kafka; however, it still connects to it, which, together with a flaw in the deployment of this installation, caused a severe outage of the main Kafka cluster in eqiad.
To prevent this from happening again, we need to isolate the dev instance as much as possible by setting up a single-node Kafka instance in dev and running MirrorMaker from eqiad and codfw to mirror the change-prop-related events there. The dev Kafka installation will be much smaller than production, with a much shorter retention period and only 500 small events/s, so I believe it will not severely affect the dev environment in terms of memory, disk, or CPU: the cluster's machines have 64 GB of memory each, and Kafka will likely need only about 1 GB on a single machine.
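The disk side of this estimate can be checked with a back-of-the-envelope calculation. The sketch below only multiplies rate, event size, and retention. The 500 events/s figure comes from the text above, while the average event size and the retention period are placeholder assumptions that should be replaced with measured values.

```python
def estimate_disk_bytes(events_per_sec: float,
                        avg_event_bytes: float,
                        retention_hours: float,
                        replication_factor: int = 1) -> float:
    """Rough upper bound on the log data retained by Kafka.

    Ignores compression, index files, and segment rounding, so treat the
    result as an order-of-magnitude figure only.
    """
    retention_seconds = retention_hours * 3600
    return (events_per_sec * avg_event_bytes
            * retention_seconds * replication_factor)


# 500 events/s is from the description above; 1 KiB per event and a
# 1-hour retention are hypothetical values chosen for illustration.
needed = estimate_disk_bytes(events_per_sec=500,
                             avg_event_bytes=1024,
                             retention_hours=1)
print(f"{needed / 2**30:.2f} GiB retained on a single-node instance")
```

A single-node instance uses a replication factor of 1, which is why that is the default here.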