Our kafka-main cluster is multi-tenant and currently supports 3 use cases: ChangeProp, JobQueue and EventStreams, with more potentially to be added. Colocating EventStreams and ChangeProp makes total sense: the 2 services share the same set of events, and using the same Kafka cluster lets us avoid relying on MirrorMaker, which reduces the number of moving parts.
However, JobQueue uses a completely separate set of events and, in general, has stricter requirements for stability and uptime, since it's an essential part of MediaWiki core. Right now, a bug in a less important service that brings down the kafka-main cluster also brings down the JobQueue (as happened in this incident).
We should think about whether it's possible to completely separate the JobQueue from other services at the Kafka level. The only option I currently see is to split the cluster; however, it might be possible to employ less drastic solutions, like using Kafka ACLs to prohibit services from interacting with other services' topics.
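
To make the ACL idea a bit more concrete, here is a minimal sketch of what a per-service, prefix-based policy could look like. Everything specific in it is an assumption for illustration: the principal names (`User:jobqueue` etc.), the topic prefixes (`jobqueue.`, `events.`) and the broker address do not reflect our actual naming. It also assumes the brokers have an authorizer enabled, that clients authenticate, and that the Kafka version is recent enough for `kafka-acls.sh` to accept `--bootstrap-server` and prefixed resource patterns. Consumer group ACLs are left out for brevity.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class TopicAcl:
    principal: str                # hypothetical Kafka principal for a service
    topic_prefix: str             # hypothetical prefix of the topics it owns
    operations: Tuple[str, ...]   # Kafka ACL operations to allow


# Hypothetical policy: each service may only touch its own topic prefix.
ACLS = [
    TopicAcl("User:jobqueue", "jobqueue.", ("Read", "Write")),
    TopicAcl("User:changeprop", "events.", ("Read", "Write")),
    TopicAcl("User:eventstreams", "events.", ("Read",)),
]


def is_allowed(principal: str, topic: str, operation: str, acls=ACLS) -> bool:
    """Default-deny check: access needs an explicit matching allow rule."""
    return any(
        acl.principal == principal
        and topic.startswith(acl.topic_prefix)
        and operation in acl.operations
        for acl in acls
    )


def kafka_acls_args(acl: TopicAcl, bootstrap_server: str) -> List[str]:
    """Build the kafka-acls.sh argument list that would add this rule."""
    args = [
        "kafka-acls.sh",
        "--bootstrap-server", bootstrap_server,
        "--add",
        "--allow-principal", acl.principal,
    ]
    for operation in acl.operations:
        args += ["--operation", operation]
    # Match every topic starting with the prefix instead of one literal name.
    args += ["--topic", acl.topic_prefix, "--resource-pattern-type", "prefixed"]
    return args


if __name__ == "__main__":
    for acl in ACLS:
        # "kafka-main.example:9092" is a placeholder broker address.
        print(" ".join(kafka_acls_args(acl, "kafka-main.example:9092")))
```

With such a policy, `is_allowed("User:changeprop", "jobqueue.refreshLinks", "Read")` is `False`, so a misbehaving ChangeProp client could not produce to or consume from JobQueue topics. Note that ACLs only restrict access; they would not by themselves protect the JobQueue from a service overloading the shared brokers, which is the case where splitting the cluster still looks like the stronger option.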