Consider disabling automatic topic creation in main-kafka
Closed, Declined, Public

Description

Automatic topic creation is a handy feature, as it reduces the number of manual operational tasks involved in adding new Kafka use cases, new streams, new jobs and new Change-Prop rules. However, even a small software bug can end up creating topics in a loop, which could cause a severe outage of the whole Kafka cluster.

We should think of a way to puppetize Kafka topics and disable automatic topic creation in a way that would be simultaneously handy for operators and safeguard us from software bugs.

Event Timeline

fdans triaged this task as Medium priority. Jul 12 2018, 4:38 PM
fdans moved this task from Incoming to Kafka Work on the Analytics board.

I think a good balance between safety and ease of use would be for Kafka to limit the maximum number of topics allowed. That way we can leave auto-creation on and protect Kafka at the same time. Having a maximum number of topics would also make heap sizing easier, I believe.

I tried to look for a hard limit on the maximum number of topics but didn't find any :(

Thanks for taking a look! I think it'd be useful to ask upstream if they are willing to implement such a limit. For the task at hand I'd say disabling auto-creation sounds good as an interim measure, until it becomes a problem in practice.

The caveat with the maximum number of topics is that Kafka has no hard limit on it because it depends on zookeeper, so effectively it can support as many topics as zk can support znodes.

I'm also +1 on disabling topic auto-creation for now.

Sure, but this is a resource threshold. Nothing would prevent Kafka from just keeping a counter of the topics it manages and refusing to create more once a configured limit is reached.
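To make the counter idea concrete, here is a minimal sketch in plain Python. It is not a Kafka feature or API: `TopicLimiter`, `TopicLimitExceeded`, and `max_topics` are hypothetical names used only for illustration, and the set of topics stands in for whatever metadata the broker actually keeps.

```python
class TopicLimitExceeded(Exception):
    """Raised when creating a topic would exceed the configured cap."""


class TopicLimiter:
    """Hypothetical guard: tracks known topics and enforces a maximum count."""

    def __init__(self, max_topics):
        self.max_topics = max_topics
        self.topics = set()

    def ensure_topic(self, name):
        """Return True if the topic was newly created, False if it existed."""
        if name in self.topics:
            # Existing topics are always allowed; only new ones count.
            return False
        if len(self.topics) >= self.max_topics:
            raise TopicLimitExceeded(
                "refusing to create %r: limit of %d topics reached"
                % (name, self.max_topics)
            )
        self.topics.add(name)
        return True
```

With a guard like this, a client that creates topics in a loop would start getting errors once the cap is hit, instead of growing the topic count without bound.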

@Ottomata what do you think? We could be smart and force puppet to create topics if needed but it seems a bit of an overhead to manage.

One thing that might be good to do is to limit (via ACLs) the users that can create topics. We could, for example, leave auto-creation enabled and then allow only operators and MirrorMaker to create new topics if needed. The main problem, though, is that the API to limit topic creation via ACLs does not seem to exist yet :(

I started commenting yesterday with this idea too, but then stopped because I'm not sure how it helps us! change-prop is what would benefit most from being able to create new topics, since it has all its different queues. But if we had set up an ACL to allow change-prop to create topics, we'd still have had the same outage.

I think if we want to go the ACL route, we probably should go all the way (on main clusters) and restrict everything unless authenticated.

As for disabling auto topic creation in general (even if authenticated)...I'm not excited about it. Managing this in puppet sounds pretty annoying (are we going to enter every possible change-prop topic?). Perhaps change-prop could create its topics using a separate (authenticated) management tool, instead of the runtime processes? Or maybe a more generic tool that manages all topics might be worth considering as part of Modern Event Platform. We'll be building a schema registry type service that will manage the schema -> topic mappings. Perhaps it could handle creation of the topics?
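As a rough sketch of what a separate management tool could look like, the snippet below turns a declared list of topics into `kafka-topics.sh --create` invocations. It only builds the argument lists and does not run anything. The function name, the topic names, the partition and replication values, and the ZooKeeper address are all made up for illustration; the `--zookeeper` flag assumes an older Kafka release, and newer releases take `--bootstrap-server` instead.

```python
def build_create_commands(topics, zookeeper):
    """Build one kafka-topics.sh argument list per declared topic.

    `topics` maps a topic name to a dict with 'partitions' and
    'replication_factor'. Nothing is executed here; the caller decides
    whether to run the commands (for example via subprocess).
    """
    commands = []
    for name in sorted(topics):
        spec = topics[name]
        commands.append([
            "kafka-topics.sh",
            "--create",
            "--if-not-exists",  # makes repeated runs safe for existing topics
            "--zookeeper", zookeeper,
            "--topic", name,
            "--partitions", str(spec["partitions"]),
            "--replication-factor", str(spec["replication_factor"]),
        ])
    return commands


# Hypothetical declaration; real topic names and sizes would differ.
DECLARED_TOPICS = {
    "example.change-prop.retry": {"partitions": 1, "replication_factor": 3},
    "example.change-prop.error": {"partitions": 1, "replication_factor": 3},
}

for command in build_create_commands(DECLARED_TOPICS, "zk.example.org:2181"):
    print(" ".join(command))
```

Because topic creation happens in an explicit step from a fixed declaration, a bug in a runtime process cannot create topics on its own once broker-side auto-creation is turned off.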

Bah, I dunno, a generic tool will be more long term. Not sure what is best here. If y'all want to turn auto creation off, I'd be ok with it.

mforns subscribed.

@elukey we were here in grosking and agreed to look into this to be sure there are no action items left before closing.

I think we should close. Maybe we'll do this one day if we have a really solid 'stream (+topic) config' system, but I doubt we'll do it before that.