
Discovery for Kafka cluster brokers
Open, Medium, Public

Description

Kafka broker lists are currently maintained in puppet, which is not accessible in Helm deployment charts. We could hardcode these there, but it would be nice to have a discovery service for the different Kafka clusters.

The discovery url would only be used for bootstrapping Kafka clients. Kafka clients themselves get the list of brokers to use from Kafka during bootstrap. Having a discovery url would allow us to not have to hardcode broker hostnames in Helm Values.yaml. Instead we could do:

kafka:
  conf:
    metadata.broker.list: kafka.main.discovery.wmnet

Event Timeline

Ottomata triaged this task as Medium priority. Jan 11 2019, 4:30 PM
Ottomata created this task.

Discovery records for Kafka would come in handy in the logging pipeline case too, namely during a datacenter failover, to move producers off a given datacenter (and roll-restart the producers).

Sorry, I need some more specifics:

You want to make a DNS query and get, as a response, the "nearest" Kafka cluster in the form of a list of hostnames/ports? Are we talking about discovery SRV records?

I am not sure we can do that at the moment, but there are other ways to have etcd as a source of truth for such data.

No no, for me, all I want is an alias for the list of Kafka brokers in a given Kafka cluster. I don't need any DC failover stuff. Perhaps discovery is not the right word here. Round Robin DNS might be enough for me.

Might I suggest that you use a SRV dns record instead? It's more appropriate for enumerating members in a cluster. We use those for etcd discovery.

Change 484509 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/dns@master] Add round robin DNS records for Kafka clusters

https://gerrit.wikimedia.org/r/484509

Once we get it, we would need to update Change-Prop, JQ-Change-Prop, EventBus-service and event streams to use the new DNS record.

Might I suggest that you use a SRV dns record instead? It's more appropriate for enumerating members in a cluster. We use those for etcd discovery.

That would indeed be the best solution, but does Kafka support that?

Kafka doesn't support SRV. Hence my Round Robin DNS patch. After more discussion with @BBlack, I think I've decided to abandon this idea and just hardcode the Kafka brokers for now. @akosiaris mentioned that they have identified this problem (config management in helm charts) in other areas too, so I'll just hardcode for now and hope for a better future.

Change 484509 abandoned by Ottomata:
Add round robin DNS records for Kafka clusters

https://gerrit.wikimedia.org/r/484509

akosiaris changed the task status from Open to Stalled. Apr 8 2019, 2:37 PM

Stalling until we have some sane solution.

In T213561#4881255, Joe wrote:

Might I suggest that you use a SRV dns record instead?

In T213561#4882509, Ottomata wrote:

Kafka doesn't support SRV.

If that's the preferred approach, is there an [upstream?] ticket somewhere to track?

FYI, we've recently added a 'general.yaml' values support to our helm charts repo. This allows us to render values from puppet. I'd like to accomplish the intent of this task by just rendering the list of Kafka brokers there. That will be good enough, I really just want to DRY up that list.

cc @hnowlan let's see what @Ottomata gets to and see if we can incorporate it into changeprop

I'm okay with using general.yaml in this way but I would like to put the kafka service list under a general hierarchy of services (like "services": {"kafka": [...], "redis": [...]}). We hardcode the lists of our Redis servers for example in changeprop and I'm sure there will be others.

I wonder if something as simple as round robin DNS implemented with multiple A records with the same subdomain would suffice to substantially improve the situation.

In the following example, I've set up 2 A records for the subdomain test.balthazar-rouberol.com, pointing to 192.168.10.1 and 192.168.10.2 respectively. Each lookup returns all IPs in a random order.


~ ❯ dig test.balthazar-rouberol.com +short @ns12.ovh.net
192.168.10.2
192.168.10.1
~ ❯ dig test.balthazar-rouberol.com +short @ns12.ovh.net
192.168.10.1
192.168.10.2

If we had A records for a name of the form <kafka-cluster>.<site>.wmnet (such as kafka-jumbo.eqiad.wmnet), one per broker IP, each DNS lookup would return all broker IPs in a random order. Coupled with a retry policy on the client side in the case of an unreachable host, this would allow us to avoid committing changes such as https://gerrit.wikimedia.org/r/c/operations/puppet/+/965159 every time we add or remove a broker from the cluster. We'd only need to add or remove a record in the https://gerrit.wikimedia.org/r/admin/repos/operations/dns,general repository, and the change would be reflected in subsequent DNS lookups once the TTL expires.

We'd also be more resilient in the face of a broker reimaging.

This approach would work for Kafka, but really for any non-LVS-ed service. As there are no load balancer or healthchecks involved, we still might get the DNS name resolved to an IP associated with a host that's being rebooted/reimaged/etc., but client retries should get us to an IP that "works".
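
To make the "round robin DNS plus client retries" idea concrete, here is a minimal Python sketch (not how any Kafka client actually does it). It resolves every A record behind a name and tries each address in turn until one accepts a connection. The function names `resolve_all` and `connect_first_reachable`, the 2 second timeout and port 9092 are assumptions for illustration only.

```python
import socket
from typing import Callable, List, Tuple

Address = Tuple[str, int]


def resolve_all(host: str, port: int) -> List[Address]:
    """Return every distinct IPv4 address the resolver gives for host."""
    infos = socket.getaddrinfo(
        host, port, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    addresses: List[Address] = []
    for _family, _type, _proto, _canonname, sockaddr in infos:
        addr = (sockaddr[0], sockaddr[1])
        if addr not in addresses:
            addresses.append(addr)
    return addresses


def connect_first_reachable(
    host: str,
    port: int,
    timeout: float = 2.0,  # assumed value, tune per client
    resolver: Callable[[str, int], List[Address]] = resolve_all,
    connector: Callable = socket.create_connection,
):
    """Try each resolved address in order; return the first connection made."""
    last_error = None
    for addr in resolver(host, port):
        try:
            return connector(addr, timeout)
        except OSError as exc:
            # Host may be rebooting or being reimaged: move on to the next IP.
            last_error = exc
    raise ConnectionError(f"no reachable address for {host}:{port}") from last_error
```

Usage would look like `connect_first_reachable("kafka-jumbo.eqiad.wmnet", 9092)`, assuming that name had one A record per broker. The resolver and connector are injectable only so the retry logic can be exercised without real DNS or brokers.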

Happy to hear your thoughts.

Edit: this is what Kubernetes does with headless services: no virtual IP, and service name DNS resolution yields as many A records as there are pods deployed for the service (the actual pod IPs).

Gehel added a project: Data-Platform-SRE.
Gehel subscribed.

Re-opening after discussion with @brouberol, having better auto discovery is still interesting.

Pasting some relevant discussion points from slack:

@brouberol

I was referring to https://cwiki.apache.org/confluence/display/KAFKA/KIP-302+-+Enable+Kafka+clients+to+use+all+DNS+resolved+IP+addresses, but now that I think about it, this is a client configuration, not a broker one, so we might be able to use it

Seems like the client.dns.lookup option was only added to librdkafka 4 months ago: https://github.com/confluentinc/librdkafka/commit/961946e55fb3f89eb782d4011af4bf5cd3c31f17

@Ottomata

And we recently upgraded node rdkafka on Eventgate and change prop and eventstreams

@brouberol

As this is a tunable that changes how the client resolves the bootstrap broker IP

@Ottomata

That might be the way to go then, and we could roll it out and enable it incrementally

So, let's create the DNS entries and try this with librdkafka clients (eventgate, etc!)
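
As a rough sketch of what an incremental rollout could look like on the client side, the hypothetical helper below builds a librdkafka-style configuration dict that bootstraps from a single DNS name. The helper name `bootstrap_config`, port 9092 and the `use_all_ips` toggle are assumptions. The value `use_all_dns_ips` is the one named in KIP-302; it is assumed here that the deployed librdkafka version accepts it, which should be checked against its configuration reference.

```python
from typing import Dict


def bootstrap_config(
    discovery_name: str,
    port: int = 9092,  # assumed broker port
    use_all_ips: bool = True,
) -> Dict[str, str]:
    """Build a client config that bootstraps from one multi-A-record DNS name."""
    conf = {"metadata.broker.list": f"{discovery_name}:{port}"}
    if use_all_ips:
        # Option mentioned in this task; value name taken from KIP-302 and
        # assumed to be supported by the librdkafka version in use.
        conf["client.dns.lookup"] = "use_all_dns_ips"
    return conf


# Example: one client opts in, while others keep their hardcoded broker list.
eventgate_conf = bootstrap_config("kafka-jumbo.eqiad.wmnet")
```

Keeping the option behind a flag means each service (eventgate first, for instance) can be switched over on its own and reverted without touching the others.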