Mirror topics from main Kafka clusters (from main-eqiad) into jumbo-eqiad
Closed, Resolved | Public | 8 Estimated Story Points

Description

The new Kafka jumbo cluster is up! We need to be careful about moving clients over, but there's no reason we can't start mirroring the topics from the main Kafka clusters now.

We'll first need to puppetize Kafka MirrorMaker using the new kafka profile format.

Event Timeline

Change 384586 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/puppet@production] Set up Kafka MirrorMaker from main -> jumbo in eqiad

https://gerrit.wikimedia.org/r/384586

Change 384602 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/puppet@production] Small refactor for some kafka classes to ease creation of mirror maker profile

https://gerrit.wikimedia.org/r/384602

Change 384602 merged by Elukey:
[operations/puppet@production] confluent::kafka: refactor existing code with a commons class

https://gerrit.wikimedia.org/r/384602

@fgiunchedi

Ok, got a different prometheus jmx exporter config file up for Kafka Mirror Maker. (I'm using this Phab task instead of T175344 now, since we aren't doing a generic thing anymore.)

https://gerrit.wikimedia.org/r/#/c/384586/13/modules/profile/files/kafka/mirror_maker_prometheus_jmx_exporter.yaml

How's https://gist.github.com/ottomata/cd97637e4c2b77ffe8d8d0147c461400 look?

I decided to use topic="__all__" as the label for the overall (all-topics) metrics. This way the metric name and label set are the same for both the per-topic and the overall metrics.

Ah, but I don't know what these ones are or where they are coming from!

kafka_consumer_consumer_fetch_manager_metrics_test1_0{client_id="kafka-mirror-k1_to_k2-0",topic="__all__",} 0.0
kafka_consumer_consumer_fetch_manager_metrics_test1_1{client_id="kafka-mirror-k1_to_k2-1",topic="__all__",} 0.0
kafka_consumer_consumer_fetch_manager_metrics_test2_0{client_id="kafka-mirror-k1_to_k2-0",topic="__all__",} 0.0

GRRRRR. I don't want topics in the metric name!

In T175344 @fgiunchedi wrote:

In this particular case for example aggregations on the metric (e.g. sum()) will return wrong results (each topic plus the sum of all topics, summed)

Hm, perhaps what I've done with topic="__all__" is not good then. I can change this to put all_topics in the metric name instead.
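To illustrate the double counting described in the quote above, here is a minimal Python sketch. It does not query Prometheus; the `sum_by` helper is a hypothetical stand-in for PromQL's `sum by (...)`, and the topic names and sample values are made up for the example.

```python
from collections import defaultdict

# Hypothetical samples for one metric: (labels, value).
# Topic names and values are invented for illustration.
samples = [
    ({"client_id": "kafka-mirror-k1_to_k2-0", "topic": "test1"}, 10.0),
    ({"client_id": "kafka-mirror-k1_to_k2-0", "topic": "test2"}, 5.0),
    # Aggregate series exported under the same metric name.
    ({"client_id": "kafka-mirror-k1_to_k2-0", "topic": "__all__"}, 15.0),
]


def sum_by(series, label):
    """Rough stand-in for PromQL `sum by (label)`: add up series sharing a label value."""
    totals = defaultdict(float)
    for labels, value in series:
        totals[labels[label]] += value
    return dict(totals)


client = "kafka-mirror-k1_to_k2-0"

# Summing everything counts each topic plus the pre-aggregated total.
naive_total = sum_by(samples, "client_id")[client]

# Excluding the aggregate series gives the real total.
per_topic_only = [s for s in samples if s[0]["topic"] != "__all__"]
filtered_total = sum_by(per_topic_only, "client_id")[client]

print(naive_total, filtered_total)
```

With these made-up values the naive sum is 30.0 while the real total is 15.0. Moving the aggregate into its own metric name (for example with all_topics in the name) avoids having to remember the exclusion in every query.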

Change 384586 merged by Ottomata:
[operations/puppet@production] Set up Kafka MirrorMaker from main -> jumbo in eqiad

https://gerrit.wikimedia.org/r/384586

Change 386640 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/puppet@production] Disable kafka MirrorMaker on kafka-jumbo

https://gerrit.wikimedia.org/r/386640

Change 386640 merged by Ottomata:
[operations/puppet@production] Disable kafka MirrorMaker on kafka-jumbo

https://gerrit.wikimedia.org/r/386640

Change 386648 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/puppet@production] Run MirrorMaker on analytics Kafka hosts to mirror main -> jumbo

https://gerrit.wikimedia.org/r/386648

Change 386648 merged by Ottomata:
[operations/puppet@production] Run MirrorMaker on analytics Kafka hosts to mirror main -> jumbo

https://gerrit.wikimedia.org/r/386648

Mentioned in SAL (#wikimedia-operations) [2017-10-26T16:58:34Z] <ottomata> now mirroring main Kafka cluster topics to jumbo Kafka cluster, with MirrorMaker instances running on analytics-eqiad broker nodes. https://phabricator.wikimedia.org/T177216

Mentioned in SAL (#wikimedia-analytics) [2017-10-26T16:58:41Z] <ottomata> now mirroring main Kafka cluster topics to jumbo Kafka cluster, with MirrorMaker instances running on analytics-eqiad broker nodes. https://phabricator.wikimedia.org/T177216

Crap crackers.

From https://kafka.apache.org/documentation/#upgrade_11_0_0

if your brokers are older than 0.10.0, you must upgrade all the brokers in the Kafka cluster before upgrading your clients.

This means that we can't run the 0.11 version of MirrorMaker to consume from main Kafka, because main Kafka still runs 0.9.

All of this profile::kafka::mirror work can't be used until we upgrade the main Kafka clusters! :p

Instead, I've puppetized another MirrorMaker instance on the analytics broker nodes. We'll have to keep some of these nodes online even after we turn off the analytics Kafka cluster, until we finally upgrade main Kafka to >= 0.11.
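The compatibility constraint quoted above can be sketched as a small check. This is only an illustration of that one sentence from the upgrade notes, not a full Kafka compatibility matrix; `parse_version` and `client_supported` are hypothetical helper names.

```python
def parse_version(version):
    """Turn a dotted version string such as '0.10.0' into a comparable tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def client_supported(client_version, broker_version):
    """Encode only the quoted rule: brokers older than 0.10.0 must be
    upgraded before clients, so a newer client against such a broker
    is treated as unsupported."""
    client = parse_version(client_version)
    broker = parse_version(broker_version)
    if client <= broker:
        # Client is not newer than the brokers; the rule does not apply.
        return True
    # Client is newer than the brokers: assumed fine only on 0.10.0+ brokers.
    return broker >= parse_version("0.10.0")


# 0.11 MirrorMaker consuming from main Kafka (0.9): not supported.
print(client_supported("0.11", "0.9"))   # False
# 0.11 MirrorMaker against a 0.11 cluster: supported.
print(client_supported("0.11", "0.11"))  # True
```

Under this rule, the first case is the one that blocks running the 0.11 MirrorMaker against main Kafka.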

Change 387581 had a related patch set uploaded (by Ottomata; owner: Ottomata):
[operations/puppet@production] Set main-eqiad -> jumbo->eqiad mirror maker heap to 512M

https://gerrit.wikimedia.org/r/387581

Change 387581 merged by Ottomata:
[operations/puppet@production] Set main-eqiad -> jumbo->eqiad mirror maker heap to 512M

https://gerrit.wikimedia.org/r/387581